• What files can I blame for hogging all my disk space?! (utility: btdu)

    From August Abolins@618:400/23.10 to All on Mon May 25 07:06:00 2026

    >==================================================================<
    ** Original area : "/grc/techtalk"
    ** Original message from : 0x6c976bb3@lookup.openpgp.key.invalid (Andrew Skretvedt)
    ** Original message to :
    ** Original date/time : 25 May 26, 02:43
    >==================================================================<

    Last month, there was a nice thread started by Peabody asking a question
    about disk metadata.

    Message-ID: <10ru7ci$k8m$1@GRC>
    https://grc.com/groups/techtalk:300610

    Tonight, I just ran across a pretty awesome looking space-usage analysis
    tool for folks running one of the more advanced filesystems with
    features like Copy-on-Write reflinks, snapshots, embedded data
    compression, and the like.

    These advanced features tend to make the actual storage required much
    less than the apparent size of a file set.

    This tends to make accounting for space usage on such a filesystem a
    little harder and more abstract.

    The tool is 'btdu', for Linux. It sports a TUI for terminal awesomeness.
    It also has a plain CLI mode.

    https://github.com/CyberShadow/btdu

    Its most distinctive property is that it is a /sampling/ disk usage
    analyzer. Rather than starting at the root directory or some other
    directory and recursively enumerating all the objects therein to
    generate space usage stats, it treats the filesystem rather like a dart
    board and starts throwing darts into the on-disk address space. As they
    explain in their readme, for each dart location, they ask the
    filesystem what file(s) point to the data there; then they collect
    metadata about those files.

    Like sampling a population, sampling a filesystem in this way builds an
    estimate of the usage state of the disk, and its "margin of error"
    shrinks rapidly as the number of samples builds. They say that as few
    as 100 samples can resolve the space usage of a filesystem to ~1%. So
    you can get an idea of what's hogging all your space /very/ quickly.

    (If you think about it, if you have space-hogging files, one of those
    initial 100 darts is likely to land on a block connected with one of
    them, since their "target" size is bigger; their data is more likely to
    be the earliest accounted for in a sampling run.)
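
    To make the dart-board idea concrete, here is a toy sketch in Python.
    It is not btdu's code and does not touch a real filesystem: the "disk"
    is just a made-up table of file names and sizes in blocks, and the
    function name estimate_usage and the sample data are my own
    assumptions for illustration. It throws random darts at block numbers,
    looks up which file owns each block, and reports the share of hits per
    file as the usage estimate.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def estimate_usage(files, samples, seed=0):
    """Estimate each file's share of the disk by random block sampling.

    files   -- dict mapping a (hypothetical) file name to its size in blocks
    samples -- number of darts to throw
    Returns a dict mapping each name to its estimated fraction of the disk.
    """
    names = list(files)
    # Lay the files end to end; ends[i] is the first block past file i.
    ends = list(accumulate(files[n] for n in names))
    total = ends[-1]
    rng = random.Random(seed)
    hits = dict.fromkeys(names, 0)
    for _ in range(samples):
        block = rng.randrange(total)              # throw one dart
        owner = names[bisect_right(ends, block)]  # file owning that block
        hits[owner] += 1
    return {n: hits[n] / samples for n in names}


if __name__ == "__main__":
    # Made-up example data: sizes in blocks, 1000 blocks in total.
    disk = {"vm-image.qcow2": 600, "photos.tar": 250, "logs": 100, "dotfiles": 50}
    for n in (100, 10_000):
        print(n, estimate_usage(disk, n))
```

    Big files soak up most of the early darts, so the largest entries tend
    to stand out after a small number of samples, and more samples only
    tighten the figures.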

    Typically, the sampling proceeds until you stop it. So the estimate
    starts inaccurate, improves to reasonable in a short time, and after
    this gradually converges toward an /exact/ report once /every/ block of
    the filesystem has been sampled.

    If you run Linux and use BTRFS filesystems especially, then this is
    worth a look.

    (I haven't studied up enough yet to know whether it can also work with
    ZFS, XFS, and other filesystems with similar advanced features, or
    whether you could use it on BSD or macOS.)

    My Mint 21.3 and 22.2 systems don't have this tool in their default
    package manager; the tool seems relatively new (still showing version
    0.x.y). So, I intend to set up the required build dependencies (the
    tool is written in D) and build a copy from source.

    I'll report back with some experiences once I have a binary to try out!


    --
    OpenPGP 0xC6901B2A6C976BB3 (https://keys.openpgp.org)



    --- OpenXP 5.0.64
    * Origin: (618:400/23.10)