rmlint finds duplicate files on your filesystem, identifying identical and similar files that waste disk space. It scans directories recursively and builds a database of file checksums for quick duplicate detection.
rmlint is an open-source command-line tool that searches for duplicate files on Linux and Unix-like systems. It scans the filesystem and builds a database of file checksums, sizes, and other metadata so that it can quickly identify duplicate and similar files that waste disk space.
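The general idea of checksum-based duplicate detection can be illustrated with a short Python sketch. This is not rmlint's actual implementation; it is a minimal example that assumes SHA-256 as the checksum and uses hypothetical helper names (`file_digest`, `find_duplicates`). Files are first grouped by size, since files of different sizes cannot be identical, and only the remaining candidates are hashed.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups (sorted path lists) of identical files under root."""
    # Pass 1: group regular files by size (cheap, needs no file reads).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share their size with at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates("/some/directory")` returns one list of paths per set of identical files. Grouping by size first keeps the expensive hashing step limited to files that could plausibly match.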
One of rmlint's standout features is its ability to find partial and fuzzy duplicates: files that have overlapping content but are not identical byte-for-byte. This allows it to identify wasted space even in documents, log files, backups, and similar files where some content changes over time while the rest stays the same.
Beyond just finding duplicates, rmlint offers a comprehensive set of options for analyzing duplicates and reclaiming wasted space. You can dig into the duplicate sets to see how they differ, preview deletions before committing them, and select exactly which files should be deleted or hard-linked together. It also integrates with compression utilities for gzipping duplicate files.
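To make the "preview first, then hard-link" workflow concrete, here is a small Python sketch. It does not reproduce rmlint's own behavior or output; the function name `replace_with_hardlink`, the `dry_run` parameter, and the temporary file suffix are all hypothetical. It assumes both files live on the same filesystem, since hard links cannot cross filesystem boundaries.

```python
import os


def replace_with_hardlink(original, duplicate, dry_run=True):
    """Replace `duplicate` with a hard link to `original`.

    Returns a description of the action. With dry_run=True nothing is changed,
    which serves as a preview of what would happen.
    """
    action = f"link {duplicate} -> {original}"
    if dry_run:
        return "would " + action

    tmp = duplicate + ".hardlink-tmp"  # hypothetical temporary name
    os.link(original, tmp)             # create the new link first, so no data is lost
    os.replace(tmp, duplicate)         # then atomically swap it over the duplicate
    return action
```

Running the function with the default `dry_run=True` only reports the planned action. Passing `dry_run=False` performs the replacement, after which both paths refer to the same data on disk and the space used by the duplicate is freed.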
In summary, rmlint focuses specifically on the problem of wasted disk space from duplicate files. With flexible duplicate detection and review options, it helps you reclaim gigabytes or even terabytes of unnecessary duplicated data.