I found the recent article in Issue 17.19.0 – 2020-05-18 interesting in that it discusses several duplicate-file-finder tools, and as a quick Google search reveals, there are many. The simplest use duplicate file names and file sizes for detection, as CCleaner does. The issue there might be how file size is determined, but thankfully that's easy enough, so the comparison should always be valid. Where file finders may differ is with duplicates that have different names but identical content. One might think that verifying identical content would be straightforward, but it's not so easy. How does one actually verify identical content without a byte-by-byte comparison? These duplicate finders use approximations to detect identical files rather than the tedious byte comparison, and that's where the programs may differ. Each calculates something (a CRC, a checksum, or a cryptographic hash), and detection speed depends on the speed of that calculation. Unfortunately, few of the tools tell the user what method they use.
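To make the idea concrete, here is a minimal sketch of the two-stage approach these tools commonly take: group files by size first (a cheap filter), then hash only the size collisions. This is an illustration of the general technique, not the actual method any of the named products uses; the function name and SHA-256 choice are my own assumptions.

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(paths):
    """Group candidate files by size, then by SHA-256 digest.

    Size is a cheap first filter; the hash then stands in for a full
    byte-by-byte comparison (a SHA-256 collision between real files
    is astronomically unlikely).
    """
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # read in chunks so large files don't exhaust memory
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            by_hash[h.hexdigest()].append(path)
        duplicates.extend(g for g in by_hash.values() if len(g) > 1)
    return duplicates
```

A tool claiming true byte-by-byte verification could add one more pass that compares the hash-matched files directly; most skip it because the hash stage is already decisive in practice.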
I regularly use Clonemaster by SoftByteLabs and DupScoutPro from dupscout. Clonemaster claims byte-by-byte comparison, but I doubt that; it probably uses a large hash calculation. I have no idea what DupScout uses. Clonemaster seems quite accurate but is somewhat slow in operation. So far it has never failed to find exact duplicates that I've checked. DupScout claims multi-threaded operation and can scan network shares. It is fast and can be configured to operate in multiple ways. It offers multiple deletion options, though none of them uses the Recycle Bin.
I can only wish the tools would document how they detect identical files.