I’ve got 60,000 files from which I need to eliminate duplicates. They are mainly TIFF and PDF files. The problem is that they are sequentially numbered, so I cannot use the file names to find the duplicates.
I have Adobe Acrobat Professional, so I can use OCR to search inside a file for content, but opening 60k files manually is not practical.
Ideally, I’m looking for a way to automate two processes: (1) identifying duplicate files and (2) searching the file content for keywords. Does anyone have any suggestions?
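
To make (1) concrete, here is a minimal Python sketch of the kind of thing I have in mind. It assumes that "duplicate" means byte-for-byte identical content, which may not hold for two separate scans of the same page, and the function names, the extension list, and the folder layout are all my own placeholders. It groups files by size first and then by SHA-256 hash, so only files that share a size are actually read.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, extensions=(".tif", ".tiff", ".pdf")):
    """Return lists of files under root whose contents are byte-identical.

    Assumption: only exact copies count as duplicates; file names are ignored.
    """
    # Pass 1: bucket by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file() and p.suffix.lower() in extensions:
            by_size[p.stat().st_size].append(p)

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for p in paths:
            by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates("path/to/files")` (the path is a placeholder) would return one list per set of identical files, leaving it to me to decide which copy to keep. It does nothing for (2), and it would miss duplicates that look the same but differ at the byte level.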