Topic: Finding Duplicate Files and Keyword Searching Large Number of Files @ AskWoody

- This topic has 3 replies, 3 voices, and was last updated 9 years, 10 months ago.

WSZeno
AskWoody Lounger

July 6, 2015 at 7:22 am #500794

I’ve got 60,000 files from which I need to eliminate duplicates. They are mainly TIFF and PDF files. The problem is that they are sequentially numbered, so I cannot use the file names to find duplicates.

I have Adobe Acrobat Professional, so I can use OCR to search inside a file for content. But opening 60k files manually is not practical.

Ideally, I’m looking for two solutions for automating the processes of (1) identifying duplicate files and (2) searching the file content for keywords. Does anyone have any suggestions?
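
Because the file names are only sequence numbers, duplicates have to be detected from the file contents. As a minimal sketch of that idea (not how any particular duplicate-finder product works), the Python below groups files by size first and then by a SHA-256 hash of their bytes. The function names `file_digest` and `find_duplicates` are illustrative only. It finds byte-identical copies; two separate scans of the same page, or a TIFF and a PDF of the same document, will not match.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose bytes are identical."""
    # Pass 1: group by size. A file with a unique size cannot have a duplicate,
    # so most files never need to be read at all.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    # Pass 2: hash only the files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:
            for p in paths:
                by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_dupes.py <folder>  (the script name is hypothetical)
    for group in find_duplicates(Path(sys.argv[1])):
        print(" == ".join(str(p) for p in group))
```

The script only reports groups. Deciding which copy in each group to keep is left as a manual step, which seems safer for a one-off cleanup of 60,000 files.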

- access-mdb
  AskWoody MVP
  
  July 6, 2015 at 7:28 am #1513836
  
  Easy Duplicate Finder seems to work; it didn’t take long to find duplicates in my 23k or so picture files.
  
  Eliminate spare time: start programming PowerShell
  
- WSZeno
  AskWoody Lounger
  
  July 6, 2015 at 4:11 pm #1513954
  
  Thanks to access-mdb for addressing the duplication issue.
  
  Regarding the second part of the question, I’ve got all 60k file names in an Excel file with hyperlinks to all of the files. Is it possible to script this (VBA, scripting, etc.) so that each file is opened individually in Acrobat, run through Acrobat’s OCR conversion, and then saved? I believe I could then use the Windows Explorer search tool to find words inside the files.
  
- RetiredGeek
  AskWoody_MVP
  
  July 6, 2015 at 4:14 pm #1513955
  
  Zeno,
  
  If you want to search your files using Windows Search, you need the Adobe PDF iFilter. HTH
  
  May the Forces of good computing be with you!
  
  RG
  
  PowerShell & VBA Rule!
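
For the keyword-search half of the question, a scripted search is one alternative to Windows Search once the files contain searchable text. The Python below is a rough sketch under a loud assumption: the OCR step has already been run and its text has been exported to plain `.txt` files, one per document. It does not perform OCR or read PDF or TIFF files itself, and the function name `search_keywords` is illustrative only.

```python
import re
from pathlib import Path


def search_keywords(root: Path, keywords: list[str]) -> dict[str, list[Path]]:
    """Map each keyword to the .txt files under root that contain it."""
    # Case-insensitive, whole-word patterns, compiled once per keyword.
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    hits: dict[str, list[Path]] = {kw: [] for kw in keywords}

    # Assumption: each document's OCR text lives in a .txt file under root.
    for path in sorted(root.rglob("*.txt")):
        # OCR output can contain stray bytes, so undecodable bytes are replaced
        # rather than allowed to abort the whole run.
        text = path.read_text(encoding="utf-8", errors="replace")
        for kw, pattern in patterns.items():
            if pattern.search(text):
                hits[kw].append(path)
    return hits
```

Calling `search_keywords(Path("ocr_text"), ["invoice", "contract"])` (the folder name and keywords are placeholders) returns a dictionary that lists, for each keyword, the text files in which it appears.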