
Searching in PDF?

ebolamonkey3

New Member
Joined
Apr 9, 2010
Messages
773 (0.15/day)
Location
Atlanta/Marietta, GA
System Name Norbert
Processor Intel Core i7 920
Motherboard Gigabyte X58A-UD5
Cooling Corsair H50 with 2x Scythe GT AP-14
Memory 3x 2gb G.Skill 1600Mhz C9 DDR3
Video Card(s) MSI Twin Frozr II GTX 465 GE & EVGA GTS 450 SC
Storage 2x 1TB Samsung Spinpoint F3 7200rpm
Display(s) Dell U3011, Dell 2408WFP, Samsung 2693HM
Case Lian Li V1020R
Audio Device(s) Creative X-Fi Titanium
Power Supply Seasonic X-750
Software Windows 7 Ultimate 64bit
Hey guys, I need to look up a large list of data from a PDF file, basically just to check if each entry of the list is in the pdf. Is there some way to do this without having to check one by one?
 
Joined
Jul 19, 2006
Messages
43,585 (6.74/day)
Processor AMD Ryzen 7 7800X3D
Motherboard ASUS TUF x670e
Cooling EK AIO 360. Phantek T30 fans.
Memory 32GB G.Skill 6000Mhz
Video Card(s) Asus RTX 4090
Storage WD m.2
Display(s) LG C2 Evo OLED 42"
Case Lian Li PC 011 Dynamic Evo
Audio Device(s) Topping E70 DAC, SMSL SP200 Headphone Amp.
Power Supply FSP Hydro Ti PRO 1000W
Mouse Razer Basilisk V3 Pro
Keyboard Tester84
Software Windows 11
Joined
Apr 10, 2006
Messages
373 (0.06/day)
Location
Arizona, USA
Processor Intel Core i5 3450
Motherboard Asus P8H77-M Pro
Cooling Thermalright HR-02 Macho
Memory 2x2GB G-Skill DDR3-2000
Video Card(s) Asus GTX660ti 2GB
Storage 128GB Samsung 830, 1TB Western Digital Caviar Black
Display(s) 23" Dell U2311H
Case Silverstone FT03 (titanium color)
Audio Device(s) Onboard Realtek ALC892
Power Supply SeaSonic X series SS-460FL
Software Windows 8 Professional 64-bit
That.
Or, if you need to check a lot of text with something more advanced, you could select the text with the text selection tool in Adobe Reader, then copy and paste it into another application (like MS Word or Excel) that lets you search the way you want (for example, with custom VBA macro code).
 

streetfighter 2

New Member
Joined
Jul 26, 2010
Messages
1,655 (0.33/day)
Location
Philly
I'd do a slight modification on what gvblake22 said.

First I'd copy the data out of the PDF with the text selection tool. Then I'd create a copy of the data you're looking for (called myData_test.txt) and paste the data from the PDF into it. Then I'd run some basic command line tools like this (where myData.txt is the data you're looking for):
Code:
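# Lines that occur more than once in the combined file are in both your list and the PDF text
sort myData_test.txt | uniq -d > matchingData.txt
# Compare the matches against your sorted list; "-" makes diff read the sorted list from stdin
sort myData.txt | diff matchingData.txt -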

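The output of the second command will only show the data that's missing from the PDF, provided each entry sits on its own line in both files and neither your list nor the PDF text repeats a line.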
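
If the text pasted out of the PDF doesn't put each entry on its own line, a substring check may be easier than the sort/uniq/diff comparison. The following is only a minimal sketch, assuming a POSIX shell (for example under Cygwin) with grep available; the function name missing_entries and the file names in the usage line are hypothetical, not something from the thread.

```shell
#!/bin/sh
# missing_entries LIST_FILE TEXT_FILE   (hypothetical helper)
# Prints each non-empty line of LIST_FILE that does not occur anywhere
# in TEXT_FILE, using a literal, case-sensitive substring match.
missing_entries() {
    list_file=$1
    text_file=$2
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline
    while IFS= read -r entry || [ -n "$entry" ]; do
        # Skip blank lines in the list
        [ -z "$entry" ] && continue
        # -F: treat the entry as a fixed string, not a regular expression
        # -q: print nothing, only the exit status is used
        # --: an entry that starts with '-' is not mistaken for an option
        if ! grep -Fq -- "$entry" "$text_file"; then
            printf '%s\n' "$entry"
        fi
    done < "$list_file"
}
```

It could be used as `missing_entries myData.txt pdfText.txt > missing.txt`, where pdfText.txt is a hypothetical file holding the text pasted from the PDF. Matching is case-sensitive, and files saved with Windows line endings may need their carriage returns stripped first (for example with `tr -d '\r'`).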
 
Joined
Apr 10, 2006
Messages
373 (0.06/day)
Location
Arizona, USA
That's a great idea. I'm assuming you just run that code as a batch file or in the Windows > Run > 'cmd'?
 

streetfighter 2

New Member
Joined
Jul 26, 2010
Messages
1,655 (0.33/day)
Location
Philly
Those are actually GNU command line utilities (common to Linux/Unix). I run them in Windows using Cygwin. They can be run without Cygwin using the GNU Utilities for Windows (though I've never tried it). Theoretically you should be able to use the GNU Utilities for Windows just like native DOS commands (in batch scripts or directly in the command prompt).
 

ebolamonkey3

New Member
Joined
Apr 9, 2010
Messages
773 (0.15/day)
Location
Atlanta/Marietta, GA
Thanks for the response guys! I actually ended up finding a copy of the file in excel so all's good now :D
 