Mail clients, with easy search inside attached files

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
18,927 (2.86/day)
Location
Piteå
System Name Black MC in Tokyo
Processor Ryzen 5 5600
Motherboard Asrock B450M-HDV
Cooling Be Quiet! Pure Rock 2
Memory 2 x 16GB Kingston Fury 3400mhz
Video Card(s) XFX 6950XT Speedster MERC 319
Storage Kingston A400 240GB | WD Black SN750 2TB |WD Blue 1TB x 2 | Toshiba P300 2TB | Seagate Expansion 8TB
Display(s) Samsung U32J590U 4K + BenQ GL2450HT 1080p
Case Fractal Design Define R4
Audio Device(s) Line6 UX1 + some headphones, Nektar SE61 keyboard
Power Supply Corsair RM850x v3
Mouse Logitech G602
Keyboard Cherry MX Board 1.0 TKL Brown
VR HMD Acer Mixed Reality Headset
Software Windows 10 Pro
Benchmark Scores Rimworld 4K ready!
So, for various reasons I am mainly using our company's iCloud mail address for my mailing, which honestly doesn't feel right. My predecessor did it, and recently our main address (the one I use) got completely wiped (syncing error in our printer, somehow), and it has just been easier to keep using it. But after the summer holidays I intend to start using the "proper" address for all my mail. I have been looking at some clients, but it seems Windows Mail and Mozilla Thunderbird do NOT do what I really need them to do, and what iCloud does really well: search within attached PDF files.

Any tips or ideas?
 
Joined
Jul 25, 2006
Messages
12,136 (1.87/day)
Location
Nebraska, USA
System Name Brightworks Systems BWS-6 E-IV
Processor Intel Core i5-6600 @ 3.9GHz
Motherboard Gigabyte GA-Z170-HD3 Rev 1.0
Cooling Quality case, 2 x Fractal Design 140mm fans, stock CPU HSF
Memory 32GB (4 x 8GB) DDR4 3000 Corsair Vengeance
Video Card(s) EVGA GEForce GTX 1050Ti 4Gb GDDR5
Storage Samsung 850 Pro 256GB SSD, Samsung 860 Evo 500GB SSD
Display(s) Samsung S24E650BW LED x 2
Case Fractal Design Define R4
Power Supply EVGA Supernova 550W G2 Gold
Mouse Logitech M190
Keyboard Microsoft Wireless Comfort 5050
Software W10 Pro 64-bit
You want an email client program that lets you search for words or phrases in attached files, specifically, .pdf files - without opening the attached file?

Hmm, I am not aware of any email program that will do that. I know some email programs will let you search for emails that have attachments, but not within the attachment itself.

For one, the attachment would have to be readable (as opposed to an executable). And then the email program would have to know how to read the readable file format used by the file. For example, it would have to know how to read .pdf, .docx, .csv, .rtf or .txt files. That would be a challenge to program in while keeping bloat down. It might even present some security issues.
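
To make that concrete, here is a minimal sketch of what "searching inside attachments" involves, using only Python's standard `email` package. It only handles attachments that are already plain text (`text/plain`, `text/csv`); the names `attachments_matching` and `build_demo_message` are made up for this illustration, and a real client would need a separate parser for each further format such as .pdf or .docx.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Formats this sketch can read as plain text. PDF and .docx would need a
# real parser, which is the extra work described above.
TEXT_LIKE = {"text/plain", "text/csv"}


def attachments_matching(raw_message: bytes, phrase: str) -> list[str]:
    """Return filenames of text-like attachments that contain `phrase`."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    hits = []
    for part in msg.iter_attachments():
        if part.get_content_type() not in TEXT_LIKE:
            continue  # not readable here: executable, PDF, image, ...
        text = part.get_content()  # decoded str for text/* parts
        if phrase.lower() in text.lower():
            hits.append(part.get_filename())
    return hits


def build_demo_message() -> bytes:
    """Build a hypothetical message with one readable and one binary attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Quarterly numbers"
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg.set_content("See attached.")
    msg.add_attachment("widget,42\ngadget,7\n", subtype="csv", filename="stock.csv")
    msg.add_attachment(b"\x00\x01", maintype="application",
                       subtype="octet-stream", filename="tool.exe")
    return msg.as_bytes()


print(attachments_matching(build_demo_message(), "gadget"))
```

The binary attachment is skipped rather than searched, which is the "readable vs. executable" distinction in practice.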

I will be interested if someone knows of a program for this. Good luck.
 

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
18,927 (2.86/day)
Location
Piteå
You want an email client program that lets you search for words or phrases in attached files, specifically, .pdf files - without opening the attached file?

Hmm, I am not aware of any email program that will do that. I know some email programs will let you search for emails that have attachments, but not within the attachment itself.

For one, the attachment would have to be readable (as opposed to an executable). And then the email program would have to know how to read the readable file format used by the file. For example, it would have to know how to read .pdf, .docx, .csv, .rtf or .txt files. That would be a challenge to program in while keeping bloat down. It might even present some security issues.

I will be interested if someone knows of a program for this. Good luck.

Exactly this.
 
Joined
Jul 30, 2019
Messages
2,350 (1.36/day)
System Name Not a thread ripper but pretty good.
Processor Ryzen 9 5950x
Motherboard ASRock X570 Taichi (revision 1.06, BIOS/UEFI version P5.50)
Cooling EK-Quantum Velocity, EK-Quantum Reflection PC-O11, EK-CoolStream PE 360, Alphacool NexXxoS ST25 360
Memory Micron DDR4-3200 ECC Unbuffered Memory (4 sticks, 128GB, 18ASF4G72AZ-3G2F1)
Video Card(s) XFX Radeon RX 5700 & EK-Quantum Vector Radeon RX 5700 +XT & Backplate
Storage Samsung 2TB 980 PRO 2TB Gen4x4 NVMe, Samsung 2TB 970 EVO Plus Gen3x4 NVMe x 2
Display(s) 2 x 4K LG 27UL600-W (and HUANUO Dual Monitor Mount)
Case Lian Li PC-O11 Dynamic Black (original model)
Power Supply Corsair RM750x
Mouse Logitech M575
Keyboard Corsair Strafe RGB MK.2
Software Windows 10 Professional (64bit)
Benchmark Scores Typical for non-overclocked CPU.
So, for various reasons I am mainly using our company's iCloud mail address for my mailing, which honestly doesn't feel right. My predecessor did it, and recently our main address (the one I use) got completely wiped (syncing error in our printer, somehow), and it has just been easier to keep using it. But after the summer holidays I intend to start using the "proper" address for all my mail. I have been looking at some clients, but it seems Windows Mail and Mozilla Thunderbird do NOT do what I really need them to do, and what iCloud does really well: search within attached PDF files.

Any tips or ideas?

You can try a local or online document storage solution. Something like that should be able to search PDF files. Of course that means you need to save the emails and/or attachments to those systems in order for them to be searchable.
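
For a rough idea of what such a system does under the hood, here is a toy inverted index. It assumes the text has already been extracted from each saved file; the file names and contents below are invented for the example, and real document management software is far more elaborate.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the documents that contain every word of `query` (AND search)."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical extracted text, keyed by where the attachment was saved.
docs = {
    "invoices/2023-04.pdf": "Invoice for printer toner and paper",
    "contracts/lease.pdf": "Lease agreement for the office printer",
}
index = build_index(docs)
print(search(index, "printer lease"))
```

The index is built once when files are saved, so each later search is a few set lookups instead of a re-read of every file.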
 
Joined
Jul 25, 2006
Messages
12,136 (1.87/day)
Location
Nebraska, USA
I agree that will work, but how is that different from just saving the attachment to some location, then searching through the file in the normal way? The OP is asking for a method of searching through attachments while they are still in his email client's inbox. I don't know of any way to do that.
 
Joined
Jul 30, 2019
Messages
2,350 (1.36/day)
I agree that will work, but how is that different from just saving the attachment to some location, then searching through the file in the normal way?
Some software has dedicated engines for indexing and searching documents, and you would probably use their interface instead. It doesn't fit the OP's case exactly and could be overkill, but if you have a business requirement for searching documents it might be an option to consider.
The OP is asking for a method of searching through attachments while they are still in his email client's inbox. I don't know of any way to do that.
I did a quick test, and it seems you can do it if you are using Gmail. Otherwise I don't have a recommendation.
 
Joined
Jul 25, 2006
Messages
12,136 (1.87/day)
Location
Nebraska, USA
I did a quick test, and it seems you can do it if you are using Gmail. Otherwise I don't have a recommendation.
Nice catch! I just sent myself an email with a .pdf file attached. Then, from my Gmail inbox, I searched all emails for a phrase I knew was in that attachment, and I can confirm A Computer Guy is correct: Gmail found the correct email.
 
Joined
Jun 1, 2011
Messages
3,851 (0.82/day)
Location
in a van down by the river
Processor faster at instructions than yours
Motherboard more nurturing than yours
Cooling frostier than yours
Memory superior scheduling & haphazardly entry than yours
Video Card(s) better rasterization than yours
Storage more ample than yours
Display(s) increased pixels than yours
Case fancier than yours
Audio Device(s) further audible than yours
Power Supply additional amps x volts than yours
Mouse without as much gnawing as yours
Keyboard less clicky than yours
VR HMD not as odd looking as yours
Software extra mushier than yours
Benchmark Scores up yours
I did a quick test, and it seems you can do it if you are using Gmail. Otherwise I don't have a recommendation.
We have a Gmail business account, and I'm constantly using the search function to look up names in spreadsheets or docs that I may have sent or that were sent to me. I must say their search function is a major time saver.
 

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
18,927 (2.86/day)
Location
Piteå
We have a Gmail business account, and I'm constantly using the search function to look up names in spreadsheets or docs that I may have sent or that were sent to me. I must say their search function is a major time saver.

This is exactly it. I can find the documents easily, but then I have to find the mail convo that spawned them.
 
Joined
Aug 14, 2013
Messages
2,373 (0.61/day)
System Name boomer--->zoomer not your typical millenial build
Processor i5-760 @ 3.8ghz + turbo ~goes wayyyyyyyyy fast cuz turboooooz~
Motherboard P55-GD80 ~best motherboard ever designed~
Cooling NH-D15 ~double stack thot twerk all day~
Memory 16GB Crucial Ballistix LP ~memory gone AWOL~
Video Card(s) MSI GTX 970 ~*~GOLDEN EDITION~*~ RAWRRRRRR
Storage 500GB Samsung 850 Evo (OS X, *nix), 128GB Samsung 840 Pro (W10 Pro), 1TB SpinPoint F3 ~best in class
Display(s) ASUS VW246H ~best 24" you've seen *FULL HD* *1O80PP* *SLAPS*~
Case FT02-W ~the W stands for white but it's brushed aluminum except for the disgusting ODD bays; *cries*
Audio Device(s) A LOT
Power Supply 850W EVGA SuperNova G2 ~hot fire like champagne~
Mouse CM Spawn ~cmcz R c00l seth mcfarlane darawss~
Keyboard CM QF Rapid - Browns ~fastrrr kees for fstr teens~
Software integrated into the chassis
Benchmark Scores 9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999
One of the stupidly obvious features that everyone should IMO copy from Apple's operating systems :( I’ve tried a bunch of other apps but have tied myself to Apple for reasons like this

Spark takes advantage of that inline, AI search but, last I checked, they basically read your messages with AI (like Apple does) but without e2e, storing that data (presumably rather than checking it against a database), and all to the end of organizing your mail into folders without letting you know with their dumb AI (literally creates folders and rules on your ), even on paid tiers

eM Client exists but I’ve never tried it because I honestly don’t trust paid clients anymore

I know I found some command-line clients on GitHub years ago but they’re cli email clients so :(

And yes, Gmail as well. I don’t like checking my email in a browser, but there are a bunch of Electron wrappers if you don’t mind that
 
Joined
Nov 1, 2022
Messages
49 (0.09/day)
If I understand correctly, you need an email client that can perform OCR (optical character recognition) for all incoming emails with PDF files. I do not think there's an email client with such built-in capabilities. As the other members pointed out, it's best to use a separate app for storing/routing/organizing/searching the content of PDFs. At the office, we use filing system software to automate that and minimize the chance of a mistake. But, of course, we deal with heaps of files per day and when it's busy, we scan up to 30 PDFs in a single hour, so without a DMS we'll cave in very soon. Plus, some of the PDFs we then send to teams in other countries, and when they return the edited files via email, the system automatically finds the right folder within our network so we always know which PDF is the latest/final version (there's a lot more to it, but if it doesn't sound like overkill to you, check out this guide). Still, we have to download each PDF so that the software can find a spot for it - that way, nothing that ends up in the spam/junk pile ends up competing with the genuine stuff...
 
Joined
Jul 25, 2006
Messages
12,136 (1.87/day)
Location
Nebraska, USA
If I understand correctly, you need an email client that can perform OCR (optical character recognition) for all incoming emails with PDF files.
Huh? Nowhere did the OP mention anything about OCR. And TBH, I am not sure you understand what OCR is. It seems you misunderstood the OP's request, or you are hawking that software you linked to. :(

OCR software works with a document scanner - a piece of hardware that uses the same technologies as fax machines and copying machines to scan in printed documents and then send them electronically (fax) or make a copy of them.

The OCR is software that works with a piece of paper - NOT an emailed document - but an actual piece of paper with text already printed on it that has been scanned in with the scanner. The user puts that piece of paper in a flatbed scanner, and scans it in (like you would when faxing or photocopying something).

It is important to note that printed documents are essentially just like pictures or images. The OCR software attempts to "recognize" letters in those images and converts them into words in an editable document.

I note you said so yourself that you "scan" up to 30 PDFs per hour. And your link clearly talks about scanning "printed" documents (and includes an icon for a flatbed scanner).

The OP is not asking for that. He is asking for an email program that is able to "search" attached documents that are already in text form, in this case, .pdf documents.
 
Joined
Jun 21, 2021
Messages
2,654 (2.57/day)
System Name daily driver Mac mini M2 Pro
Processor Apple Silicon M2 Pro (6 p-cores, 4 e-cores)
Motherboard Apple proprietary
Cooling Apple proprietary
Memory Apple proprietary 16GB LPDDR5 unified memory
Video Card(s) Apple Silicon M2 Pro (16-core GPU)
Storage Apple proprietary 512GB SSD + various external HDDs
Display(s) LG 27UL850W (4K@60Hz IPS)
Case Apple proprietary
Audio Device(s) Apple proprietary
Power Supply Apple proprietary
Mouse Apple Magic Trackpad 2
Keyboard Keychron K1 tenkeyless (Gateron Reds)
Software macOS Ventura 13.6 (including latest patches)
Benchmark Scores (My Windows daily driver is a Beelink Mini S12. I'm not interested in benchmarking.)
Apple Mail on macOS does indeed do this. I checked by searching on a term that is frequently mentioned in a PDF newsletter I receive but never in the e-mail message body itself.

I then tried Mozilla Thunderbird which is connected to the same IMAP account. Zero search results.

So I can confirm one MUA that does, one that does not.

Macs do not need OCR to process PDFs. The macOS/OS X operating system itself has handled the PDF document format natively from the very beginning, probably something inherited from its NeXTSTEP ancestry.
 
Joined
Nov 1, 2022
Messages
49 (0.09/day)
Huh? Nowhere did the OP mention anything about OCR. And TBH, I am not sure you understand what OCR is.
The OCR is software that works with a piece of paper - NOT an emailed document - but an actual piece of paper with text already printed on it that has been scanned in with the scanner.
Apologies if my recommendation doesn't quite hit the mark, but I also don't think it's that far off. That is definitely not a reason for the tone in your response; it's unwarranted. I only shared what I know, and besides, your criticism is not exactly on point. I haven't tried all software that implements OCR to search keywords within PDF files, but I'm pretty sure that such apps can work with all of the usual formats (.doc, .jpg, Excel, and even PowerPoint) and convert them so that the user can perform a search. So that indeed goes for emailed documents as well: as long as you have the file on your PC, you can convert it into an editable PDF and then proceed.
I note you said so yourself that you "scan" up to 30 PDFs per hour. And your link clearly talks about scanning "printed" documents (and includes an icon for a flatbed scanner).
Obviously, I myself do not scan that many files per hour, I was referring to the entire team...
The OP is not asking for that. He is asking for an email program that is able to "search" attached documents that are already in text form, in this case, .pdf documents.
I explained that at the beginning of my post: I don't think there's an email client that can "comb" through incoming attachments. Again, it all sounds like OCR to me, which is why I shared my office experience; I really, really didn't mean to upset anyone... Finally, there are tons of apps that can do that, so no one has to stick to my post when deciding...
 
Joined
Jul 25, 2006
Messages
12,136 (1.87/day)
Location
Nebraska, USA
Definitely not a reason for the tone in your response. It's unwarranted.
I am sorry you got your feelings hurt, but you cannot "hear" my tone or "see" my facial expressions or body language. So any "tone" you are interjecting is being put there by you based on your own "misperceptions" and obvious biases.

I was simply stating things in a matter-of-fact, frank manner, which IMO is appropriate for technical discussions. You appear to have gotten your feelings hurt by misconstruing what was said and reading into the text words NOT said. :( Again, apologies that your feelings were hurt, but that is what is unwarranted. So I would appreciate you not interjecting things not actually said, and just going by what is actually said. Thank you.

The OP stated what he wanted. To ensure I didn't make "unwarranted" assumptions, I rephrased his request to clarify and to confirm we were on the same page. He confirmed we were by replying, "Exactly this".

But you went on your "pitch" for OCR software - and you clearly are still trying to pitch it.

The OP does not need "OCR" software. PDF files already consist of characters that can be recognized, even if the file has been locked by the author and cannot be edited.

Frick stated he simply wants an email client that will allow him to search within email attachments. A Computer Guy correctly noted that Gmail allows this. Again, no OCR software required.
 
Joined
Aug 14, 2013
Messages
2,373 (0.61/day)
I’m glad leeantheone is so generous with their patience but yeesh

Anyway, I just want to add that not all PDFs are searchable. Adobe and other PDF readers, many AIs, like the ones used in Apple’s operating systems I mentioned above (and Gmail, as others have) or the ones used with digital images on traffic lights to read license plates, and many other technologies use OCR to extract text from all sorts of digital content, whether it’s a PDF or a digital image (in fact, even scanners and fax machines rely on a digital image to function; what else is the OCR reading from?). OCR has been an essential feature of how PDF software functions for a long time.

/OT
 
Joined
Jul 25, 2006
Messages
12,136 (1.87/day)
Location
Nebraska, USA
We are still not on the same page and I'll accept a big chunk of the blame here.

(in fact, even scanners and fax machines rely on a digital image to function; what else is the OCR reading from?).
Yes, scanners and fax machines rely on digital "images". That is 100% true.

But the purpose for OCR software is for the computer (not scanner, not fax machine) to view that scanned-in "image", look for and "recognize" images of "text characters" (letters and numbers) and convert those images (pictures of letters and words) into a text document that can then be edited by a word processor or similar program.

The OCR software is not reading from the scanner or fax. It is reading the scanned image in the computer's memory.
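
As a toy illustration of that "recognize and convert" step (nothing like a real OCR engine, which has to cope with noise, fonts, skew and so on), here is exact template matching on tiny 3x5 bitmaps. The glyph shapes, the `TEMPLATES` table and the `recognize` function are all invented for this sketch.

```python
# Each glyph is a 3-wide, 5-tall bitmap; '#' is ink, '.' is background.
TEMPLATES = {
    "H": ("#.#", "#.#", "###", "#.#", "#.#"),
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
}


def recognize(image_rows: list[str], glyph_width: int = 3) -> str:
    """Slice the image into glyph-sized cells and match each to a template."""
    text = ""
    step = glyph_width + 1  # one blank column separates glyphs
    for left in range(0, len(image_rows[0]), step):
        cell = tuple(row[left:left + glyph_width] for row in image_rows)
        # Unknown shapes become '?' instead of raising an error.
        match = next((ch for ch, tpl in TEMPLATES.items() if tpl == cell), "?")
        text += match
    return text


# The "scanned image" sitting in memory: pixels only, no characters yet.
scanned = [
    "#.#.###.#..",
    "#.#..#..#..",
    "###..#..#..",
    "#.#..#..#..",
    "#.#.###.###",
]
print(recognize(scanned))
```

The input is nothing but a grid of pixels; only after matching does the program hold actual characters that can be searched or edited.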

OCR has been an essential feature to how pdf software functions for a long time.
Not really. OCR is old. PDF is relatively new. OCR and PDF software are totally different and unrelated. You can use OCR software to look for text images in a .pdf file, but the point is, it must be scanned in first as an image. Then the characters must be recognized, then converted into an editable text file.

Images of license plates are first recorded with a camera. That's an important fact.

OCR software was created at least 20 years before .pdf.

We are in the same chapter of the same book, but not the same page.

Anyway, I just want to add that not all PDFs are searchable.
That depends on how the .pdf file was created. "IF" the .pdf file was originally created by scanning in the document with a scanner/fax machine, or via a camera, then you need OCR software to convert the text "images" (bitmaps) into "real" text.

If the .pdf file was created using Adobe Acrobat, Google Docs, or Microsoft Word, then the file can be searched without using an OCR program.
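
One hedged way to tell the two cases apart in code: pull the embedded text out of each page and see whether there is any. The sketch below assumes the third-party `pypdf` package for the extraction; `needs_ocr` and its 20-character threshold are arbitrary choices made up for this example, not an established rule.

```python
def needs_ocr(page_texts: list[str], min_chars: int = 20) -> bool:
    """Guess that a PDF is image-only if its pages yield almost no text.

    `min_chars` is an arbitrary threshold chosen for this sketch.
    """
    total = sum(len(text.strip()) for text in page_texts)
    return total < min_chars


def extract_page_texts(path: str) -> list[str]:
    """Read the embedded text of each page using the third-party pypdf package."""
    from pypdf import PdfReader  # assumption: installed via `pip install pypdf`

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]


# Usage with a real file would look like:
#   needs_ocr(extract_page_texts("newsletter.pdf"))
print(needs_ocr(["", "  "]))                       # scanned pages: no text layer
print(needs_ocr(["Exported from a word processor, so the text is embedded."]))
```

A file exported from a word processor should come back with plenty of text, while a pure scan should come back empty and would need OCR before it can be searched.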
 
Joined
Aug 14, 2013
Messages
2,373 (0.61/day)
I mean, this is all off-topic, but seems like OP has checked out, so… I dunno, not like this matters, either — really not sure what the point of this exercise is :oops:

The OCR software is not reading from the scanner or fax. It is reading the scanned image in the computer's memory.
This was true some time ago, but nowadays there are all sorts of portable document scanners and “pens”/“markers” that both scan and convert to text in one device. Even if we were to refer to things as far back as the 70’s, those devices were often computers, too.
Not really. OCR is old. PDF is relatively new. OCR and PDF software are totally different and unrelated. You can use OCR software to look for text images in a .pdf file, but the point is, it must be scanned in first as an image. Then the characters must be recognized, then converted into an editable text file.
IDK what the point is here, and I wonder if companies like Adobe would disagree with you on their being unrelated. I used to scan a lot of books for high school and college debate teams, and it wasn’t until the late 2000’s (Acrobat 9?) that Adobe integrated OCR technology, which was a lifesaver for me and anyone who scanned documents.
Images of license plates are first recorded with a camera. That's an important fact.
I said as much? I think this is an important fact because AI can pull text from images using OCR. Why are you emphasizing this point?
OCR software was created at least 20 years before .pdf.
IDK about that, at least if you mean software as in software that runs on a PC, which isn’t a distinction you were making before.


Earlier you had said that OCR is not used to parse text from emails containing documents/images. My point was merely to say that this is untrue, AFAIK. In my understanding, apps like Gmail and Apple’s “Live Text” feature use OCR in exactly this fashion. Adobe was more blunt about it, although admittedly not using AI: it called the feature that converts PDFs whose text was not created by a text editor into searchable text “OCR text recognition”.
 