Idea
Kishi 4
I started using Evernote around 10 years ago, drawn by the promise of its robust search functionality, and in particular by its ability to create searchable PDFs out of multilingual scanned files. At the time I was using OCR software, but it dealt poorly with documents containing multiple languages, never mind things like relatively old Japanese printed materials (pre- or post-war; we are not talking ancient history here). Evernote was just perfect in what it could do - to the point that I disabled all the PDF indexing options in my scanning software and deferred it all to Evernote. The results were much, much better, without putting any workload on me whatsoever.
Since the pandemic started, I’ve continued to put new research material into Evernote, but I’ve had far fewer occasions to actually pull that data for the interpretation/translation work I do… so I didn’t notice until very recently that my PDFs were no longer being processed by the OCR. I have some free time in August and I have to fix the issue somehow. I know that a lot of my PDFs, going as far back as 2019, have not been processed.
It is a weird situation: the affected PDFs do pop up in search results, suggesting that Evernote has indexed them, but the search terms are not highlighted within the PDFs themselves, so the OCR data has apparently not been put back into the PDFs. Some older PDFs seem to be processed as expected (they both show up in the results and show the highlights).
I now need to get all my notes containing PDFs (a relatively simple search, but the client seems to have a limit on the number of items it shows in the results) and either force Evernote to re-scan them and OCR them properly, or somehow batch-process them with OCR software on my desktop - I know Evernote can allow an external app to edit attached files, but I’m not sure how to automate the process with the current edition.
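For the desktop route, something like the following might be a starting point. It is only a sketch under assumptions that are not part of Evernote itself: the PDFs have already been exported to a local folder, the open-source OCRmyPDF command-line tool is installed with the Japanese and English Tesseract language packs, and the function names (`build_command`, `plan_batch`, `run_batch`) are made up for illustration. It does not touch Evernote or put the results back into notes.

```python
import subprocess
from pathlib import Path


def build_command(src: Path, dst: Path, languages: str = "jpn+eng") -> list[str]:
    """Build an OCRmyPDF command line for one file (assumes ocrmypdf is on PATH)."""
    return [
        "ocrmypdf",
        "--skip-text",     # leave pages that already have a text layer untouched
        "-l", languages,   # Tesseract language packs, joined with "+"
        str(src),
        str(dst),
    ]


def plan_batch(in_dir: Path, out_dir: Path) -> list[tuple[Path, Path]]:
    """Pair every PDF under in_dir with an output path, skipping outputs that exist."""
    jobs = []
    for src in sorted(in_dir.rglob("*.pdf")):
        dst = out_dir / src.relative_to(in_dir)
        if not dst.exists():
            jobs.append((src, dst))
    return jobs


def run_batch(in_dir: Path, out_dir: Path) -> None:
    """Run OCR over every planned file, reporting failures without stopping."""
    for src, dst in plan_batch(in_dir, out_dir):
        dst.parent.mkdir(parents=True, exist_ok=True)
        result = subprocess.run(build_command(src, dst))
        if result.returncode != 0:
            print(f"failed: {src}")
```

Calling `run_batch(Path("evernote_export"), Path("ocr_output"))` (both folder names are placeholders) would write OCRed copies to a separate folder and leave the originals alone, so an interrupted run can simply be restarted.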
I’m using macOS 11.5 and the newest version of the Evernote client. I’ve been a Premium user for years, which I believe puts me on the Personal plan right now.
My assumption that all the PDFs should be OCRed and made searchable by Evernote is mostly based on this article:
https://help.evernote.com/hc/en-us/articles/208313388-Tips-for-searching-scanned-PDFs
and on my own past experience, when all the PDFs I put in Evernote would come up in searches with the relevant terms highlighted. I’m sure that the great majority of the PDFs I’m having problems with do not exceed the limits described in the article linked above.
Is there anything I can do to fix my library of PDFs? Or should I start looking for a new home for them?