Evernote PDF search highlighting problem makes paperless office unworkable
#1
Posted 24 June 2012 - 09:09 PM
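I'm a big fan of Evernote in general and I'm using the Premium version of Evernote for some time now with a Fujitsu ScanSnap 1500. I had great dreams of using the Evernote 'searchable PDF' option to scan in all my paper and find stuff using text search when needed.
But, despite numerous promises, Evernote have failed to deliver a PDF search that reliably highlights the searched text in the search results.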
Usually I end up with one result (the first, as a rule) with correctly highlighted text, and the rest of the list either have no highlighted results, or can have all kinds of random highlight spots, often on totally blank areas.
Evernote have been aware of this problem for many months now, but it has not been addressed in any of their updates.
I'm very disappointed and frustrated by this 'Evernote Premium' behaviour.
#2
Posted 24 June 2012 - 09:12 PM
#3
Posted 24 June 2012 - 09:23 PM
From Evernote:
Ticket # 16051-80087
Dear Roy,
The PDF highlighting not aligning is currently a known issue that is being investigated. We appreciate you reporting this to us.
From user: May 10th
icoco has just posted a reply to a topic that you have subscribed to titled "BUG: Evernote not highlighting text in PDF search results".
----------------------------------------------------------------------
I have the same request. I'd certainly expect to have text or passages highlighted when performing a search in any document. This should apply to PDF documents, too. Evernote seems to find the PDF document that contains the search term, but doesn't mark the text or passage. Which mostly is quite cumbersome, especially regarding long documents like manuals etc.
Please, Evernote folks, do something about that issue.
Regards.
#4
Posted 24 June 2012 - 09:24 PM
I should have made clear that this is only an issue with documents scanned to PDF in Evernote Premium. Searching on all other documents is fine.
#5
Posted 27 June 2012 - 05:29 PM
#6
Posted 04 September 2012 - 03:24 AM
#7
Posted 06 September 2012 - 06:12 AM
#8
Posted 06 September 2012 - 07:49 PM
#9
Posted 07 September 2012 - 07:13 PM
Thanks for mentioning ScanSnap's OCR. It works for me too.
But I came to this whole 'paperless office' adventure through Brooks Duncan's excellent DocumentSnap website, and, if I remember correctly (it's many months ago now), his advice was to use Evernote's OCR to avoid having to wait for each ScanSnap document scan to be OCRed. Using Evernote's PDF OCR should theoretically allow continuous scanning while Evernote applies its OCR in the background. And it does work, after a fashion. But, as my original post pointed out, the search highlighting is completely unreliable. And despite Evernote being aware of the issue, and having issued many updates since I raised it with them, they can't seem to solve the problem.
With Evernote so excellent in many other areas, I'm puzzled by this long-standing issue. Non-PDF clippings search works perfectly for me. But, since I got my ScanSnap setup specifically for going paperless with Evernote Premium, I'm reluctant to end up going back to the slower (but effective) ScanSnap OCR.
I hope repeated complaints from users like us can eventually persuade Evernote to resolve the issue, or, at least, explain why they can't.
#10
Posted 07 September 2012 - 07:17 PM
http://www.documentsnap.com
#11
Posted 07 September 2012 - 07:30 PM
I'm on the Mac, OS X 10.8 and 10.7 on another machine.
Thanks for the offer, and I hope you can manage it. It would be great, at least until Evernote solve the PDF OCR search problem.
And thanks again for DocumentSnap - it's a great paperless resource.
Hi pipkato, thanks for your kind words re: DocumentSnap! Are you on Mac or Windows? I might be able to come up with a workflow where the ScanSnap OCR runs in the background, and then puts it in Evernote. Then you'd get a best of both worlds sort of deal.
#12
Posted 10 December 2012 - 10:06 AM
#13
Posted 10 December 2012 - 10:23 AM
http://discussion.ev...e/#entry173506
Evernote Manual (Mac) http://evernote.com/...note/guide/mac/
Evernote Manual (Windows) http://evernote.com/.../guide/windows/
Evernote Manual (iOS) http://evernote.com/...note/guide/ios/
My Site http://www.princeton...mayo/index.html
#14
Posted 12 December 2012 - 05:33 PM
Dear Valued Customer,
The highlighting is indeed a bug and there isn't a viable workaround at this time. It is something that we plan on fixing, but I don't have an estimate as to when that may be available. I'm truly sorry for the trouble. I wish I had better news for you. Please let me know if you have any questions.
My original question below:
I have many documents in PDF format. One of the reasons I went Premium was the ability of EN to search these documents. While EN finds the document/note, the text is not highlighted as it would be when searching an image. This is a significant negative for me, as I hope to upload a considerable number of larger PDF files for searching. From searching your forum, this seems to be an issue that has not been resolved? If there isn't a workaround for this, please let me know if you plan to address it in future software updates. Thanks in advance.
#15
Posted 02 January 2013 - 02:11 AM
I'm a big fan of Evernote in general and I'm using the Premium version of Evernote for some time now with a Fujitsu ScanSnap 1500. I had great dreams of using the Evernote 'searchable PDF' option to scan in all my paper and find stuff using text search when needed.
But, despite numerous promises, Evernote have failed to deliver a PDF search that reliably highlights the searched text in the search results.
Usually I end up with one result (the first, as a rule) with correctly highlighted text, and the rest of the list either have no highlighted results, or can have all kinds of random highlight spots, often on totally blank areas.
Evernote have been aware of this problem for many months now, but it has not been addressed in any of their updates.
I'm very disappointed and frustrated by this 'Evernote Premium' behaviour.
I too am quite disappointed, since I've now paid for the Premium membership only to find that the main feature of interest to me, PDF searching, borders on useless. I feel as if I was misled by the generalization of "searchable PDFs".
So I have been able to determine the following facts from my short experience. I hope this helps other users deciding whether to invest $45.00 in a SaaS whose broadly advertised "features" may not meet their needs.
1) There are two types of PDFs: one type is created by scanning paper documents, and the other by "printing" or saving a document as a PDF. When you scan a document WITHOUT using OCR, EN will reject it for recognition if any of the following is TRUE:
- The PDF contains more than 100 pages
- The PDF file is more than 25MB
- The PDF does not contain at least one "scanned" page, defined as:
  - A "scanned" page contains at least 1025 pixels of image data
  - A "scanned" page contains no more than 512 characters of regular, searchable text (e.g. this is enough for a text-based fax header or similar). PDF files that have already been processed by a separate OCR system will not satisfy this condition and will be rejected.
- The PDF contains more than one non-scanned page. (I.e. the doc may have one "cover" page without any image data, but if there's more than one, then it's not a real scan and we reject it.)
- The analysis crashes or fails for some technical reason, typically due to a malformed PDF from some crazy source, or if the PDF is password protected (encrypted).
- This analysis process takes more than 30 seconds to complete.
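For anyone who wants to pre-check their files, the rules above can be sketched roughly as follows. This is only an illustration: the thresholds are the ones listed above, but `Page`, `PdfInfo` and `rejection_reason` are hypothetical names made up for the example, not anything from Evernote's software. It also assumes that any page failing the "scanned" test counts as a non-scanned page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds as listed in the post; the data model below is hypothetical.
MAX_PAGES = 100
MAX_FILE_MB = 25
MIN_IMAGE_PIXELS = 1025
MAX_TEXT_CHARS = 512
MAX_NON_SCANNED_PAGES = 1
MAX_ANALYSIS_SECONDS = 30


@dataclass
class Page:
    image_pixels: int  # amount of image data on the page
    text_chars: int    # characters of already-searchable text


@dataclass
class PdfInfo:
    size_mb: float
    pages: List[Page]
    encrypted: bool = False
    analysis_seconds: float = 0.0


def is_scanned(page: Page) -> bool:
    """A page counts as 'scanned' if it is mostly image and has little text."""
    return (page.image_pixels >= MIN_IMAGE_PIXELS
            and page.text_chars <= MAX_TEXT_CHARS)


def rejection_reason(pdf: PdfInfo) -> Optional[str]:
    """Return why the PDF would be skipped, or None if it passes every rule."""
    if pdf.encrypted:
        return "password protected"
    if len(pdf.pages) > MAX_PAGES:
        return "more than 100 pages"
    if pdf.size_mb > MAX_FILE_MB:
        return "larger than 25MB"
    scanned = sum(1 for p in pdf.pages if is_scanned(p))
    if scanned == 0:
        return "no scanned page"
    # Assumption: every page that is not 'scanned' is a non-scanned page.
    if len(pdf.pages) - scanned > MAX_NON_SCANNED_PAGES:
        return "more than one non-scanned page"
    if pdf.analysis_seconds > MAX_ANALYSIS_SECONDS:
        return "analysis timed out"
    return None


if __name__ == "__main__":
    scan = Page(image_pixels=2_000_000, text_chars=0)
    already_ocred = Page(image_pixels=2_000_000, text_chars=3000)
    print(rejection_reason(PdfInfo(size_mb=3.0, pages=[scan] * 5)))
    print(rejection_reason(PdfInfo(size_mb=3.0, pages=[already_ocred] * 5)))
```

Note how a PDF that has already been OCR'd elsewhere fails the "scanned page" test on every page, which matches the rule that such files are rejected.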
2) The other type of PDF, mentioned above, is created by some other software from a document, for instance a multi-page Word document saved as a PDF. A PDF of this type has a hidden layer built in that contains an index of all the text in the PDF. When importing this type of PDF into EN, EN seems to index and recognize only some of the text. So again, when searching for a string of text from your list of notes, EN will pin down the PDF but, as before, does not go to the exact page or highlight the text. The only caveat with this type of file is that a Windows user can press CTRL-F to invoke a search box located at the bottom of the screen. You have to RE-ENTER the search string, and then the first instance of the string will be found within the document. More usable, but still a LOT less functionality than I had expected.
So in summary, the PDF search feature is far from acceptable in my experience.
FYI, I am using a Windows client, and have identical functionality on the web client, with the exception of the CTRL-F option: for the web client I have to open the PDF and use its search function to re-enter my search string.
#16
Posted 02 January 2013 - 02:24 AM
Of course, this isn't for everyone, and it doesn't address the problems mentioned above, or the problems I have had (mentioned elsewhere), but it is a workaround that I have been using with great success for about three months now. I am paperless, and my Evernote database is hovering somewhere around 900 MB, with about 10,000 notes inside it.
#17
Posted 02 January 2013 - 02:31 AM
It takes a bit longer, but I always let ScanSnap do the OCR for me. (100% of the time)
Why?
1.) Exported PDFs:
ScanSnap: The PDF document remains OCR'd if I export it from Evernote.
Evernote: The PDF document loses its OCR if I export it from Evernote.
2.) Consistency:
ScanSnap: The search results are consistent in Evernote, whether I view them from my desktop client or the Evernote web.
Evernote: The search results are not consistent because Evernote uses different OCR software depending on the platform.
3.) 100% OCR:
ScanSnap: Works on notes that are stored in my local non-sync'd Evernote notebooks.
Evernote: Evernote cannot see my notes on my local non-sync'd notebooks, so the PDF's cannot be OCR'd.
4.) No complex difficult-to-understand rules:
ScanSnap: OCR's all my PDF's - no rules and I know it is done.
Evernote: Evernote has 5 technical rules to follow and no warning if the document fails to meet all the rules
#18
Posted 02 January 2013 - 02:57 AM
For Mac users, this means Spotlight indexing. For iPad users who use my text extraction suggestion (I recommend Automator) it means offline searching on the iPad, search results highlighted on the iPad, searching within the note as well, and the ability to download your entire account and keep it offline. In fact, I would go so far as to say textifying my PDFs has opened up a whole new world of possibilities for the iPad. The first step, though, is doing it yourself, as JB recommends. Two days ago, using the multiple file function in Adobe Acrobat Pro, I had the computer finish several thousand files in the morning, so it isn't even a time consuming task to OCR yourself.
#19
Posted 02 January 2013 - 03:15 AM
For some people, a useful workaround might be to extract the text from the PDF and put that into the Evernote note with the PDF attachment. Alternatively, you could just leave the PDF out entirely. This has several benefits. I have written more about this here (http://discussion.ev...ce/#entry173506).
Of course, this isn't for everyone, and it doesn't address the problems mentioned above, or the problems I have had (mentioned elsewhere), but it is a workaround that I have been using with great success for about three months now. I am paperless, and my Evernote database is hovering somewhere around 900 MB, with about 10,000 notes inside it.
Thanks for the suggestion.
Unfortunately, I am dealing with technical documents, schematics and the like with a mixture of mechanical assembly explosions, pictures, and other resourceful images.
Raw text won't work in my case.
JJ
#20
Posted 02 January 2013 - 03:42 AM
It takes a bit longer, but I always let ScanSnap do the OCR for me. (100% of the time)
Why?
1.) Exported PDFs:
ScanSnap: The PDF document remains OCR'd if I export it from Evernote.
Evernote: The PDF document loses its OCR if I export it from Evernote.
2.) Consistency:
ScanSnap: The search results are consistent in Evernote, whether I view them from my desktop client or the Evernote web.
Evernote: The search results are not consistent because Evernote uses different OCR software depending on the platform.
3.) 100% OCR:
ScanSnap: Works on notes that are stored in my local non-sync'd Evernote notebooks.
Evernote: Evernote cannot see my notes on my local non-sync'd notebooks, so the PDF's cannot be OCR'd.
4.) No complex difficult-to-understand rules:
ScanSnap: OCR's all my PDF's - no rules and I know it is done.
Evernote: Evernote has 5 technical rules to follow and no warning if the document fails to meet all the rules
Ok, understood.
But here are my questions about your above suggestion/explanation:
1) Is the consistent experience you describe above also found on the Windows client? I know there are differences between the Mac and Windows clients. Which are you basing your experience on?
2) The scanner adds another several hundred dollars, when I already have a fully capable, networked, 100-page duplex ADF scanner.
3) I also have PDFs output by various software packages. How is your suggestion (search experience) above different from my current experience if my PDF already has a text index or layer created by the outputting software?
4) I have tried to use the OCR functions of Acrobat 8 when scanning technical documents, but I end up with all sorts of goofy formatting. The end result of Acrobat's OCR is just an unusable mess. Not to mention that my documents are a mixture of English and Italian text. I think that throws off the Acrobat OCR.
I don't mind spending several hundred dollars to get the right tools together, but if the end result is still mediocre search results (notes search only gets me to the first page of a PDF), then I have just laid out some dough for nothing. Not something I like to do.
Thanks,
JJ