mdg

mac (Archived) OCR Confusion


I'm a longtime Evernote user, but I'm only now starting to experiment with OCR...and it's not going very well! (For clarification, I'm working with JPEGs, as I don't, at present, have a Premium account.)

I understand from some other posts here that OCR can take a while to kick in. With this one image I'm experimenting on, however, some of the text was immediately searchable; some remains unsearchable 30 minutes after posting. I'm not dealing with handwriting, but with a photo that contains very clear text. Am I missing something? Is it possible for part of the text to be immediately searchable, while the rest takes time to become "available"?

I'm gearing up to use Evernote for a complex project that will include scanned texts and handwritten notes, and I need to know it's going to work for me before I invest more time and energy setting it up.

Any thoughts/suggestions??

Thanks.


The OCRing/indexing is done on the EN servers. If you're using a desktop app (Mac/Windows), you first need to sync so the note(s) are uploaded for indexing. Wait. Then sync again, so the index is available on your desktop. The OCRing/indexing of images differs from other OCR in that it produces a tree of possible readings rather than a single result. In other words, a picture of a sign with the word "house" on it may show up when searching for the word "horse".
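
To illustrate the "tree of possibilities" idea, here is a minimal Python sketch. It is not Evernote's actual index format or API; the `Candidate` class, the weights, and the `matches` function are all hypothetical and only show why a search for "horse" can hit an image that says "house".

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str    # one possible reading of an image region
    weight: int  # recognizer confidence, 0-100 (hypothetical scale)


# Hypothetical recognition result for one image: each region of the picture
# keeps several candidate readings instead of a single "best" word.
regions = [
    [Candidate("house", 87), Candidate("horse", 41), Candidate("mouse", 23)],
    [Candidate("for", 92), Candidate("far", 35)],
    [Candidate("sale", 90), Candidate("safe", 30)],
]


def matches(regions, query, min_weight=0):
    """Return True if any candidate reading in any region equals the query."""
    q = query.lower()
    return any(
        c.text.lower() == q and c.weight >= min_weight
        for region in regions
        for c in region
    )
```

With this toy data, `matches(regions, "house")` and `matches(regions, "horse")` both succeed, because the lower-confidence reading is kept as searchable. Raising `min_weight` to 50 would drop the "horse" match, which is the trade-off between recall and false hits.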


I find myself "testing" searches to see how readable my handwriting is for OCR. Is there a table somewhere that shows an index of all searched and OCRed words? I mean, is this list built into Evernote, or is there a way to access the words for a note that is an image only?


I just upgraded to Premium after many years of use. I have lots of PDFs attached to notes. Now that I am Premium, will the EN servers automatically start running OCR against all of the PDFs in my existing notes, or do I somehow have to trigger that process? (I'm hoping I don't have to recreate the note!)

 

Thanks.


mdh, responding in both places just so you know... the Evernote servers should start reindexing automatically. Syncing will bring down the latest indexes for you.


Browncoat, thank you for the really fast reply. I'm sort of watching this one particular file as my test case. It's been probably 3 days since I upgraded to Premium and that file still hasn't been indexed according to the note's information panel. I don't really have that many notes, certainly fewer than 100. Is 3 days a reasonable amount of time to pass without all of those notes (let's assume every one has a PDF) having been indexed? Also, as a second question, is there any way to force a note to be indexed, kind of bringing it to the head of the queue?

 

Thank you.


Have a look at the note using our web client. Is it indexed there?

 

If not, try making a quick modification to the note (add a space or something) to see if that triggers it. Wait a few hours and check whether it has been indexed.

 

If not, feel free to export the note and send it to me via private message (if you feel comfortable). I'll see if it's even something we can index.


I was going to suggest something similar as a trigger. Add even a single character or other change to the title, then Save and Sync. See if that does it.

 

For $90 you can get the lifetime version of VueScan, PC or Mac, which works with, I think, EVERY scanner. It has a built-in feature to scan to 1-bit (black or white, like faxes) at 200-300dpi, then provide a "ride-along" OCR of that scan. Even works on most receipts. I scanned "42 screenfuls" of strip-receipts last week into a 1.4 MB multipage PDF file, and the entire doc is searchable. PDFs are much smaller than JPGs, and you can get this very high-end scanning software that does so much more for the equivalent of 18 months of Premium Evernote (which you will STILL WANT).

 

—D


Hopefully I'm not hijacking this thread, but my questions are similar to mdg's..

Context: fairly new Evernote (free) user. I've attached several scanned PDFs to Evernote Windows over the past few months. NONE of them appear to have been OCR processed.

 

I've found some pertinent info:

1. https://support.evernote.com/ics/support/KBAnswer.asp?questionID=552&hitOffset=31&docID=2769

This says "You can see how Evernote has OCRd any particular PDF in your account by right-clicking the PDF and selecting "Save as Searchable PDF*", then saving that file and opening it. If you don't have the "Save as Searchable PDF" option, you have not received the results from our server yet." My results: none of my scanned PDFs have the "Save as Searchable PDF" choice.

 

2. https://support.evernote.com/ics/support/KBAnswer.asp?questionID=591&hitOffset=162+47&docID=2807

This describes why a PDF scan might fail OCR. As best as I can work through all those double negatives, my scans should not be rejected.

 

So I'm wondering if MAYBE

a. OCR has very recently been limited to only Premium accounts?

b. Free accounts don't process scanned PDFs but only JPEGs?

c. Evernote OCR servers have become overloaded and wait times of weeks to months are common?

d. there's some EN configuration setting I need to change.

e. EN OCR is so unreliable, I should ALWAYS OCR prior to attaching to EN.

f.  Or the most likely case, I've missed something obvious...

 

Any help would be appreciated

 

 


Hi. Welcome to the forums!

As far as I know, OCR service for PDFs has always been a Premium feature (see Evernote Podcast no. 9 from 2009), and it remains one.

http://evernote.com/premium/
