Evernote simply not OCRing my PDFs?


Josh Marshall

Idea

Posted

I've used Evernote for some time and, having just bought the latest ScanSnap scanner, I am moving toward becoming a power user. I started scanning the tablets I take handwritten notes on. The scanning and the import into Evernote went flawlessly, and everything seemed perfect. But as near as I can tell, the PDFs were simply not OCRed. My handwriting isn't perfect, but I test-searched certain words that are clearly legible.

Ruling out other issues ...

I have now waited a couple of days, so presumably it's not just a delay.

The PDFs range from 2 to 3 MB and from 10 to 15 pages, so presumably they aren't being skipped for breaking those limits.

I have occasionally taken photos of notes in the past, and Evernote OCRs them very successfully, even catching words where my handwriting is terrible.

I know there are rules about which PDFs EN will and won't accept, but I'm using the latest ScanSnap, which has Evernote integration built in (not the one actually sold in the EN Market, but the same model). So I would be very surprised if the PDFs were violating one of those rules.

Following the suggestion of one of the EN evangelists, I turned off the ScanSnap's own internal OCR and left the OCR to the EN servers.

So I feel like I've ruled out all the things people mention. I was about to just try using JPGs. But all the discussion I see here makes me think that, even though the process is different, EN OCRs PDFs quite well. So it seems like there must be something I don't know, or something I'm doing wrong.

Can anyone help or point out what I'm not getting?

7 replies to this idea

Posted

Wow, I'm really surprised by that.  But it certainly explains my difficulty.  I feel like I've read numerous threads that talk about OCR of PDFs and particularly handwriting on PDFs.  In fact, this post from the tech blog doesn't seem to note this limitation.  

 

http://blog.evernote.com/tech/2013/07/18/how-evernotes-image-recognition-works/

 

(Maybe it's referenced there.  But if so, I didn't see it.)

 

To be clear, I'm not saying you're wrong. I see the "evangelist" in your member status, so it's pretty clear you're right. I'm just surprised that I haven't seen this noted specifically. Is it that the OCR just isn't powerful enough to process handwriting effectively, or does it not even try?

  • Level 5*
Posted

"Evangelist" does not mean I am correct! I am often incorrect, and I think you went to the right place (the Evernote blog) to find the answers. In fact, the first paragraph tells you what you need to know. For PDFs, as far as I know, it doesn't even try.

 

"When a note is sent to Evernote (via synchronization), any Resources included in the note that match the MIME types for PNG, JPG or GIF are sent to a different set of servers whose sole job is performing Optical Character Recognition (OCR) on the supplied image and report back with whatever it finds."
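To make the rule in that paragraph concrete, here is a minimal Python sketch of a MIME-type check. It is only an illustration of what the blog describes, not Evernote's actual code: the function name `would_be_sent_for_image_ocr` and the file names are made up, and the set of types is simply the three formats the paragraph names.

```python
import mimetypes

# The three image formats named in the blog paragraph (PNG, JPG, GIF).
# Assumption: this set only mirrors that paragraph; it is not taken from Evernote's code.
OCR_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif"}


def would_be_sent_for_image_ocr(filename: str) -> bool:
    """Return True if the file's guessed MIME type is one of the listed image types."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type in OCR_IMAGE_TYPES


if __name__ == "__main__":
    # Hypothetical attachment names, for illustration only.
    for name in ["notes.jpg", "whiteboard.png", "tablet-scan.pdf"]:
        print(name, would_be_sent_for_image_ocr(name))
```

Under this reading, a photo of handwritten notes saved as a JPG matches the list, while a scanned PDF (`application/pdf`) does not, which fits what the original poster observed.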

Posted

Ahhh, okay, I misread that. I read it as listing examples of the types it looks for, not as an exhaustive list. But reading it again, I think you're reading it right. Okay, mystery solved, it seems. Thanks so much for your help. I'm looking forward to becoming part of this discussion community.

Archived

This topic is now archived and is closed to further replies.
