Is Premium required for Scannable OCR support?


Go to solution Solved by klang,

  • 2 weeks later...

Thanks for your reply.

 

This Evernote blog post:

 

https://blog.evernote.com/tech/2013/07/18/how-evernotes-image-recognition-works/

 

says that OCR is performed on uploaded images, and most PDFs.

 

While PDFs from Scannable do not OCR, I've discovered that if I send a jpg from Scannable to my Camera Roll, and then send that image to Evernote, I do get OCR. PDFs which I bring in from sources other than Scannable are OCRing as well.

 

Can you help me understand why PDFs coming from Scannable do not have parity with this functionality? Evernote already possesses this capability, so is it just concern over the capacity of the infrastructure to handle the (presumably) higher load of documents needing to be analyzed?

 

Peter

Link to post

Good questions, all.  It is our intention that PDFs uploaded from Scannable follow the same rules as PDFs uploaded from other sources in Evernote.  If that's not the case, it could be one of two things:

 

1. There is a restriction on the file size of PDFs that are queued for OCR; I believe it's 25 MB.

2. Something isn't working correctly

 

If the case is #2, I would recommend that you open a support ticket and refer to this thread and we'll see what we can do.

 

P.J.
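
A quick way to rule out case #1 before opening a ticket is to check the size of the exported PDF. This is only a minimal sketch: the 25 MB figure is the one P.J. recalls above and is not confirmed, reading it as 25 * 1024 * 1024 bytes is an assumption, and the function name is made up for illustration.

```python
import os

# Assumption: the "25MB" limit mentioned in the post above,
# interpreted here as 25 * 1024 * 1024 bytes.
ASSUMED_OCR_LIMIT_BYTES = 25 * 1024 * 1024


def exceeds_assumed_ocr_limit(pdf_path, limit_bytes=ASSUMED_OCR_LIMIT_BYTES):
    """Return True if the file at pdf_path is larger than the assumed limit."""
    return os.path.getsize(pdf_path) > limit_bytes
```

If this returns False for a Scannable PDF that still is not searchable, the size restriction is probably not the cause and case #2 becomes the more likely explanation.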

Link to post

I just submitted a support ticket precisely for this reason: PDF documents I scan with Scannable should be indexed and OCR'd within Evernote after they have been imported (for Premium accounts, like mine), but this is not occurring.

Searching for content in any of these documents that come from Scannable in Evernote returns zero results, even after allowing Evernote 2 whole days to perform its background indexing processes.

Check out an example note I uploaded.

Link to post

I continue to see the results I mentioned above; should I submit a ticket too, klang?

 

Also, the "Best Answer" above says that OCR is not supported for PDFs. It seems this tag should be removed for now, both because this is not resolved and because I believe the answer is inconsistent with expected behavior.

 

Peter

Link to post

Some clarifications:

 

Currently Evernote iOS does not support search-within-PDF if the PDF is an image-based PDF, like the ones Scannable produces. I'll forward this chat to that team. 

 

However, a search across all your documents will find a PDF from Scannable if it contains a word that OCR has detected. For example, in this case, if you search for Volkswagen, the note will currently be found even if the word appears *only* in the PDF image.

 

Note that this OCR can take some time, because it's done on the server. It's typical to get back results in minutes for PDFs of a few pages. 

Link to post
  • 3 weeks later...
  • 4 weeks later...
  • 6 months later...

Hi, my first post : ) I scan every bill and invoice, which is great. The specific invoice number is searchable - perfect! But only in Evernote on my laptop...? The same invoice/invoice number is not found in my Evernote account on my mobile devices...? (!) Why is that? They are stored in the same place in Evernote.

Link to post
  • 6 months later...

It just took me a few hours to find out that Scannable differentiates between single-page and multi-page scans, thereby creating two types of content, one of which gets OCR'd and one of which does not.

I have hundreds of Scannable PDF documents in my Evernote...

Here's the billion-dollar question: if I upgrade to Premium NOW, does EN

1. go back through my existing documents/content, find the PDFs and OCR them, so that they become searchable from this point forward? Or,

2. is OCR searchability limited to documents I upload to EN from the point of signing up for Premium forward?

Link to post
  • 3 years later...

Resurrecting this old request, as it is March of 2020 and PDFs that come in through the Scannable app are still only sort of OCR'd (OCR on an image is different from OCR on a PDF, in that you can highlight and select the text in the latter). And this is a different use case for Scannable.

I love the Scannable app, but I am trying to figure out how to turn the photos I take (which become a PDF saved to my Evernote) into actual text that can be read by a screen reader on my iPhone. It's the poor man's audiobook for those books that just don't come in a read-aloud format (or even as a Kindle book, which you can have Alexa read out loud to you). For most photos and PDFs, Evernote will OCR the text and make it searchable, but because Scannable will not convert the image-based PDF to a text-selectable PDF, I cannot get the saved PDF to be recognized as actual text ("no speakable content found"), even if I save it to my iCloud Drive / Files. The same happens if I save it to my Camera Roll. And Evernote does not convert it either.

The workaround is to take the generated PDF from Scannable and process it through Adobe Acrobat or an online service to make it into a text-selectable PDF. Oh, and I am Premium, so that makes no difference.

The strangest part of this is that Evernote will OCR all text in photos, and it becomes searchable, but you cannot select the text in a photo, so screen-readers cannot access the words.

Link to post
1 hour ago, owe-me said:

PDFs that come in through the Scannable app are still only sort of OCR'd

All PDFs are processed the same way by Evernote.

>>but you cannot select the text in a photo, so screen-readers cannot access the words.

Evernote's OCR feature is used for the purpose of search indexing.
The photo/PDF is not modified with retrievable text; the recognized text is stored in a separate text file.
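
To illustrate why search can find a word while the text still cannot be selected or read aloud, here is a rough sketch of a "sidecar" index. Every name in it (Attachment, Note, attach_ocr_result, search) is hypothetical and it is not Evernote's actual storage format; it only models the idea that OCR output lives next to the file and the file's bytes are never rewritten.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    filename: str
    data: bytes  # original photo/PDF bytes; OCR never rewrites these
    recognized_words: set = field(default_factory=set)  # sidecar OCR output


@dataclass
class Note:
    title: str
    attachments: list = field(default_factory=list)


def attach_ocr_result(attachment, words):
    """Store OCR output beside the attachment, leaving attachment.data untouched."""
    attachment.recognized_words = {w.lower() for w in words}


def search(notes, term):
    """Return notes whose title or any attachment's OCR words match the term."""
    term = term.lower()
    return [
        n for n in notes
        if term in n.title.lower()
        or any(term in a.recognized_words for a in n.attachments)
    ]
```

In this model a search for a word hits the sidecar set, so the note is found, but anything that opens the attachment itself (a PDF viewer, a screen reader) still sees only the original image bytes.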

Link to post
In my experience, placing a PDF that is already OCRed into EN works better than having it OCRed by EN. The external OCR must do a very good job, because EN will not OCR these documents again.

When a PDF is OCRed outside of EN, the text information is embedded and effectively overlaid on the picture information. This makes search more comfortable. Search will also work when the document is exported from EN for whatever reason.

For me, the internal OCR of EN is a sort of backup for having 100% searchability of my notes. It kicks in when I use the built-in scanning option of EN on my iPhone, or file a normal picture into a note. Because it makes sure that every piece of information is retrievable, the OCR function has a very high value for me, even though most of my documents come pre-OCRed.

Link to post
