Existing pdf and jpg documents get OCR and hence become searchable when I upgrade to 'Personal' - is this correct?


I am somewhat peeved that OCR was dropped from Evernote Plus, but I guess you want me to upgrade to Evernote Personal just to retain that functionality.

When I upgrade to personal, do my existing pdfs become searchable via OCR?  I assume 'yes' - but want to check my assumption.

Link to comment
  • Evernote Expert

I'm not a Plus subscriber, but my understanding is that no features were to be removed from any plan. So if Plus had the ability to search inside images and PDFs, it should still be available. Evernote doesn't, technically, provide OCR, though users often use that label. The feature is searching within an image/PDF. The text can't be lifted out of the image and copied elsewhere.

In Personal all images/PDFs are searchable. If, for some reason, one isn't, that is an error which should be fixable.

Link to comment
  • Level 5

Sorry, search in pdf and office documents was never part of the Plus plan. It started with the old Premium plans, for which it is still a feature.

Free/Basic and Plus are able to search in pictures.

Maybe search in pdf was possible when the pdf came with a text layer - as do most PDFs one can download or create from Office documents. Scanned pdf may have a text layer, if the OCR was done using another program. Not sure about it - anybody on these plans can upload such a pdf, and give it a try.

AFAIK all uploads get indexed - they are just excluded from the search results when on an incompatible plan. But this is my impression - to be sure, ask support.

Link to comment

This is not correct. I have only ever had Plus. The feature worked for basic scanned pdfs made with e.g. a Fujitsu ScanSnap. It was removed without fanfare roughly a couple of years back. This is not nice of Evernote, but that is how it is - they can do as they please.

To be clear: Plus is an 'old' tier and no longer available, used to have a nice mix of features at a decent price... even Jill Duffy agreed on that score.

This link shows search in pdf was not/is not in Plus, as PinkElephant says:

However, this info is incomplete: as PinkElephant says, it depends on the 'text layer'. I have many examples, scanned without OCR (AFAIK) on a Fujitsu ScanSnap, which show the following on a Plus account (the only sort I have ever had):

  • search used to work inside all pdfs, including 'old style image-scanned' pdfs
  • search no longer works inside 'old style image-scanned' pdfs ('no text layer'?) - this feature has been removed from Plus (I had a response on this topic from someone but cannot find it, sorry)
  • search still works inside character-based pdfs ('with text layer'?)

An example is two library docs loaded to a note, not scanned by me. The snip below shows a search on 'Holly': it is found, and hence highlighted, (a) in the title and (b) inside the pdf. This continues to work despite the reference/link above, which makes sense with what PinkElephant says if the pdf has a 'text layer' (I guess it does):

image.png.15985d128ee14eb18495854e7f0d852d.png

 

The following no longer works. A search on 'fama 1992 XLVII' finds the 2018 upload as shown (see edit date), but it does not find a new upload/note of the same pdf made in 2022. This feature has been deleted from Evernote Plus (shame on Evernote).

image.png.41ff9a159b28e508e56f9adc5912b09f.png

 


 

Link to comment
  • Level 5

Sorry, if you read through this rather old thread (2015/2016) there is even a table in it showing the same status as of today: 

832640064_SCR-20220330-h55-2Klein.png.b21cc57ca22c2e0e99a5a2c5c617e1f5.png

 

Then there is an extended discussion about what can be searched, what not and about the plans.

The only picture showing that pdf will be searchable with plus is a screenshot from an Android phone with some marketing text. Probably the intern mixed it up when the text was crafted - it runs opposite to all official EN documents I could find. So anybody who signed up to Plus back then on an Android phone can go ahead and sue EN for not delivering (pretty narrow claim ...).

It may be possible to search in a pdf that contains a text layer when it is uploaded. I can't try, don't have a Plus account.

To understand this: the pdf format was originally created to make documents printable on all devices. A pdf has several layers, but the main layer is a graphical layer. To make text extractable, it needs to be placed in an independent text layer. For this, the pdf itself needs to be changed.

EN does it differently: it will not touch the original pdf. Instead, when it does the OCR, the results are saved in an invisible section of the note holding the pdf. So a pdf that holds a text layer can be searchable without any EN OCR result stored in the note. AFAIK, EN does not OCR a pdf that already comes preloaded with a text layer.
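To make the distinction concrete, here is a minimal sketch in Python of the behaviour described in this thread. It is only a toy model built from what posters report here, not Evernote's actual implementation: every class, function and field name is hypothetical, and the plan gating rule is an assumption taken from the posts above.

```python
from dataclasses import dataclass, field

# Assumption from this thread: only these plans may search the note-side OCR results.
PLANS_WITH_PDF_OCR_SEARCH = {"Premium", "Personal"}


@dataclass
class PdfAttachment:
    # Text embedded in the pdf itself; empty for an image-only scan.
    text_layer: str = ""


@dataclass
class Note:
    title: str
    pdf: PdfAttachment = field(default_factory=PdfAttachment)
    # Hypothetical hidden section of the note holding recognition results.
    ocr_index: str = ""


def index_note(note: Note, recognized_text: str) -> None:
    """Store recognition results in the note, leaving the pdf untouched.

    A pdf that already carries a text layer is skipped (AFAIK, per the post above).
    """
    if not note.pdf.text_layer:
        note.ocr_index = recognized_text


def matches(note: Note, query: str, plan: str) -> bool:
    """Return True if the query is found in the title, text layer, or permitted OCR index."""
    q = query.lower()
    if q in note.title.lower():
        return True
    # A text layer travels with the pdf, so it is searched regardless of plan
    # (this matches the 'Holly' example reported on a Plus account).
    if q in note.pdf.text_layer.lower():
        return True
    # The OCR index is always stored, but only consulted on an eligible plan.
    return plan in PLANS_WITH_PDF_OCR_SEARCH and q in note.ocr_index.lower()
```

In this model an image-only scan is indexed on every plan, but its recognized text is only consulted by the search when the plan allows it, which would explain why upgrading could make existing pdfs searchable without re-uploading them.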

Link to comment

Err... I guess you did not read my post.

  • I referenced the very same post in the very same thread myself.
  • I gave an example, complete with snip/screenshot, of a third party pdf (Fama & French 1992) with no 'text layer' that used to be searchable (and indeed still is) in my Evernote Plus.  It is no longer searchable if uploaded last week, for instance.  But the old upload under Plus still is searchable.  And that is true of uploads done well after the thread we both referenced.
  • You say "It may be possible to search in a pdf that contains a text layer when it is uploaded. I can't try, don't have a Plus account." - but I provided an example, done today (Holly 2011).  Yes, it is possible in that case.

I have many hundreds of examples the same: 3rd party pdf and my own scans, no text layer, searchable under Evernote Plus.

So you want to blame the intern?   Hilarious.

No, the unicorn-intern was right: image-only pdf was searchable under plus until about two years ago.  Then it was taken away.  No kidding.

 

Link to comment
  • Level 5

Maybe they had a misconfigured server for a while. Who knows? What the different account types can do is controlled purely by software, not by natural law. Things can go wrong; it happens all the time.

Officially, in the feature table, the Plus account is not eligible for search in pdf documents. From the thread you posted, it was already the same in 2016, and I doubt they ever changed it. For as long as I have been watching, EN has always honored its self-imposed obligation to keep the old subscriptions alive with an unchanged feature set.

So if it worked for a while, be happy that it did. You may have benefited from a configuration error. It is over when it's over.

To get the OCR based search in PDFs officially, you need to subscribe, at least to a Personal subscription. It is that simple, and all the talk about how it works and what it does in the different cases is not relevant for this simple piece of information.

Link to comment
