Searching inside already-OCRed pdfs with Free plan


I've been using Evernote for 15 years now, mostly on the Free plan but occasionally on a higher plan as and when needed.  At present I'm on the Free plan.

When I scan my paper documents for Evernote, I always perform an OCR on these before adding to Evernote, so they already have a text layer.  Therefore, I'm not relying on Evernote to do the OCR.

I'm sure it used to be the case that you could search within these pdfs even under the Free plan, i.e. on pdfs that were OCRed before importing into Evernote?  I know that an upgrade was needed for searching inside pure image-based pdfs, but all my pdfs have been OCRed in advance. 

Did this change?  When?  A storage system without any means to search within the storage system is completely useless!

So far as the search is concerned, these documents are text-based, not image-based, and we shouldn't need an upgrade for a text-based search?  I don't care if the search terms are not highlighted in the filtered documents, I just need to be able to find documents that contain the search terms.

Link to comment

It definitely used to be possible.  And it still is, at least up to a certain date.  I can enter search terms now and find notes (pdfs) that match those search terms, going back to 2008.  But I'm not seeing anything imported after around 2018. 

This would be really disappointing if I can't find anything imported since then, without paying a monthly fee!  Maybe time to consider other solutions.

Link to comment

Even that's not particularly helpful, because it doesn't distinguish between pdfs that already contain a text layer, and those that don't.  IMHO, if the document is text-based, which an OCRed pdf effectively is, it should be searchable regardless of plan.

And it refers to "Evernote Premium and Business only" when the only options I see are "Personal and Professional".  Surely the least they can do is keep their help pages current?

Link to comment
  • Level 5

Just tried it myself, on a plain FREE account that was never on a subscription, web client. The results were mixed:

It seems to find something, because it jumps to a note, and even highlights the hits in the text layer of a pdf. 

But it also says "0 Notes found".

To me this looks as if the text layer plays a role, but it will not serve as a basis for search. True search in pdfs, including a list of search hits, requires a subscription. It is up to the company which features are available on which plan level. It is standard that subscribers get some useful features in exchange for financing the whole show. This also means that users below that level will not get access, or only as a teaser.

It is up to the user to decide what to do: subscribe, don't subscribe and go without, or leave the service for another that has a better feature set. If you only want to search file content, the OS-based search on Mac and Windows will usually do. You don't have the comfort of having everything embedded in a note, with all the additional options. Whether this does the job for you is something only you can judge.

Link to comment

@PinkElephant what do you mean by "seems to find something, because it jumps to a note"? 

Do you mean when doing a find on the web page using the browser's "find on page" feature, and with the relevant note (with pdf) already selected (i.e. actually visible on the web page)?  Very limited utility in doing a text search within a document on a web page that you've already located (from 15,000 stored documents... needle in a haystack stuff).

Yep, am fully aware that it's up to the company to decide on the feature set for each plan. The disappointing thing is that they seem to have changed this after I'd already been using the service for 10 years.

Basic search features should IMHO be free.  Yes, by all means limit the Free plan by storage amount (either in total, or per month like they do now), but don't restrict something so fundamental as "search by text" once a user has already committed 10 years' worth of data to your platform.

Link to comment
  • Level 5*
12 minutes ago, drmrbrewer said:

Basic search features should IMHO be free.

Hmmn. I can only refer you to Support for definitive answers. Like others I always OCR my own documents, so I'd have to do some testing to verify whether or not searches on unprocessed notes are possible for free users, and of course it depends on when the document was added, where you are when searching (network issues), and the content (size/image/compatibility for OCR). Certainly it's been my understanding that pre-OCR'd documents are searchable.

Also, do you keep more than one document in a note? I have noticed that a PDF search will take you to a note containing a hit, but if there's more than one PDF it will leave you to find out which one is the actual target. A note containing a single PDF will highlight the hit words or phrases.

Link to comment
7 minutes ago, gazumped said:

Certainly it's been my understanding that pre-OCR'd documents are searchable.

Yes, that's been my understanding from the start... hence why I've always OCRed my own pdfs before importing into Evernote.

7 minutes ago, gazumped said:

Like others I always OCR my own documents,  so I'd have to do some testing to verify whether or not searches on unprocessed notes are possible for free users

Would be really interested to see whether you can indeed verify this.  Or whether it's just me... maybe the indexing of my DB has gone awry, and hence why I'm not finding any pdfs post 2018 or so.

7 minutes ago, gazumped said:

do you keep more than one document in a note?

Not generally, and certainly not for these pdfs.  I'm using the auto import feature, which just creates a single note for each imported pdf.

7 minutes ago, gazumped said:

A note containing a single PDF will highlight the hit words or phrases.

Yes, and it still does for me... but only for notes/pdfs imported pre-2018... nothing at all shows up in the search list post-2018.

Link to comment
  • Level 5

I mean exactly what I say: In a Free account, a pdf with a text layer was added to a note. Waited a little, to let all indexes update.

When entering a search term from the text layer into the search field, EN shows that note and highlights the searched term in the document.

But it says "0 notes found", even though it has obviously found something.

This means a note will probably show, but you don't get a list of search results on a Free account.

You can easily try yourself.

Link to comment
1 hour ago, PinkElephant said:

You can easily try yourself.

I already did, but was confused by your observations, hence asking for clarification.  Because when I search on a phrase that I know for sure is in a recently-imported pdf (e.g. "March 2023"), then I get "0 notes found" / "No notes found"... and because there are no notes found, the list is empty, and no notes are shown.

So I don't understand your observation that "EN shows that note and highlights the searched term in the document" at the same time as saying "no notes found" (empty list).

This is what I see:

[Screenshot of the search results]

Link to comment
  • Level 5

Searching for a phrase is not the same as searching for a keyword. I would try with a keyword first.

And besides these little details: search in PDFs is a subscriber feature, so whatever works on Free could be accidental (= not reliable).

Link to comment
41 minutes ago, PinkElephant said:

Searching for a phrase is not the same as searching for a keyword

What do you mean by searching for a keyword?  What is a "keyword" in the context of Evernote?  I see only tags and notebooks, not keywords.  Where do you search for a keyword?  If by keyword you just mean entering a single word in the same search field (or several, but without quotes), I can't see why it would display any notes when it says "no notes found", regardless of whether you searched on one word or several or several inside quotes.

Link to comment

@PinkElephant still not sure there's a difference between a "keyword" and a "phrase" though, in the context of Evernote?  Isn't a "phrase" just a keyword with one or more spaces in it?  Not sure why you drew that distinction in the first place and why it's relevant here.

Link to comment
  • Level 5

I didn't draw a distinction - actually, you were the one who asked. I just said that if I want to check whether something works, it is better to keep it simple. Searching for one unique word is simpler than searching for a string of words.

Since a search index is built from single words, any combination of words adds one more layer to a search request.

But anyhow, since I am on a subscription, the issue is of more academic importance to me. I learned that the search works on the pdf text level, but doesn't produce a hit list of notes when on a Free plan.
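To illustrate the point about single words versus phrases, here is a minimal sketch of a word-based (inverted) index in Python. It is not how Evernote actually implements search, which is not documented in this thread; the function names (`build_index`, `search_word`, `search_phrase`) and the sample notes are made up for the example. It only shows why a phrase lookup is a keyword lookup plus one extra step (checking that the words sit next to each other).

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to {doc_id: [positions of that word]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def search_word(index, word):
    """Keyword search: one lookup in the index."""
    return set(index.get(word.lower(), {}))


def search_phrase(index, phrase):
    """Phrase search: keyword lookups plus an adjacency check."""
    words = phrase.lower().split()
    if not words:
        return set()
    # Step 1: notes that contain every word somewhere.
    candidates = set.intersection(*(search_word(index, w) for w in words))
    # Step 2 (the extra layer): the words must appear consecutively.
    hits = set()
    for doc_id in candidates:
        for start in index[words[0]][doc_id]:
            if all(start + i in index[w][doc_id] for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits


# Hypothetical sample notes, standing in for the text layer of two pdfs.
docs = {
    "note-1": "Invoice dated March 2023 for scanner repair",
    "note-2": "Meeting in March about the 2023 budget",
}
index = build_index(docs)
```

With these sample notes, `search_word(index, "march")` matches both notes, while `search_phrase(index, "March 2023")` matches only `note-1`, because only there do the two words appear side by side. So when testing whether search works at all, a single unique word removes one possible source of failure.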

Link to comment
