(Archived) OCR not working


dea8

Idea

I am relatively new to Evernote. I am having a problem with some of my PDFs not being searchable. I uploaded two PDFs on November 30th and they are still not searchable. I have seen comments on the forum suggesting that it can sometimes take 8 hours or so, but over a week? Anyone else have this problem?

Link to comment

21 replies to this idea

Huh? If I search ONLY on OCR, with no sub forums & no specified author & skip over the first few entries (that are from this thread), I get the "OCR totally broken" thread. So I'm not sure why you think I distorted anything. That's a very general search. Results are sorted by posting date with most current first. So yeah, there may be 38 pages of results...but just like with Google, the first ones are probably the most relevant/current.

The reason I used the term distorted is you only selected the most recent thread.

Sure, sometimes you get lucky and the results coincide with what you are looking for. That is when "the search function is your friend".

I'm really not sure where you're going with this. In this case, one can do a general search (as shown above) & the first thread (excluding the one created by OP, which really doesn't count) answers OP's question. Your point about it being the most recent one is, IMO, a valid point b/c an answer from Dave two years ago on a particular topic may not be appropriate for the same question last week. So most recent thread is a good thing, IMO & that's what I go for.

If your point is this doesn't always work, fine. Yes, that's true. But not in this case. The answer was on page 1. I wasn't throwing up some obscure post from three years ago that showed up on page 56 when searching on OCR.

And that's all I have to say on this. :(

Link to comment

So once a PDF (stipulated not searchable to begin with) is added to Evernote, Evernote does an OCR, and when that happens, the user can right-click and save a searchable version.

Question: Am I right in that the searchable version then has to be added to Evernote?

I say this because the PDFs I added all have the 'Save Searchable version as...' menu option, which suggests they have been processed by Evernote, but searching for text that I know is inside these PDFs fails to identify the corresponding note. As a test, I saved a searchable version of one of the PDFs, created a new note, and added the searchable version. Now searching for text inside that PDF turns up the note with the searchable version, but not the note with the 'original' version of the PDF.

And am I correct in assuming that such a search should only identify the note and not highlight the text on the appropriate page(s) of the contained PDF, as is done with individual graphics images?

Cheers...

--

Alex Lane

"The only easy day was yesterday."

Link to comment
  • Level 5
If you right-click on the PDF within Evernote on Windows, there should be a menu option for "Save Searchable version as..."

If that's enabled, you can use that to save the OCR'ed version of the PDF document.

If that's disabled for the PDF, then it hasn't been processed for some reason.

I don't mean to be a nit-picker, but the "Save Searchable version as..." will not appear if your scanner does the OCR before sending to Evernote.

The "Save Searchable version as..." will only show up on the non-OCR'd PDFs that are submitted to Evernote.

At least, that is what happens to me.

Link to comment

If you right-click on the PDF within Evernote on Windows, there should be a menu option for "Save Searchable version as..."

If that's enabled, you can use that to save the OCR'ed version of the PDF document.

If that's disabled for the PDF, then it hasn't been processed for some reason.

Link to comment

All of the PDFs I've added have been added since converting to Premium near the beginning of the month. A PDF submitted two days ago does not appear to have been OCRed. The latest PDF was a one-pager added two hours ago, and it does not appear to have been OCRed, either.

Is there a way for Windows users to know if a document has been OCRed or not (by which I mean that an OCR attempt ended either successfully or abnormally)?

Cheers...

--

Alex Lane

"The only easy day was yesterday."

Link to comment

If you added a new scanned PDF to your account (100 pages or less), and you are Premium, then that PDF should be processed within our servers in an hour or two. (Really big PDFs can take that long to do, regardless of our queue length.)

It takes a lot longer for us to go through old PDFs when you convert to Premium. They'll be finished eventually, but your best results will come if you test a new scan added after you upgraded to Premium.

If you still can't search for that PDF on the web after 2-3 hours, then there may be some other issue. Try with a simpler, single-page scan.
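
As a rough sketch only, the rule of thumb above could be written down like this. The function name, parameters, and return labels are made up for illustration and are not part of any Evernote API; only the 100-page figure and the 2-3 hour window come from this post.

```python
# Hypothetical triage helper. The thresholds come from the post above;
# the function name, parameters, and return labels are illustrative only.

PAGE_LIMIT = 100      # "100 pages or less"
MAX_WAIT_HOURS = 3    # upper end of "2-3 hours"


def ocr_wait_hint(hours_since_upload, pages, premium, added_after_upgrade):
    """Return a short label suggesting what to do about an unsearchable PDF."""
    if not premium:
        # The post only describes Premium accounts.
        return "not-covered"
    if not added_after_upgrade:
        # PDFs that predate the upgrade are processed more slowly.
        return "backlog"
    if pages > PAGE_LIMIT:
        # No time estimate is given for PDFs over the page limit.
        return "no-estimate"
    if hours_since_upload <= MAX_WAIT_HOURS:
        # Still inside the expected processing window.
        return "wait"
    # Past the expected window: retry with a simpler, single-page scan.
    return "investigate"


if __name__ == "__main__":
    # A one-page scan added two days ago, after upgrading to Premium.
    print(ocr_wait_hint(hours_since_upload=48, pages=1,
                        premium=True, added_after_upgrade=True))
```

An "investigate" result corresponds to the last suggestion above: try again with a simpler, single-page scan.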

Link to comment

Greetings, I am a returning user from long ago, and am trying out the Premium level before deciding on a year-long commitment.

I am curious about the OCR feature. I added a scanned PDF to a notebook a couple of days ago and it does not appear to have been OCRed. I added two more scanned PDFs about six hours ago, and they, too, have not been OCRed. I added a single-page scanned PDF a little while ago, with apparently the same results. (I base these conclusions on the fact that searching for text inside these files does not turn up the files.)

Any ideas as to what's not working?

(Also, as the icon in the notes area is for the URL, that leaves the question of where, if at all, is there an indication that a file has been OCRed in the Windows version of the app?)

Cheers...

--

Alex Lane

"The only easy day was yesterday."

Link to comment
  • Level 5
I don't know the details about that icon, but I'm guessing that maybe it's the "world" in "world wide web".

I think you are correct. The two blobs represent North America and South America.

Link to comment
  • Level 5
Premium should be OCR'd in just a few minutes.

Evernote has added a non-documented secret icon that only a few people know about. I don't have a Mac, but another user commented:

  • "on the Mac client, there's an icon just to the right of the tag area that tells you whether your image has been OCR-ed or not."

On the Windows version, there is a small circular icon in the upper right corner of the Note Panel. I believe the color or hue is supposed to change if the document has been OCR'd. The icon has two blobs inside the circle with a small line connecting the blobs. I guess it is supposed to symbolize something. Your guess is as good as mine. There is no clue offered by Evernote - nothing happens when you mouse over it or click on it. Just another one of those Evernote Easter-Egg surprises we have come to love and expect.

I found the answer to what the round circle is:

It is called a blue globe icon and it opens that note's URL

viewtopic.php?f=56&t=21141#p89549

Link to comment
  • Level 5

Huh? If I search ONLY on OCR, with no sub forums & no specified author & skip over the first few entries (that are from this thread), I get the "OCR totally broken" thread. So I'm not sure why you think I distorted anything. That's a very general search. Results are sorted by posting date with most current first. So yeah, there may be 38 pages of results...but just like with Google, the first ones are probably the most relevant/current.

The reason I used the term distorted is you only selected the most recent thread.

Sure, sometimes you get lucky and the results coincide with what you are looking for. That is when "the search function is your friend".

And using the obviously circular logic you can prove that the first ones are the most current. Most relevant? Sometimes, sometimes not.

But if you are looking for an explanation of why the "Notebook" Search feature no longer works, you get buried with 45 pages of possibilities. I am not saying the answer is not there. I am not saying the answer is not on the first, second, or third page. But I would not say that it is my friend, especially for a new user of the forum.

Link to comment

So if you search all boards, no particular author on "OCR", the first thread after this one is:

viewtopic.php?f=38&t=20773&p=87413&hilit=ocr#p87413

You have to admit that you distorted the results by restricting the results to just a single thread "OCR totally broken?"

Sometimes the first hit might be the answer. Many times it is not and requires a lot more digging.

When I search for just OCR, I come up with 38 full pages of hits with countless numbers of posts on OCR.

That is the search result noise I was referring to.

Huh? If I search ONLY on OCR, with no sub forums & no specified author & skip over the first few entries (that are from this thread), I get the "OCR totally broken" thread. So I'm not sure why you think I distorted anything. That's a very general search. Results are sorted by posting date with most current first. So yeah, there may be 38 pages of results...but just like with Google, the first ones are probably the most relevant/current.

Link to comment
  • Level 5

So if you search all boards, no particular author on "OCR", the first thread after this one is:

viewtopic.php?f=38&t=20773&p=87413&hilit=ocr#p87413

You have to admit that you distorted the results by restricting the results to just a single thread "OCR totally broken?"

Sometimes the first hit might be the answer. Many times it is not and requires a lot more digging.

When I search for just OCR, I come up with 38 full pages of hits with countless numbers of posts on OCR.

That is the search result noise I was referring to.

Link to comment

Sometimes, I feel the search function is my enemy.

I would never have considered using the term "unfortunately" as a search term.

If you follow the forum closely (as you and I do), one might remember some of these terms. But for a new person who is searching for OCR or PDF or Tag or Notebook, the results can be overwhelming.

And adding a 2nd term frequently increases the amount of search results "noise".

So if you search all boards, no particular author on "OCR", the first thread after this one is:

viewtopic.php?f=38&t=20773&p=87413&hilit=ocr#p87413

Link to comment
Evernote has added a non-documented secret icon that only a few people know about. I don't have a Mac, but another user commented:

  • "on the Mac client, there's an icon just to the right of the tag area that tells you whether your image has been OCR-ed or not."

Yep, that would have been me. To clarify, such an icon only appears if a note contains material that the Evernote servers will do OCR on ... for me, since I have a free account, it's just image files. I assume if I were a paid user, the icon would appear on PDFs as well. Before the OCR is completed, if I mouse over the icon I get the following text:

"Not all resources on this note are indexed for searching."

The icon changes after the OCR is done, and the mouse-over changes to, "Resources on this note are indexed."

Anyhow, my suggestion would be to re-upload the errant PDFs and see if you get the indexing icon. Then you can monitor it to see when indexing takes place.

Link to comment
  • Level 5

Premium should be OCR'd in just a few minutes.

Evernote has added a non-documented secret icon that only a few people know about. I don't have a Mac, but another user commented:

  • "on the Mac client, there's an icon just to the right of the tag area that tells you whether your image has been OCR-ed or not."

On the Windows version, there is a small circular icon in the upper right corner of the Note Panel. I believe the color or hue is supposed to change if the document has been OCR'd. The icon has two blobs inside the circle with a small line connecting the blobs. I guess it is supposed to symbolize something. Your guess is as good as mine. There is no clue offered by Evernote - nothing happens when you mouse over it or click on it. Just another one of those Evernote Easter-Egg surprises we have come to love and expect.

Link to comment
  • Level 5

Sometimes, I feel the search function is my enemy.

I would never have considered using the term "unfortunately" as a search term.

If you follow the forum closely (as you and I do), one might remember some of these terms. But for a new person who is searching for OCR or PDF or Tag or Notebook, the results can be overwhelming.

And adding a 2nd term frequently increases the amount of search results "noise".

Link to comment

Archived

This topic is now archived and is closed to further replies.
