  • Level 5*
10 minutes ago, BlackCloud said:

Where is OCR done? On the PC or only when you sync?

The built-in Evernote feature is server-based, so you have to sync.
Of course, you also have the option of OCRing your files outside of Evernote.

>>Is the entire PDF OCRd? Let's say I have a 5 page PDF, all towns in alpha order. If I do a search for Williams, on page 5, will it be found?
Is this an OCR issue? I assume they're not embedded images.
My impression is that this would just be a text search.
I think the answer is yes, but it's a pain to locate the item within long notes.

Link to post
  • Level 5*
7 minutes ago, BlackCloud said:

Where is OCR done? On the PC or only when you sync?

The OCR done by Evernote is done in the Evernote Cloud, so it does first require a sync.

8 minutes ago, BlackCloud said:

I am a Premium user. Is the entire PDF OCRd?

Yes, the entire PDF is OCR'd.

However, the Evernote OCR is NOT the standard OCR.  Its only purpose is to generate a search index, and it does NOT generate any text that you can select or extract from the PDF.

IMO, it is also best to OCR the PDF before attaching to Evernote.

I do all of my OCR using Adobe Acrobat XI Pro on the Mac, and have found it to be fast and very accurate.
Years ago I did some tests comparing Evernote OCR with Adobe OCR, and found Adobe to be more accurate and reliable.
The main reasons I OCR before attaching to Evernote:

  1. Adobe is more accurate than Evernote 
  2. As soon as I attach the PDF, Evernote indexes it almost immediately, and it is available for EN search
  3. The OCR text is available to copy in the PDF (unlike Evernote OCR).

Adobe Acrobat Pro is by far the most expensive choice of available PDF tools.  I bought it years ago, and have been on an upgrade path since.
If you search around, you can sometimes find the prior version highly discounted.
There are probably other PDF tools that do OCR very well that are a lot cheaper.  The best thing to do is search for "pdf tool review" and do your own research to find the tool that best fits your needs and budget.

Link to post
  • Level 5*

@BlackCloud

What is the source of the PDF?  It may be OCR'd when you get it.  And I'm not sure if EN OCRs a PDF that arrives already OCRd.  So syncing may or may not make any difference.  Anyone know what EN does with an already OCRd PDF?

Link to post

Maybe my understanding of OCR is in error. Won't be the first time...........

I understand OCR to be an index of all words contained in a document that can be recognized. When a search is done, this index is referenced, and if the word appears, that document is selected. This index is attached to each document.

A text search to me is a one off search of document(s) for a particular word.

Is my understanding correct?

 

Link to post
  • Level 5*
1 minute ago, BlackCloud said:

if the word appears that document is selected.

Evernote goes further: the text within the PDF is highlighted. I think this applies to handwriting too.

Link to post
  • Level 5*

EN search does both at the same time: it searches for the text in notes and in the PDFs in the notes.

EN also supports search within other attachment types for premium subscribers.

Link to post

Source is Adobe Acrobat Reader. It is not OCRd. I did find out that some of my problems with PDFs not working were because they were not inline; the tech had me change an option, which solved most of the problems.

We are all crossing notes. I will need to go back to light testing as I believe I am only getting OCR on the first page of a PDF, maybe another option that needs to be flipped?

 

Link to post

PDFs are from PDF995, a program that shows in the printer listing and creates pdfs. I use it any place I can print from.

 

One more question (ha!!): When I have a pdf up I only get the first page until I click on it. That page shows things highlighted but no other pages. Doesn't make any difference if I use Evernote PDF or Adobe.

Link to post
  • Level 5*
9 minutes ago, BlackCloud said:

I understand OCR to be an index of all words contained in a document that can be recognized

For the standard definition of OCR, see Optical character recognition 

The only PDFs that need to be OCR'd are those which were created by scanning, which produces an image of the document.  If you received a PDF via email, or downloaded it from a web site, most likely it is text-based and does not require OCR.

So, true OCR is NOT an index, it is conversion of images to text.  However, Evernote has perverted this definition, and produces only an index to be used for searching.  For details of the Evernote process, see How Evernote's Image and PDF Recognition Works 
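To illustrate the distinction being discussed here, this is a minimal sketch of a search index versus a one-off text scan. It is not Evernote's actual implementation; the file names and the sample text (standing in for recognized text) are hypothetical.

```python
from collections import defaultdict
import re

def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Build the index once: map each word to the documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def one_off_scan(documents, word):
    """Text search without an index: re-read every document on every query."""
    word = word.lower()
    return {name for name, text in documents.items() if word in tokenize(text)}

# Hypothetical recognized text, standing in for what OCR would produce.
documents = {
    "Arizona.pdf": "Sedona red rocks and Flagstaff pines",
    "Florida.pdf": "Key West sunsets and a tree house rental",
}

index = build_index(documents)
hits_from_index = index.get("sedona", set())       # a single lookup
hits_from_scan = one_off_scan(documents, "Sedona")  # reads every document
```

Both approaches return the same documents. The index only makes the lookup cheaper, and unlike true OCR output it gives you no text to select or copy.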

14 minutes ago, csihilling said:

Anyone know what EN does with an already OCRd PDF?

Evernote takes the PDF and indexes it using the existing OCR.  The indexing, and therefore searching, is available almost immediately, whereas if you rely on Evernote to do the OCR, it may take a while, even several days.

Link to post
  • Level 5*
7 minutes ago, BlackCloud said:

PDFs are from PDF995, a program that shows in the printer listing and creates pdfs. I use it any place I can print from

Then you are creating a text-based PDF, which does NOT need to be OCR'd, unless you are printing an image.

Link to post

Now I am really confused. Here is my end goal: I have 5000 pdfs created in various ways, primarily with PDF995 in the last 5-10 years, and before that by scanning. This is my collected Travel information. I would like to take the pdfs, put them into EN, and let it rip. If I type "tree house" in the search, all references to "tree house" in any note will be found. Now that I know I am doing a text search, my question from a previous note is still viable. Does the entire PDF (all pages) get searched?  Why is only the first page displayed and highlighted?

I am slowly understanding the ins/outs of EN, uphill though. I feel my history of being in the computer programming business for 25 years may be muddying my learning curve.

 

Thanks for all the education

Link to post
  • Level 5*
3 minutes ago, BlackCloud said:

Does the entire PDF (all pages) get searched?

I think I have already answered that question twice.  The answer is still yes.  If you don't believe me, just do a simple test to convince yourself.

4 minutes ago, BlackCloud said:

Why is only the first page displayed and highlighted?

Apparently that is just how EN Win works.  On EN Mac, the entire PDF can be displayed inline, and all found words are highlighted.  In fact, EN Mac will take you to the first page that has the search term, even if that is page 25.

But you are probably better off double-clicking on the PDF to open it in Acrobat, and do the detailed FIND from there.

See Tips for searching scanned PDFs 

Link to post

I just did a test of my AZ pdf. Looked for Sedona. Didn't find any reference. I then searched for an item on the page displayed, nothing found. I looked at options and nothing stood out that would help the search. These are all local folders on my PC.

Any ideas, or should I email EN?

 

Link to post
  • Level 5*
10 minutes ago, BlackCloud said:

These are all local folders on my PC

Do you mean "local notebooks" ?   If so, they will NOT get indexed by Evernote because they are never sync'd to the EN Cloud.

Please just do a simple test:

  1. Select a multi-page text-based PDF (that does NOT need OCR)
  2. Attach that to an EN Note in a sync'd notebook
  3. Do a manual sync
  4. Wait a few minutes (an hour at the most)
  5. Do another manual sync to update your indexes.
  6. Select the Note and do a FIND for text that is NOT on the first page
  7. Does it find it?
    1. If it does not find it, open the PDF in Acrobat and confirm the text can be found using Acrobat.
  8. Select another note
  9. Do a SEARCH for the same text
  10. Does it find the Note with the PDF?
  11. Does it find the text in the PDF?

When all this is done, if you still have questions, please report the results.
Also, if you are still unclear about the PDF/OCR process, please read thoroughly the references I have previously given you.

Link to post
  • Level 5*

@BlackCloud

I just ran a test in EN Win 5.9.6 on Win7 Pro x64, and EN Win and EN Mac handle PDF viewing quite differently.

Here's what I found in EN Win:

  1. Search for text using the Search box
  2. EN Win filters the note list to those Notes that contain the text, either in the Note or in a PDF in the Note
  3. Click on a Note with the PDF
  4. If the PDF is not shown inline, go to Tools > Options > Note, and check "Always show PDF documents inline"
  5. Back on the Note, press CTRL-F to find, and it will find using the same text as you just searched for.
  6. The PDF will move to the first page with the found text.
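The behavior in the steps above can be modeled roughly as follows. This is a sketch only: the `Note` class and both function names are hypothetical and say nothing about how EN Win is actually built. It filters the note list by the search term, then locates the first PDF page that contains it.

```python
from dataclasses import dataclass, field

@dataclass
class Note:
    title: str
    body: str
    pdf_pages: list = field(default_factory=list)  # text of each PDF page, in order

def search_notes(notes, term):
    """Keep notes whose body or any PDF page contains the term (steps 1-2)."""
    t = term.lower()
    return [n for n in notes
            if t in n.body.lower() or any(t in p.lower() for p in n.pdf_pages)]

def first_page_with(note, term):
    """Return the 1-based number of the first PDF page containing the term,
    or None if no page matches (steps 5-6)."""
    t = term.lower()
    for number, page in enumerate(note.pdf_pages, start=1):
        if t in page.lower():
            return number
    return None

# Hypothetical notes: one with a three-page PDF, one with no attachment.
notes = [
    Note("Arizona trip", "Towns in alpha order",
         ["Ajo, Bisbee", "Flagstaff", "Sedona, Williams"]),
    Note("Packing list", "Boots, hat, sunscreen"),
]

matches = search_notes(notes, "Williams")
page = first_page_with(matches[0], "Williams")
```

Here the term appears only on the last page of the PDF, which is the case the original question was about: the note is still matched, and the find step lands on that page.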

Does this work for you?

Link to post

I haven't tried your second note. I had started your suggested process a bit earlier. I also checked with Acrobat and search did not work, I do not know what that means.

2nd note comments:

I do have the option for inline checked per tech support. With inline checked only the first page shows. When I do a search Florida.pdf(prior to the start of the actual pdf) is highlighted, nothing else.

 

Link to post
  • Level 5*
17 minutes ago, BlackCloud said:

I also checked with Acrobat and search did not work, I do not know what that means.

If Acrobat search fails, then Evernote search will also fail.

You need to resolve the issue with Acrobat search first.  Try choosing different words in the document.  
Are you sure it is text-based?  Using the Acrobat selection cursor (arrow), can you select text?
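The troubleshooting order suggested in this thread can be written down as a small decision function. This is only a summary of the advice given here, not documented Evernote behavior, and the function name and its parameters are made up for illustration.

```python
def next_step(text_selectable, acrobat_finds, in_synced_notebook, evernote_finds):
    """Return the next thing to check, following the order used in this thread."""
    if not text_selectable:
        # Text cannot be selected, so the PDF is probably a scanned image.
        return "PDF looks image-based: OCR it, or sync and wait for Evernote to index it"
    if not acrobat_finds:
        # If Acrobat cannot find the text, Evernote will not find it either.
        return "resolve the Acrobat search problem first"
    if not in_synced_notebook:
        # Per the thread, local notebooks never reach the EN Cloud for indexing.
        return "move the note to a sync'd notebook and sync"
    if not evernote_finds:
        return "sync again and allow time for indexing"
    return "search is working"

# The situation reported above: text can be selected, but Acrobat search fails.
advice = next_step(text_selectable=True, acrobat_finds=False,
                   in_synced_notebook=False, evernote_finds=False)
```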

Link to post
  • Level 5*
4 minutes ago, BlackCloud said:

Yes, I can select text, just not search.

Then something is wrong with your PDF and/or Acrobat install.

Let's try this.  Here's a text-based PDF file on which FIND works perfectly for me in Acrobat:
TEST PDF for Searching.pdf

Download it and see if Acrobat will FIND (CTRL-F) "obstacles" (without the quotes) in it.  Should be on last page.

Link to post
  • Level 5*
1 hour ago, BlackCloud said:

Don't understand why I am having a problem, some glitch. Thanks for all your help. I know I have been a pest.........

You're welcome.  The key thing is that you have now isolated the problem, so the solution should be forthcoming soon.
No worries.  I know how stressful and frustrating issues like this can be.

Link to post

Archived

This topic is now archived and is closed to further replies.
