Evernote OCR and monospaced font


Hi all,

 

I started adding code snippets to some of my notes as image files (using the Windows Snipping Tool, I select the region, then copy and paste it into the note).
Now I'm using Evernote's search to look for specific words that appear nowhere except in those code snippets, hoping that the OCR can find them and take me to the respective notes.

 

It's not finding the search terms. 

 

One example of such image files with code is the following:

 

Capture.jpg

 

(note: it does not look as blurred in evernote as here)

 

I'm aware it takes some time for the OCR to be done for non-premium users (which is my case), as discussed in this topic. However, I've waited 6 days, retried the search, and again no results were found.

 

I'm wondering if the problem is the use of a monospaced font, since I've uploaded some images with other font types and the OCR works on those.

 

Thanks for the help,

Fabio

 

 

Link to comment

There are a number of variables at play:

 

  • To narrow down the troubleshooting, take a look at your note information and see whether all image attachments are in fact indexed.
    • If not, try holding down "Ctrl" and clicking on the "Help" menu in the uppermost toolbar, then click "Fix Current Note". That may work. After 6 days your note should certainly have been indexed. You can also check the web client to verify.
  • Even if the text in the image is at a readable resolution, a slightly higher image resolution sometimes yields more accurate OCR results. I see this when I zoom in on a web browser before clipping with the Windows Snipping Tool, and I've tested it quite extensively. I'm not sure whether the Snipping Tool itself can capture at a higher resolution, but depending on the application your code is in, you may be able to zoom in before snipping.
  • In my experience the font type you have in your image should not present any problems in itself.
  • Could you give an example of an exact search phrase you are using?
  • Make sure you're using the search bar and not Ctrl + F to search OCR text.

 

I just downloaded and indexed the image you posted above. Although it does recognize words, it has difficulty finding certain entire strings... 

  • i.e. Evernote search doesn't recognize underscores and totally ignores periods and other symbols. In fact, if I type the following into my search bar, I get 24,083 matches (all of the notes in my account):   (.(=%(.((.%%((=))..)=)=%%%)).)&*$##$%^%&).))) ... so it's not only an OCR issue
  • If I type in twix_obj.image.dataDims;:
    • the underscore is not recognized and so will not retrieve your note as a match. If I exclude the underscore I find that particular string.
    • I can replace the periods with almost any and as many symbols as I wish, and the note will still show (even wrapped in quotes):
      •   "obj%image*&^%dataDims"  gives me a match on your image in a note
  • Although text in OCRed images can be found through the search bar, you cannot use Ctrl + F to search for it within a specific note. Your search phrase has to come up as a match for the search syntax you entered. If the note has a unique title, you can easily isolate it with the "intitle:" operator and then add your keywords in the search bar, which then effectively searches only within the matched note.
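
A rough way to picture the behaviour described in the bullets above is a search that splits both the query and the note text into word tokens and discards everything else. The sketch below is only an illustration under stated assumptions, not Evernote's actual implementation: `tokenize`, `phrase_matches` and `OCR_TEXT` are hypothetical names, and it assumes that an underscore counts as part of a word while the OCR read the underscore in the image as a space.

```python
import re

# Hypothetical OCR result for the snippet in the screenshot.
# Assumption: the OCR engine saw the underscore as a space.
OCR_TEXT = "twix obj.image.dataDims;"


def tokenize(text):
    """Lowercase the text and return its word tokens.

    \\w matches letters, digits and '_', so every other symbol
    (periods, %, *, quotes, ...) simply acts as a separator.
    """
    return re.findall(r"\w+", text.lower())


def phrase_matches(query, note_text):
    """Return True if the query tokens occur as a consecutive run in the note."""
    q = tokenize(query)
    n = tokenize(note_text)
    if not q:
        # A query made only of symbols has no tokens and filters nothing out.
        return True
    return any(n[i:i + len(q)] == q for i in range(len(n) - len(q) + 1))


print(phrase_matches('"obj%image*&^%dataDims"', OCR_TEXT))
print(phrase_matches("twix_obj.image.dataDims;", OCR_TEXT))
print(phrase_matches("(.(=%(.((.%%((=))..)=)=%%%)).)&*$##$%^%&).)))", OCR_TEXT))
```

Under these assumptions the first query matches because the symbols between the words are thrown away, the second fails because "twix_obj" stays a single token that never appears in the OCR text, and the third matches every note because nothing is left to search for.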

I wouldn't recommend taking screenshots of your code snippets. There are better ways to capture and search them, especially if you'd like to copy them at some point down the road. I would recommend simply selecting your code snippet and importing that into Evernote. There are many ways to achieve this. Here are 2:

  • Select your code then use the "Win + A" shortcut which automatically pastes your selection into a new Evernote note on your local machine
  • If you're in your browser you can also use the above method, or alternatively select your code and hit the Evernote web clipper icon, which will copy your selection to the EN servers. I like this option because you can select your notebook and/or tags, along with some other options.

Good luck!

Link to comment

(@Frank.dg)

 

Thanks a lot for the comprehensive answer.

 

  • In one of my notes with a similar code snippet, the images were not indexed. The "Ctrl" > "Help" > "Fix Current Note" step changed the status to indexed. After a few seconds the OCR started to work.

Take the following image as an example:

Capture.jpg

 

If I search for "ASL", it finds this note and highlights each of the 4 strings (except the "298:" for the first line, etc.). If I search for "BSInvTime", the search gives no results.

 

I do copy and paste code snippets into Evernote as text, as you suggest, when I'm in my code editor, since the formatting remains just like in the editor. However, sometimes I want to copy snippets from the command line, and when I copy them as text they lose their formatting, the font changes, etc.

 

In any case, the OCR seems to be doing a good job in many cases so I guess my problem was that many of my images were not indexed. For the images that are not indexed after the "Ctrl" > "Help" > "Fix Current Note", do I just have to wait and they will eventually be indexed or is there any command/action I can do to "force" them to be indexed?

 

Thanks for your time.

Link to comment

Hey @progninja,

 

Unfortunately, no amount of waiting/syncing is going to re-index your attachments after a note has been tinkered with. By the way, I do not recommend hitting "Fix All Notes" from the Help menu. That froze my system and caused me untold hassles.

 

One way to make sure everything is indexed would be to do a fresh install and sync from the EN servers after deleting your EN data files. That's one sure way to get everything re-indexed. The thing is that attachments not indexed will still be searchable on web and mobile, which directly search the EN servers. There seems to be a problem specifically with the Windows desktop client... and consequently, I imagine, that's why the Windows client has those "hidden" diagnostic and repair tools that Mac does not. 

 

I discovered that the problem arises when one edits or makes any change whatsoever to an already indexed note. It bumps the attachments out of their indexed status on your local machine. That's why search results may show discrepancies for OCRed attachments on Windows desktop and other platforms... i.e. Windows desktop will only show you partial search results, depending on how many of your notes with PDFs and images have been tagged, moved to a new notebook, annotated or edited in any way. 

 

Please take a look at this thread for more details:

https://discussion.evernote.com/topic/79777-photos-of-handwritten-docs-not-searchable/

 

I also included a video in my second post in that thread.

Link to comment
As you have found there are a number of issues with relying on OCR of complex images.

 

Why not just copy the code from your code editor and paste it into Evernote (normal paste as HTML text)?

This should greatly improve your EN Searches.

 

I just did this for some AppleScript code and EN Mac 6.0.3, and it retained the font, formatting, and syntax coloring that I had in the AppleScript Editor.  I don't know if this will work for all code editors, but you might give it a try.  Also, EN Win may work differently.

Link to comment

Archived

This topic is now archived and is closed to further replies.
