
(Archived) Words in photo are searchable while others are not?



I took a photo of a student's class schedule and brought it into a Note. The OCR kicked in almost instantly, and when I search for "market", "caot", "gis", or "adm", it finds them. However, for some reason it will not find the words "history" or "multimd". Just curious if anyone knows why this might be? The words all have about the same level of sharpness and are all obviously part of the same photo, taken with the same camera phone under the same lighting, etc. Is there some other search technique that needs to be employed with common words?

 

Thanks in advance.


  • Level 5*

JPG and PNG pictures are OCR'd with serious Evernote-fu (Everfu?) to try to work out which words appear in the picture. Getting the text out of a picture is a pretty good magic trick, and Evernote are amongst the best at it. But it's neither an exact nor a very old science, so each recognised word has a little 'tree' of alternative readings attached to it. "Horse" might be recognised but also tagged with 'house', 'mouse' or 'moose', just in case.

 

The words you're missing may not have been recognised correctly and may simply have the wrong tags attached. There's no way to see what they are or correct them. While there's nothing you can do about it directly, things will improve with time as the OCR algorithm is updated - but that's not going to happen overnight.

 

Saving documents in PDF format does give you the option to make them searchable, which means there's a character-for-character translation on which to base an index. You may have better luck with your searches if you use an app that saves snaps as PDF files - or in certain cases you may just have to retype the content.

  • Level 5*

If you want to see what words -- or more properly, word fragments -- were recognized, export the note to .ENEX format and open the resulting file in a text editor. Way down at the bottom, you should find the <recoIndex> element, which contains the OCR info. For each position where there's suspected text, there's an <item> element, and each of those holds a series of <t> elements containing the candidate string values. See the docs: http://dev.evernote.com/documentation/cloud/chapters/image_recognition.php
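If reading the raw XML by eye gets tedious, the lookup can be scripted. Below is a minimal Python sketch, not an official tool. It assumes the <recoIndex> document is stored as text inside a <recognition> element in the export, and that each <t> carries a numeric confidence weight in a `w` attribute; check both against your own .ENEX file. The function names `ocr_candidates` and `find_word` are made up for this example.

```python
import xml.etree.ElementTree as ET


def ocr_candidates(enex_xml):
    """Return one list per suspected text region; each list holds
    (candidate_text, weight) pairs taken from that region's <t> elements."""
    root = ET.fromstring(enex_xml)
    regions = []
    # Assumption: the OCR index is embedded as text (CDATA) in <recognition>.
    for reco in root.iter("recognition"):
        if not (reco.text and reco.text.strip()):
            continue
        # The embedded text is itself an XML document rooted at <recoIndex>.
        index = ET.fromstring(reco.text.strip().encode("utf-8"))
        for item in index.iter("item"):
            # Assumption: the "w" attribute on <t> is an integer weight.
            candidates = [(t.text or "", int(t.get("w", "0")))
                          for t in item.iter("t")]
            regions.append(candidates)
    return regions


def find_word(regions, word):
    """Return the regions in which `word` appears as a candidate
    (case-insensitive, exact match)."""
    word = word.lower()
    return [r for r in regions
            if any(text.lower() == word for text, _ in r)]
```

To try it, read the exported file as bytes, e.g. `regions = ocr_candidates(open("schedule.enex", "rb").read())` (the file name is hypothetical), then call `find_word(regions, "history")`. An empty result would suggest the word was never stored as a candidate, which would explain why search cannot find it.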


Archived

This topic is now archived and is closed to further replies.
