Cbiccum 0 Posted June 24, 2014 Hi all, I'm a premium member now, if that makes a difference. Before I start scanning my whole life I'm doing some testing, and I found this a bit troubling. I scanned the following wine bottle (good wine): https://www.evernote.com/shard/s3/sh/01e06fd5-84f2-432d-bfb6-27602551054b/f56883009dbe3c21b806e79bbc27317d When I search in Evernote for "Merlot", nothing comes up. But if I search for anything else on the label, like "laundry", "cabernet", or "Okanagan", this label does come up. Why won't it find Merlot?
ScottLougheed 1,316 Posted June 24, 2014 Puzzling. I loaded that image into a note on my own premium Evernote account. The desktop client on my Mac says the image has yet to be indexed (it's been about 10 minutes and multiple syncs), and as a result it is not searchable. Vexingly, if I log into the web interface and search for, say, "laundry", sure enough, it finds the text in the image! So it IS, in fact, indexed! What version of the Evernote client are you using?
Cbiccum 0 Posted June 24, 2014 Author My OS X client is version 5.5.1.
ScottLougheed 1,316 Posted June 24, 2014 OK, I am using 5.6.0 Public Beta 1, yet we are both experiencing issues. When you log into the web interface and search for a word like "laundry" or "dirty" (limiting the search in a sensible way so that you don't get piles of extraneous results), does it detect those words in the image, as it did for me? The version we are running on the desktop should have no bearing on what we see in the web interface.
Sentinel 195 Posted June 24, 2014 Scott, my guess is a photograph may OCR better in this case than a flat scanner, considering the curvature of the bottle.
ScottLougheed 1,316 Posted June 24, 2014 [Quoting Sentinel: "Scott, my guess is a photograph may OCR better in this case than a flat scanner, considering the curvature of the bottle."] Hmmmm, but OCR seems to be successful, given that the note turns up in search results just fine in the web client. It just seems like the desktop client isn't getting the message that the attachment is in fact indexed! Also, even this comparatively worse photograph of a wine bottle got OCR'd just fine: https://www.evernote.com/shard/s25/sh/eb5fc921-79f2-41d1-bb27-e3cb8000114c/8dff5e8e44b14fffa8bfb0909c15068d
Cbiccum 0 Posted June 24, 2014 Author Guys, FYI, I took the photo with my iPhone, and I think I had it on "document" mode. Scott, yes: when I search for "dirty" or "laundry", the text it found is highlighted on the label, yet it won't find "merlot". As you can see, "merlot" wraps around the bottle a bit, but the picture is very clear, and to me that shouldn't be an issue. That said, how long should I expect Evernote to take to crunch the data when I upload a photo or document?
Level 5 Adjusting 276 Posted June 24, 2014 I think the curvature of the bottle and the spacing of the letters are combining forces to confuse the OCR. The space between the E and the R is a particular problem. If you search for 'me riot' (without quotes), you'll see that merlot is highlighted.
ScottLougheed 1,316 Posted June 24, 2014 [Quoting Cbiccum: "Guys, FYI, I took the photo with my iPhone, and I think I had it on "document" mode. Scott, yes: when I search for "dirty" or "laundry", the text it found is highlighted on the label, yet it won't find "merlot". As you can see, "merlot" wraps around the bottle a bit, but the picture is very clear, and to me that shouldn't be an issue. That said, how long should I expect Evernote to take to crunch the data when I upload a photo or document?"] 1) I, too, am unable to get "2012" or "merlot" to show up in a search. Odd, since the characters that make up those words are fairly clear. 2) It should be a matter of minutes. Going by the results in the web client, I think this was OCR'd within 5 minutes (at least, about 5 minutes passed between my adding it to my own Evernote account and my trying the web client). The time to OCR depends on the overall server load and on whether you are a free or premium user. Premium users are pushed ahead in the OCR queue and so should see OCR results faster than free users. That said, when things are working, it's a matter of minutes (sync up, wait a few minutes, sync the OCR data back down) to get results. The trouble is that the desktop client doesn't seem to recognize that the image has been OCR'd. This is unrelated to how quickly the image gets OCR'd.
ScottLougheed 1,316 Posted June 24, 2014 [Quoting Adjusting: "I think the curvature of the bottle and the spacing of the letters are combining forces to confuse the OCR. The space between the E and the R is a particular problem. If you search for 'me riot' (without quotes), you'll see that merlot is highlighted."] Adjusting, any insight into why the desktop client might persist in saying the image has not been indexed?
Level 5* jefito 5,586 Posted June 24, 2014 You can find out for sure (at least for image OCR) by exporting the note to Evernote format (.enex) and searching for the desired text among the 'recoText' items. If it's not there, the OCR didn't recognize it, for whatever reason (fuzzy image, spacing, or other factors).
ScottLougheed 1,316 Posted June 24, 2014 [Quoting jefito: "You can find out for sure (at least for image OCRs) by exporting to Evernote format, and searching for the desired text among the 'recoText' items. If it's not there, the OCR didn't recognize it, for whatever reason (fuzzy image, spacing, and other factors)"] But the strangeness is that it is OCR'd, because searching in the web client returns (limited) results. It just seems like the desktop client isn't recognizing that the image has been OCR'd.
Level 5* jefito 5,586 Posted June 24, 2014 Oh, I'm sure it's OCR'd, but that's just a fancy word for 'guessing' -- it doesn't mean that it always gets every word just exactly perfect. This is based on what Adjusting is reporting. And to see what the guesses are, you can examine the recoText items in a .ENEX file. I'll clip the image to my account to check it out.
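The check jefito describes can be scripted. What follows is only a rough sketch under assumptions that are not confirmed anywhere in this thread: that each `<recognition>` element in the exported file wraps a small recoIndex XML document, and that each guess sits in a `<t w="...">` element inside an `<item>` region (these may or may not be the 'recoText' items jefito means). The function names and the `SAMPLE` text are hypothetical and hand-written, not taken from the wine label note.

```python
import xml.etree.ElementTree as ET


def reco_candidates(enex_text):
    """Return (text, weight) pairs for every OCR guess in an ENEX export.

    Assumption: each <recognition> element wraps a recoIndex XML document
    whose <item> regions hold <t w="..."> candidate strings.
    """
    candidates = []
    root = ET.fromstring(enex_text)
    for recognition in root.iter("recognition"):
        payload = (recognition.text or "").strip()
        if not payload:
            continue
        # The recognition payload is itself XML, so parse it a second time.
        reco_index = ET.fromstring(payload)
        for guess in reco_index.iter("t"):
            weight = int(guess.get("w", "0"))
            candidates.append(((guess.text or "").strip(), weight))
    return candidates


def find_guess(enex_text, word):
    """List the candidates whose text equals the word, ignoring case."""
    wanted = word.lower()
    return [c for c in reco_candidates(enex_text) if c[0].lower() == wanted]


# Hypothetical sample data, written by hand for illustration only.
SAMPLE = """<en-export><note><title>Wine label</title><resource>
<recognition><![CDATA[<recoIndex objType="image">
<item x="10" y="20" w="80" h="30"><t w="90">laundry</t></item>
<item x="10" y="60" w="40" h="30"><t w="55">me</t></item>
<item x="55" y="60" w="60" h="30"><t w="48">riot</t><t w="31">rlot</t></item>
</recoIndex>]]></recognition></resource></note></en-export>"""

print(find_guess(SAMPLE, "laundry"))  # [('laundry', 90)]
print(find_guess(SAMPLE, "merlot"))   # []
```

For a real export, the file contents would be read into a string first and passed in place of `SAMPLE`. An empty result for a word would suggest the OCR never produced that guess, which is separate from the desktop client's "not indexed" message.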
Level 5 Adjusting 276 Posted June 24, 2014 [Quoting ScottLougheed: "Adjusting, any insight into why the desktop client might persist in saying the image has not been indexed?"] No idea. I've reported the issue internally, so someone will look into it. I'll let you know if we find anything.
ScottLougheed 1,316 Posted June 25, 2014 [Quoting Adjusting: "No idea. I've reported the issue internally, so someone will look into it. I'll let you know if we find anything."] Cheers!
BurgersNFries 2,407 Posted June 25, 2014 As jefito said, OCR'ing images is "guessing". OCR'ing text is more accurate, but still not dead on, IME. When OCR'ing images, a tree of possibilities is created to allow for low-res camera phones and poor handwriting. So the word 'house' may show up when looking for 'horse'. IMO, if you want to be able to find the notes using certain search terms, it's best to add those terms (as keywords) to the note.
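To illustrate the "tree of possibilities" idea, here is a toy model; it is an assumption for illustration and not how Evernote's search is actually implemented. Each recognized region carries several weighted guesses, and a search matches when every query term equals some guess somewhere in the image. The `image_matches` function and the `REGIONS` data are hypothetical.

```python
def image_matches(regions, query):
    """True if every query term equals some candidate guess in some region."""
    guesses = {text.lower() for region in regions for text, _weight in region}
    return all(term in guesses for term in query.lower().split())


# Hypothetical regions: each inner list holds the weighted guesses for one
# patch of the image. A word split across two regions yields two short guesses.
REGIONS = [
    [("laundry", 90)],
    [("me", 55)],
    [("riot", 48), ("rlot", 31)],
    [("house", 60), ("horse", 42)],
]

print(image_matches(REGIONS, "merlot"))   # False
print(image_matches(REGIONS, "me riot"))  # True
print(image_matches(REGIONS, "horse"))    # True
```

Under this model, a word the OCR split in two is never found as a whole, while a lower-weight alternative like 'horse' still matches a region whose top guess is 'house'. That is consistent with what Adjusting and BurgersNFries describe, and it is why adding the term to the note as a keyword is the dependable fix.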
This topic is now archived and is closed to further replies.