
(Archived) OCR improvements


Lougoose

Idea

I love Evernote's OCR abilities (even though the OCR can't seem to read my handwriting at all)…

That being said, there is always room for improvement, and I assume that the Evernote team is working on improvements.

When you do make changes/improvements to the OCR, do you re-OCR every past image/PDF, or just new ones?

Just wondering.

Link to comment

3 replies to this idea


When we change our software, we tend to just process new images and PDFs rather than retroactively changing older ones unless the quality is drastically better for some users. We want to avoid making any changes that would make any of your older searches fail ... e.g. if a new algorithm misses a word that an old algorithm found, then you might not be able to search for that old note any more.
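To make the regression concern concrete, here is a minimal sketch of how one might check whether re-running OCR would drop words that an older pass had found. The function names and the simple tokenizer are illustrative assumptions, not a description of Evernote's actual pipeline.

```python
import re
from typing import Set


def searchable_terms(ocr_text: str) -> Set[str]:
    """Lowercase the OCR output and split it into simple alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", ocr_text.lower()))


def lost_terms(old_text: str, new_text: str) -> Set[str]:
    """Terms the old OCR pass found that the new pass no longer produces.

    Any term returned here is a search that used to hit this note
    and would stop hitting it after reprocessing.
    """
    return searchable_terms(old_text) - searchable_terms(new_text)


# Hypothetical outputs from an old and a new OCR pass over the same image.
old_pass = "Invoice from Acme Corp"
new_pass = "Invoice from Acrne Corp"
print(lost_terms(old_pass, new_pass))
```

If a check like this reported any lost terms for a note, reprocessing that note would break at least one search that previously worked, which is the risk described above.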

For handwriting, you'll find that you get much better results with JPEG images.

Link to comment

Interesting, thanks!

Is there any way to have old notes re-processed? Or perhaps offer this only to premium users (I am one, but still).

Also, I don't want you to give away any trade secrets, but why is the OCR better with JPEGs (for handwriting, anyway)? It does kinda get in the way when I'm scanning my class notes (which obviously work best as PDFs, with multiple pages, etc.).

Thanks!

Link to comment

There's no way to force the service to reprocess your files, unless you change the file (e.g. by adding a pixel to the JPEG), which basically makes it a different file.
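As a rough illustration of why a one-pixel change counts as a different file, the sketch below assumes the service recognizes files by a hash of their bytes. That is an assumption made for the example (the post only says a changed file is treated as different), and the function names are hypothetical.

```python
import hashlib
from typing import Set


def content_fingerprint(data: bytes) -> str:
    """Fingerprint a file by hashing its raw bytes (MD5 chosen for illustration)."""
    return hashlib.md5(data).hexdigest()


def needs_processing(data: bytes, seen: Set[str]) -> bool:
    """Return True only the first time a given byte sequence is submitted.

    Assumed behaviour: files whose fingerprint is already known are
    skipped, so identical bytes are never OCR'd twice.
    """
    fingerprint = content_fingerprint(data)
    if fingerprint in seen:
        return False
    seen.add(fingerprint)
    return True


seen: Set[str] = set()
original = b"\xff\xd8 stand-in for JPEG bytes"
modified = original + b"\x00"  # any change to the bytes, however small

print(needs_processing(original, seen))  # first upload
print(needs_processing(original, seen))  # identical bytes again
print(needs_processing(modified, seen))  # changed bytes
```

Under that assumption, re-uploading the same bytes is skipped, while even a single changed byte yields a new fingerprint and is handled as a new file.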

Our PDF solution is basically a "best of breed" PDF OCR system that produces an OCR version of a PDF, with one answer per word. This allows you to copy and paste from the "searchable" PDF, for example.

Our JPEG solution is more of a hybrid that produces a custom internal XML format that creates a tree of possibilities for every word in the document. This tree of possibilities makes it more appropriate for fuzzy images and handwriting, but it can also result in more "false positive" matches where you incorrectly "find" an image when you search.
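The difference between "one answer per word" and a set of possibilities per word can be sketched as follows. The data layout, the scores, and the function names are invented for illustration; the actual internal XML format is not described in this thread.

```python
from typing import List, Tuple

# One candidate reading of a word: (text, recognizer confidence).
Candidate = Tuple[str, float]


def matches_single(words: List[str], query: str) -> bool:
    """PDF-style index: each word position keeps only its single best guess."""
    q = query.lower()
    return any(word.lower() == q for word in words)


def matches_alternatives(
    positions: List[List[Candidate]], query: str, min_score: float = 0.0
) -> bool:
    """JPEG-style index: each word position keeps several candidate readings.

    A query matches if any candidate at any position equals it and
    clears the (hypothetical) confidence threshold.
    """
    q = query.lower()
    return any(
        text.lower() == q and score >= min_score
        for candidates in positions
        for text, score in candidates
    )


# Hypothetical recognition of the handwritten phrase "meeting notes".
best_guess = ["meeling", "notes"]
alternatives = [
    [("meeling", 0.6), ("meeting", 0.3), ("melting", 0.1)],
    [("notes", 0.9), ("rotes", 0.1)],
]

print(matches_single(best_guess, "meeting"))          # best guess was wrong
print(matches_alternatives(alternatives, "meeting"))  # found among alternatives
print(matches_alternatives(alternatives, "melting"))  # a false positive
```

In this toy data, keeping the alternatives recovers "meeting" even though the top guess was wrong, but it also matches "melting", which the author never wrote. Raising `min_score` trades some of that recall back for fewer false positives.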

Link to comment

Archived

This topic is now archived and is closed to further replies.
