Why no OCR search on handwritten notes?


gleicher

I have been using Evernote for 2 months - it's great so far, except:

It doesn't seem like it is doing handwriting recognition on scanned documents (which are uploaded as PDFs).

I was wondering: is this not working because I am not paying for premium (yet), so I don't get searching in PDFs? I was hoping to try this out before I pay for premium (chicken and egg here: I am only interested in premium if it is good at letting me search my hand-written notes).

Or is there some other explanation (like, handwriting recognition doesn't work in PDFs, or only works from photographs, or ...)

Thanks!

mike

Link to comment

For handwriting, you're better off with JPEG images. We're using more of a standard "OCR" package for PDFs, which is designed more for scanned documents. JPEG images are processed with a different system that assumes a lower quality of input (e.g. from camera phones). This produces a "tree" of possible matches for every word, which works better for handwriting.

Link to comment

Thanks - is there an easy way to convert a PDF to an image format (preferably after uploading)? (my fast scanners all give me multi-page PDFs which is really convenient, except for this). Is there a good image file format to handle the multi-page documents (if i upload each page as an image, i'll need to keep the ordering)?

thanks,

mike

Link to comment

Check your scanner software settings. You should be able to specify you want the scans to be saved as an image rather than PDF. The ease of file naming will depend upon your software. The version of Paperport that came with my Xerox Documate 510 will let me load up the automatic document feeder with a stack of pages. But if I'm scanning each page as an image, I have to manually add the file name after each page is scanned. (After providing a unique file name, the software then scans the next page in the feeder, asks the name and so on & so on.)

OTOH, the software that came with my HP Scanjet 4890 will automatically append _xxxx (where xxxx is the next available sequential number) to the file name.
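The sequential-suffix naming described above is what keeps a multi-page scan in order once every page is its own image file. A minimal sketch, assuming a hypothetical `page_names` helper and a four-digit zero-padded counter (the `_xxxx` width comes from the post; the function name, base name, and extension are illustrative only):

```python
def page_names(base, page_count, ext="jpg", width=4):
    """Return zero-padded per-page file names such as notes_0001.jpg.

    Zero padding matters: plain alphabetical sorting then matches page
    order (without it, page 10 would sort before page 2).
    """
    return [f"{base}_{i:0{width}d}.{ext}" for i in range(1, page_count + 1)]


names = page_names("meeting_notes", 12)
# Sorting the names alphabetically leaves them in page order.
print(sorted(names) == names)
```

If the images are then dragged into a note in name order, the pages should stay in their original sequence.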

I suppose if I wanted to get really trick, I'd try to integrate the HP software (automatic file naming) with the Xerox scanner (automatic document feeder), but this isn't really a high priority for me, since I save all my multipage documents as PDFs anyway. (And therefore only have to assign one name per document.)

If you have many multipage, handwritten documents you're intent on getting OCR'd by EN, it may be pretty kludgy. A better option (IMO) may be to save the multi page, handwritten documents as PDFs & add in keywords or a descriptive title. IMO, it's pretty cumbersome to try to look through a multi page document, when each page is scanned as an image.

Link to comment

thank you

unfortunately, most of the time things come from an industrial-strength scanner that emails me pdf files - there are no options. my little scanner is too slow to deal with stacks of paper. even when i do use the little scanner (an HP multi-function), i usually want to run things through Acrobat Pro's "optimize scanned files" when they come off that device.

(as an aside - since this is really about workflow - jpeg compression on scanned handwriting (with the sharp edges) won't be so great. the adaptive compression schemes in acrobat can pick schemes that work well on the line drawings (like handwritten text) or perceptual DCT methods (like JPEG) when needed. and lossless image compression would lead to huge files since it can't deal with the scanning noise gracefully)

i guess the short answer is that evernote isn't as ideal for dealing with my scanned notes as i would have hoped, and that the premium product won't help much. it's still handy to have all the PDFs in one place and to be able to browse them quickly.

Link to comment
  • 5 months later...

I've been using Penultimate on the iPad to keep handwritten notes and then emailing them as multipage PDFs to my Evernote account. I had assumed that I would then be able to search these notes using Evernote once they'd been OCR'd.

I later found that none of these notes had been OCR'd but when they were sent in another image format such as JPG the handwriting was in fact OCRd.

Is EN ignoring anything but typed text PDFs for OCR? I say this because I've been successful with typed PDFs but not with handwritten ones.

Why not send in handwritten notes as JPG or PNG? Simple: it's because I can't have multipage documents in these formats like I can with PDFs.

Any help from Evernote Support on how to get around this limitation?

Link to comment

I am betting the PDFs are actually being OCRd…Evernote's PDF OCR tech just isn't as good (no offense Evernote).

When Evernote OCRs an image, they create multiple possibilities for each word. So if you write "Lap", you may be able to search for "lap" or "lop" or "lup" or dozens of other possibilities, and they will all show up. (just an example, not necessarily true).

However, PDFs are OCRd differently. They are limited to a single reading per word, so that you can export the PDF as a "searchable" PDF - one that has been converted to text. Obviously they can't include 20 possibilities for each word in that PDF; it would just be jumbled garbage. Therefore, PDF OCR is a lot harder with handwriting.

To check that your notes have indeed been OCRd, right-click them and choose "Save Searchable PDF" - assuming the option is there, you might find it is jumbled nothingness…I know I have had such an experience more than once.

It's less than ideal, but at least we know the issue is out there and can hopefully work around it.

Link to comment

Also, I THINK the PDF OCR functionality is a premium-only feature, although it has been a while since I was non-premium, so I could be remembering that wrong. Something else to check though.

Link to comment

We only do OCR on PDFs for Premium accounts. The way we handle PDFs is a bit different than images, since we assume a clean scan of printed text, and we produce a single transcription of the PDF. For images, we have more of a "fuzzy" match that produces trees of possibility that can be searched (but not easily extracted). So handwriting will usually not be handled very well on PDFs, but will work well in JPEG documents.
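To make the difference between the two pipelines concrete, here is a rough sketch. It is not Evernote's implementation: the data shape (a list of candidate readings per word), the function names, and the sample note are all assumptions made for illustration. The image path indexes every candidate reading, while the PDF path keeps only the single best one.

```python
from collections import defaultdict


def build_image_index(notes):
    """Index every candidate reading of every word (the "fuzzy" image path).

    `notes` maps a note id to a list of words, where each word is a list
    of candidate readings, best guess first.
    """
    index = defaultdict(set)
    for note_id, words in notes.items():
        for candidates in words:
            for reading in candidates:
                index[reading.lower()].add(note_id)
    return index


def build_pdf_index(notes):
    """Index only the best reading of each word (the single-transcription path)."""
    index = defaultdict(set)
    for note_id, words in notes.items():
        for candidates in words:
            index[candidates[0].lower()].add(note_id)
    return index


# Hypothetical recognizer output for one handwritten note: the best guess
# for the second word is wrong, but the right reading is among the candidates.
notes = {"note-1": [["my"], ["horse", "house", "hoarse"]]}

image_index = build_image_index(notes)
pdf_index = build_pdf_index(notes)

print("house" in image_index)  # found through an alternative reading
print("house" in pdf_index)    # missed, because only "horse" was kept
```

The trade-off is the one described above: the candidate index finds more handwriting, but it cannot be flattened into one clean transcription for a searchable PDF.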

Link to comment

Is there any way this can be changed? I've already scanned hundreds of hand written notes into evernote as PDF over the last year expecting the OCR to work the same as images.

To go back and now export those PDFs as images would take me an incredible amount of time.

I've lost one of main advantages I thought I had with evernote.

I only realized this problem today, and found this thread, when I couldn't find a document I was searching for.

I don't care about the PDF being searchable, I care about my evernote documents being searchable using the same algorithm that is used for images.

Any help with this from anyone?

Link to comment

Is there any way to extract all PDFs that are in evernote and convert them to images? I have about 450 that need to be converted because of this limitation. Many of them are multi-page.

Thanks for any help.

Link to comment

The best method I found online (free or paid, really) is to use Preview and iPhoto…

The method will be a bit time consuming, since there isn't really an easy way to do it, but with a bit of patience you will manage. If you put in let's say, 1 hour each day, you will probably hit the free account upload limit (40MB I believe) before the month. So that will probably be the bottleneck (that's just a guess though).

Basically, the method is as follows (try it once or twice first to make sure it does exactly what you want, so you get proper quality and everything):

1. Open the PDF in Preview.

2. File>Print>PDF>Save PDF to iPhoto. --- this saves JPG versions of each page to a new album called "Album"

3. Drag those pictures to your Evernote note. Delete the original PDF (unless you have a reason to keep it).

4. Delete pictures from iPhoto (so you don't get confused).

It will be a hassle, no doubt, but there isn't a really easy way to do this, especially when you consider keeping notes intact (title, tags, notebook, etc.). You may be able to write an AppleScript or Automator script for part of this, but it may be easier (and more comforting) to do it yourself. After all, with that many PDFs, you will probably not be able to do them all in 1 month anyway.

Link to comment
Is there any way to extract all PDFs that are in evernote and convert them to images? I have about 450 that need to be converted because of this limitation. Many of them are multi-page.

Thanks for any help.

Lougoose sounds like he's spent a bit of time on this. But you could Google on the words PDF JPG convert. Also, whatever method you go with, I'd suggest you test the results, to confirm things are working the way you think they are, before doing 450 documents.

Link to comment

Thanks for the help.

I think I can find some software to batch convert to images, but how do I batch extract 450 PDF documents from evernote?

The only way I can see is to go into each note and save the document out...that would take me quite a while with that many PDFs. :-/

Maybe I'll just manually copy images and do about 4 a day, then in 4 months I'll have all my documents out as JPG.

Link to comment
  • Level 5

Whoa! Sounds like total overkill in my opinion.

I posted my suggestion on one of your other posts.

You certainly don't need Evernote to OCR every word in your PDF. Words like the, of, to, and, a, in, is, it, you, etc.

So, just type in some of the key words and main thoughts in the space above or below the PDF. This might not be painless, but it will be a lot easier and faster than converting to JPG's. And with keystroked characters, your search results will be more accurate.

Link to comment

That is a good idea, except my handwritten notes are almost always full pages of text (I don't like to waste space).

To retype, even the most relevant words, might be slower than just copying the images from the PDFs into evernote.

I don't understand why evernote's searching capability needs to conform to PDFs searching capability. Treat them as images for search indexing (they can already extract images from PDFs as shown by the preview).

Link to comment
I don't understand why evernote's searching capability needs to conform to PDFs searching capability. Treat them as images for search indexing (they can already extract images from PDFs as shown by the preview).

Lougoose explained it pretty clearly, I thought. But if you need more info:

viewtopic.php?f=38&t=15526&p=61511&hilit=best+of+breed#p61511

Link to comment
  • Level 5

Just curious - why would you have 450 full pages of handwritten text?

The last time I hand-wrote a page of text was decades ago while in college.

If the information is really important, you can easily farm the transcription job out for cheap money. Since it is already digitized, you could send it off to a company in India (or elsewhere) for the transcription. I just had 1,400 35mm slides digitized and converted to hi-res JPG's in India for a mere pittance.

Link to comment

Unfortunately, I don't have time to wait for a computer to boot up when taking most of my notes. That and it doesn't feel as personal to my clients when I'm typing on a laptop and clicking away...it's a perception thing for them.

And I wish it was just simple text, but often it's workflow diagrams, drawings of objects in 3d space...and lots of notes on those items as well.

I'd hate to think that I'm unique with my request.

It seems like all the pieces are there:

1) Image extraction for PDFs - Check

2) OCR algorithm for hand written notes in images - Check

It's just putting the two together now.

Of course, it might not solve my problem of having already scanned a years worth of notes into PDF.

Link to comment
  • Level 5

Someone else might come up with a better idea. Perhaps a Chemistry or a Math major could chime in with their method. They do a lot of drawings, special symbols and text.

If converting from PDF to JPG is too time consuming, or typing key words and phrases does not work, or paying for a transcription service does not work, I would suggest you rely on the search method you've been using for searching these documents prior to Evernote.

Benefits: no extra time spent on file conversions, no extra typing required, no extra cost

Link to comment

I'm going to go with the PDF to JPG method. 5 minutes a day for the next few months isn't too bad. Persistence always pays off!

And before evernote...I had nothing! It was all memory and stacks of notebooks in a shelf.

Thank you very much for your suggestions, I do appreciate everyone taking the time to respond.

Link to comment

Persistence definitely pays off. And you have the right idea…just do a few each day and you'll be done before you know it (well, you'll know when you're done, but you get the idea).

If you do hit the monthly limit, you can still put new notes into Local notebooks…then you can move them when the new month comes around. They won't be OCRd until they are in Synchronized notebooks, but at least the conversion is done. Alternatively, you could switch to a Premium account for a month or two ($5/month), so that you get this whole project done faster.

I would say wait until you see how many you do each day and if you do hit the limit before going Premium though. May as well be sure you need it first.

Just a thought!

Link to comment
We only do OCR on PDFs for Premium accounts. The way we handle PDFs is a bit different than images, since we assume a clean scan of printed text, and we produce a single transcription of the PDF. For images, we have more of a "fuzzy" match that produces trees of possibility that can be searched (but not easily extracted). So handwriting will usually not be handled very well on PDFs, but will work well in JPEG documents.

Ah drat! I wish you guys at Evernote had told us about your 'assumption' explicitly. I suspected that handwritten PDFs were not being OCR'd like handwritten images, and this is now a pain.

Any way you can make this some kind of user selectable feature?

BTW I am one of your PREMIUM evernote customers. I think you once said we represented 1 or 2% of your users so I think we get special treatment =)

I'm really liking 'Penultimate' on my iPad for capturing multipage handwritten notes and then emailing them as PDFs to Evernote. All this time (couple of weeks) I thought they were being OCRd until I went to search for something and got a big fat ZERO.

thanks.

Link to comment

I am also a premium user if that makes much of a difference.

I even bought the Evernote notepad from the trunk to take more handwritten notes.

Please add this as a feature soon!

Link to comment
  • 5 weeks later...

This forum has been quiet for almost a month and i don't want the Evernote folks to get complacent.

I'm annoyed at their implicit admitted assumption that PDF OCR only applied to typed text and not handwriting since the behavior we've been constantly told about has been that Evernote OCRs all handwriting.

I also paid for OCRing of my PDFs when i upgraded to premium.

The least they could do as a fix would be to add a user selectable option for 'scan PDF for handwriting'.

Link to comment

Yeah, well good luck with that. First, I don't know that EN has even stated this is something they choose to do.

Second, if it is something they choose to do, they must balance their resources. I would guess this is not an easy change & this is not the only thing their users are clamoring for. It's not something where they just put in a user selectable option b/c it affects how the indexing on the servers works. So even if they choose to do it & today it's the number one thing to accomplish on their To Do list, it's probably going to take several months to accomplish.

Third, the OCR'ing you "paid" for with your premium subscription is the OCR you've got right now. IF you want to OCR handwriting, there are several good OCR programs out there. Of course the better ones are the ones that are a bit pricey. But if you want it NOW, that's going to be the only option I see at this point and for quite a while.

Fourth, you say "since the behavior we've been constantly told about has been that Evernote OCRs all handwriting." Cite please.

Link to comment

Evernote pushes handwriting recognition as a big part of its advertising. It's what got me into it as well. Go to evernote.com and it says "Capture Everything" and has handwritten notes on it. Then it says "Find things fast, Search by title, tag or even printed and handwritten text inside images".

It does say "inside images"...so I will have to concede they were upfront about it, but it would be nice if there was a "No PDF" disclaimer on there. I consider PDF an image format because I can open it with photoshop and save it back to PDF.

The fact that they don't do it for PDF doesn't make sense...they made an assumption that wasn't true and it needs to be revived every once in a while so that it gets fixed.

I still love the product, but this one feature really hurts me when it comes to multiple page notes.

Link to comment

The OCR technology we've licensed to process PDF documents doesn't yet contain great handwriting recognition, but this is something on the list for R&D to improve in the future.

Thanks

Link to comment

Dave - thanks for taking the time to highlight this and i hope my comment didn't come across earlier as overly aggressive. i think evernote is great relative to what's out there but wonder if it's not possible for someone else to create a competitor that tempts your users away with more congruent features that address these 'little things'.

regarding the point you made about the software, while i accept that may be a limitation to PDF scans, wouldn't there be a way to incorporate my earlier point about a 'scan for handwriting' optional checkbox that would then run the attached PDF through your handwriting image processor? just an idea especially considering that all image manipulation and viewing apps i've used on OSX do open PDFs.

Link to comment

We recognize that this limitation doesn't make much sense from the user's perspective, but we don't have a simple fix for this due to the design of the PDF OCR software that we're using. (I.e. it behaves as a "black box" that doesn't allow us to insert our own custom handwriting solution into the mix.)

This has come up with R&D several times, and we do hope to improve this in the future.

Link to comment
  • 1 month later...

Hi,

if I take a photo of an handwritten text, is this recognized by Evernote iPhone version?

Moreover, I've tried to do a search on a pdf and it seems that pdf searching is available also on the free version (without highlighting), is it right or not?

Link to comment

If you take a JPEG picture of something with handwritten text, Evernote should do a decent job of searching for that note. (It may take up to a few hours to process the image, however.)

If you use a PDF document, handwriting isn't recognized as well. I.e. PDF scans for Premium accounts give good results for scanned printed text, but not great for handwriting.

Link to comment

If your original PDF contains text that you can select/copy/paste, then you can put that into your account and search for it even if you have a Free account.

If your PDF is a scan with no selectable text in it, then you won't be able to search for that from a Free account.

If you have a Premium account, we'll do OCR on that scan to produce a "searchable" version, and index that for searching.
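The three cases above boil down to a small decision rule. A minimal sketch, assuming a hypothetical `pdf_is_searchable` helper (the name and its two boolean inputs are illustrative, not part of any Evernote API):

```python
def pdf_is_searchable(has_text_layer, is_premium):
    """Return True if the PDF's contents should turn up in search.

    has_text_layer: the PDF already contains selectable text.
    is_premium: the account gets server-side OCR for scanned PDFs.
    """
    if has_text_layer:
        return True  # embedded text is indexed on Free and Premium alike
    return is_premium  # image-only scans depend on the Premium OCR pass


for has_text in (True, False):
    for premium in (True, False):
        print(has_text, premium, pdf_is_searchable(has_text, premium))
```

Note that "searchable" here only covers what the PDF OCR can read; as mentioned earlier in the thread, that means printed text much more than handwriting.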

Link to comment
  • 4 weeks later...

I've decided to write a program to accomplish this since I think there is a need for it.

Here's what I have so far:

* Check all notes to see which ones are just .pdf and text

* downloads pdf

* converts to .png images of decent quality

* adds them back into the note and upload
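The first step in that list (finding notes that are just a PDF plus text) could look roughly like the sketch below. The `Note` class and its `attachment_types` field are made up for illustration and do not mirror the real Evernote API; downloading, converting, and re-uploading are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    attachment_types: list = field(default_factory=list)  # MIME types


def needs_conversion(note):
    """True if the note has attachments and all of them are PDFs."""
    types = note.attachment_types
    return bool(types) and all(t == "application/pdf" for t in types)


notes = [
    Note("Client meeting", ["application/pdf"]),
    Note("Whiteboard photo", ["image/jpeg"]),
    Note("Already converted", ["application/pdf", "image/png"]),
    Note("Text only"),
]

to_convert = [n.title for n in notes if needs_conversion(n)]
print(to_convert)
```

Skipping notes that already hold images next to the PDF keeps a repeated scan of the account from converting the same note twice.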

Does anyone have any suggestions? I'm thinking of selling it for $2/month and it'll constantly scan your account for new .pdfs. Want to get some feedback before I get it into production quality.

Link to comment
  • 3 months later...

for converting the pdfs to images, if you have access to a linux machine, I suggest Imagemagick.

from the command line you can simply navigate to the folder full of pdfs and run:

convert *.pdf new.png

you'll end up with tons of images named new-0.png, new-1.png, etc.

You might be able to run

convert *.pdf *.png

and preserve file names, but I'm not at my Linux box so I can't test it.

this will also work to convert them to jpgs or a ton of other formats
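One way to keep each PDF's own name is to convert the files one at a time instead of relying on wildcards. A hedged sketch in Python, assuming ImageMagick's `convert` command is installed with PDF support (on ImageMagick 7 the command may be `magick` instead); the helper names and the 150 dpi `-density` value are illustrative choices, not something from the posts above:

```python
import subprocess
from pathlib import Path


def convert_command(pdf_path, density=150):
    """Build the ImageMagick command that renders one PDF to PNG.

    The output name reuses the PDF's stem (scan.pdf -> scan.png). For a
    multi-page PDF, ImageMagick adds a page index to that name.
    """
    pdf_path = Path(pdf_path)
    out_path = pdf_path.with_suffix(".png")
    return ["convert", "-density", str(density), str(pdf_path), str(out_path)]


def convert_folder(folder):
    """Run the conversion for every PDF in `folder` (needs ImageMagick)."""
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        subprocess.run(convert_command(pdf_path), check=True)


# Show the command that would be run for a single file.
print(" ".join(convert_command("scan.pdf")))
```

Calling `convert_folder(".")` in the folder full of PDFs would then process each file in turn; it is worth trying on one or two documents first to check the image quality.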

for windows users, google yielded this list:

http://www.makeuseof.com/tag/6-ways-to- ... pg-image/#

Link to comment

Late to the party here, but if ever there was a job that could take advantage of AppleScript (possibly with Automator), this looks like it to me.

It would require some scripting chops, but there are resources to help with that.

Link to comment

FYI - we're getting a lot closer to deploying a new PDF processing pipeline that will handle handwriting better than before. Hopefully we can deploy this in the next month or so.

Link to comment
  • Level 5

When you release it, instead of saying it is improved, please give some additional information that illustrates the improvement.

Link to comment
  • 7 months later...

I have recently bought a new Apple iPad and I want to install one application on this iPad for editing the documents but the condition is that it should be having a Handwriting Recognition feature in it. Suppose I want to make an inspection sheet with questions and then save it to iTunes on my laptop. After moving this application to my iPad, I should be able to answer those questionnaire by writing with stylus. Does anybody know any such application?

Link to comment
  • 4 weeks later...

It may come as a bit of a shock that Evernote doesn't support searching of handwritten notes in PDF files (or it didn't when I last looked). Evernote only does its handwriting search magic on image files like JPEGs.

However, if you use a Mac, you can use Automator to turn any PDF into a collection of images, then just drag those into an Evernote note.

In case anyone would find it useful, the automator instructions are on my blog:

http://angusbradley....the-handwriting

Or is there a better way?

Link to comment

To clarify, I think EN does try to OCR handwriting in PDFs. The issue (and therefore the reason it's better to scan handwriting as jpgs) is this:

For handwriting, you're better off with JPEG images. We're using more of a standard "OCR" package for PDFs, which is designed more for scanned documents. JPEG images are processed with a different system that assumes a lower quality of input (e.g. from camera phones). This produces a "tree" of possible matches for every word, which works better for handwriting.

IOW, with text, it's very easy to distinguish "house" from "horse". It's either "house" or "horse". Period. With handwriting or lower res images of signs, the tree of possible matches means that an image with the word "house" may show up in a search for the word "horse" because it's not an exact science. My handwriting is horrible. (I attribute it to decades of keyboarding.) So I may write the word "house" but it really may look like "horse." Hence, the tree of possible matches...

Link to comment
  • 1 month later...

So was this ever updated? Do we have handwriting recognition in PDFs? With my new Boogie Board RIP this would be REALLY useful.

I've been searching for alternatives for evernote that I could switch to because of the non-existing handwriting recognition for PDF, but unfortunately none are available. As soon as someone has that feature I'm making the switch unless evernote gets it first.

So...do we have handwriting recognition in PDFs yet?

Link to comment
  • 1 month later...

Team!

I'm new to EN. I also did a ton of digging before I decided to create a thread. If this has been covered, apologies.

I have a small conundrum. In short, my scanned documents (PDF) do not get OCR processing by EN.. and I don't understand why. I'd really like for my scanned docs to be search-able.

I'm using a Xerox WC7655. I double checked this knowledge base entry to help make sure my documents meet the OCR processing criterion, they do.

https://support.ever...3+9&docID=12656

My doc is a full page meeting agenda with text, and some sloppy handwriting courtesy of yours truly. I can't attach it here because of sensitive business data on the page.

Here's my test to help troubleshoot the issue:

  1. Photograph document using iPhone, add to EN.
    • Result: EN performs OCR on text and sloppy handwriting. High-five, go team.

  2. Scan same document with Xerox 7655, add to EN.
    • Result: Flat 'image' PDF in Adobe Reader and in EN. Not search-able, no EN OCR processing occurs, at all. Unsuccessful result.

  3. Scan same document with Xerox 7655, but this time enabling OCR on the Xerox.
    • Result: Xerox OCR algorithm converts type, but not the handwriting. So it 'works', but not as well as the EN OCR processing. The EN support doc says that if OCR pre-exists, they don't process it.

So I really want process #2 to work correctly and have EN perform the OCR work... any suggestion on what I might be doing wrong here?

Just to add, I'm not a Premium user (yet), I want to see that this works first. I have allowed a few hours to pass in order to work through the OCR queue.

Link to comment
  • Level 5*

Hi Chris

If you're not a prem user, your OCR will have to wait until we privileged paid-for types get service. I suspect that may be your root problem. However, PDF OCR does not include hand-scrawled text, so if you're looking for a solution that covers the handwriting, JPGs may be the way to go. You mention that if you OCR this stuff yourself you get a result (subject to the minor omission mentioned above) - so what's wrong with continuing to OCR the stuff yourself so you don't have to wait in future?

- and bear in mind that you can submit two or more items per note.. so you could add an OCR'd PDF file for the typed text, and pics - or parts of pics - for the related handwritten comments.

Link to comment

I was going to say the same... you may just have to wait.

Many of us users (at least me) always conduct OCR prior to EN.

Lots of discussions about that but personally I don't rely on EN. Not that it's not that good but if I ever export it... it will always be there ;)

Link to comment

Thanks for the reply!

I thought maybe the OCR queue might be the cause. What confused me was that the JPG photo taken with my iPhone completed the EN OCR process in a matter of minutes... while the scanned PDF has been sync'd to the cloud for about 2 days now. It could be my error by assuming that the OCR queue was agnostic to file types, but the JPG queue may be separate from the PDF queue (longer wait?). Not sure.

Regarding the 'handwriting'... I exaggerated slightly. My handwriting isn't completely mangled, just 'guy' handwriting. ;) I was shocked and impressed at how the EN OCR was able to detect and make search-able everything i had written on the page. A majority of the notes I'd like to scan are hand-written, that is why I got excited when I saw EN working so well with it. I agree that the Xerox OCR will work for typed docs.

My ideal solution would be to bulk scan the masses of stuff on my desk, and let EN allow me to search through everything. Typed, and within reason, handwritten. 100% detection of handwritten notes isn't realistic, and I realize that.

Thank you again for the reply!

Link to comment
  • Level 5*

No problem - if it helps I have around 7,000 notes with mixed JPG / DOC / PDF / Webclip and other content, some of the PDF files being well over the page limit for Evernote OCR. I OCR everything I can before uploading, with suitably titled and tagged notes. All of this is now electronic, but it used to be a six-foot high by around 10-foot long set of shelving groaning with folders of various types. Getting stuff filed into that scenario was daunting, and finding anything was.. unreliable. I now have this external hard drive sitting on my desk and do all my filing and finding more reliably and sitting down. And a wheelbarrow now occupies the previously allocated document storage area. I'd say you can pretty much rely on Evernote to tidy up your working area - all you have to do is get started!

- and I don't even work for these guys ;)

Link to comment
  • Level 5*

similar experience here. primarily pdf and text. it took a really long time, but i have digitized several bookcases worth of books and notes. my file cabinets and file boxes have all been digitized as well. going paperless was definitely not easy, but it was well worth it. i also ocr before uploading.

Link to comment
  • 2 weeks later...

Ditto. I converted my entire library (> 500 books) to OCR'd PDFs. A local copy shop uses a hydraulic blade to chop off each spine ($1 each), and then I feed through my scansnap S1500M for OCR and storage. Any texts I'm actively working with get attached to an Evernote note (or, if > 20 MB, dropbox) for easy cross-platform access. I scan at 300 dpi and store at 150 dpi after text-under-image OCR. Searching and marking-up resulting PDFs is plenty fast on iPad2 and mac, linux, and windows desktops.
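For a sense of what the 300 dpi to 150 dpi step saves, here is a small worked calculation. The page size is an assumption (US Letter, 8.5 x 11 inches), since the post does not say what size the pages are, and `page_pixels` is just an illustrative helper:

```python
def page_pixels(dpi, width_in=8.5, height_in=11.0):
    """Pixel dimensions of a scanned page at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


scan_w, scan_h = page_pixels(300)    # resolution used for the OCR pass
store_w, store_h = page_pixels(150)  # resolution kept afterwards

# Halving the dpi halves each side, so a quarter of the pixels remain.
ratio = (scan_w * scan_h) / (store_w * store_h)
print(scan_w, scan_h, store_w, store_h, ratio)
```

Actual file sizes also depend on compression, so the factor of four applies to pixel count rather than to megabytes.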

Link to comment
  • Level 5*

that's great!

i have slowly been converting my library to pdfs, and i have done a few thousand books and articles now. it is actually pretty easy to tear apart a book yourself (manageable sections of a few dozen pages at a time), trim it (if you have a paper guillotine, then the process goes a bit more smoothly), feed it through the scan snap (ideally, an office-quality scanner with 600 dpi is the way to go), and ocr it. the ipad 2 is ok, but i wouldn't call it fast. i have high expectations for the ipad 3!

Link to comment
  • 3 weeks later...
  • 2 months later...

I've scanned and ocr'ed in pdf format a load of notes which contained a lot of my handwriting comments along with normal print. However, when I moved them over to evernote and tried searching for some handwritten words in the search box....nothing turned up. Zero.

Am I doing something wrong or what? Really frustrated that none of my handwritten notes can be searched :(

Link to comment
  • Level 5

Yeah I'm a premium member. Signed up last week. Can't understand why I can't search handwriting?

How bad is your handwriting? (just kidding)

What is the type of document? PDF, jpeg?

Are the documents with the handwriting in a synchronized notebook?

Local notebooks don't get OCR'd.

Link to comment

lol. Handwriting is fine :)

They are in pdf format and in a synchronized nb.

Link to comment
  • 3 weeks later...
  • 4 weeks later...
  • 3 months later...

I took a photo of my handwritten notes (I have quite good penmanship if I must say so myself) using my iPad. I assume the iPad saves the photos as jpg but when I bring into EN it doesn't recognize my handwriting. Every so often it may pull up a note that my search term is in but it never highlights it.

Am I missing something? I'm not a premium subscriber but I didn't think I needed to be to do what I'm trying to do.

Link to comment

To me it seems Evernote handles different formats differently. PDF search of printed letters always works very well, even if I rotate the paper by 30 degrees, which is pretty awesome. My handwriting is not that good, so Evernote does not always recognize it well.

Link to comment
  • 2 months later...

For handwriting, you're better off with JPEG images. We're using more of a standard "OCR" package for PDFs, which is designed more for scanned documents. JPEG images are processed with a different system that assumes a lower quality of input (e.g. from camera phones). This produces a "tree" of possible matches for every word, which works better for handwriting.

What about PNG files? It seems many, if not most, Android (handwriting) applications generate PNG images if you export to another app (and many apps export direct to Evernote using this format). But I have read elsewhere that Evernote does not OCR handwriting in PNG files as well as it does for JPEG. This seems odd, since Evernote itself generates files in PNG format.

Also, it seems that the OCR for handwriting works better on script than on block letters. Is this true?

Link to comment
  • 2 months later...

I use Evernote to record meeting minutes on a constant basis. I use 3 different methods.

- Typewritten into an Evernote note either on Windows computer or iPad App

- Handwritten in either Penultimate or Noteshelf on iPad and synced to Evernote

- photo of handwritten notes on my Evernote Moleskine notebook.

All 3 are great methods especially since I am a prolific note taker. Problem is, on the handwritten notes I'm getting really poor results when it comes to searching the notes for a word. I have thousands of notes and if they're not searchable they're pretty much useless. It seems to be a hit or miss affair with a success rate of about 10%.

My handwriting is excellent (I print in capital letters). People compliment my handwriting all the time.

Am I doing something wrong? Anyone else have this issue? Any recommendations?

Link to comment
  • Level 5*

Hi - welcome to the forum. Not saying it's related, but for information, AFAIK the way Evernote 'reads' pictures and PDFs is fundamentally different. Pictures get a 'word tree' where part of the image might be horse, or house, or hoops, while PDF files have a text layer that contains whatever content the OCR process managed to glean. I don't use any of these methods of note-taking myself, so we'll have to wait for some more informed comment. But you might like to have a look at any notes that are in PDF format to see whether the success rate is better than those with a picture base. If things really aren't working as they should, please raise a support request (see below) so that one of the techs can have a look. If anyone else is able to compare notes (pun) on the different ways to get handwriting into Evernote, you should get some comments shortly.
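
To make the 'word tree' idea a bit more concrete, here is a minimal Python sketch. It is only an illustration: the data layout, the scores and the names (`Candidates`, `image_words`, `search`) are my own assumptions, not Evernote's actual recognition format.

```python
from typing import Dict, List

# Hypothetical recognition result for one image: each word region keeps
# several candidate readings with a confidence score (all values invented).
Candidates = Dict[str, float]

image_words: List[Candidates] = [
    {"horse": 0.41, "house": 0.38, "hoops": 0.12},
    {"meeting": 0.77, "melting": 0.15},
]

def search(regions: List[Candidates], term: str, min_score: float = 0.1) -> List[int]:
    """Return the indexes of regions where any candidate reading matches the term."""
    term = term.lower()
    return [
        i for i, cands in enumerate(regions)
        if cands.get(term, 0.0) >= min_score
    ]

# A PDF text layer, by contrast, keeps a single reading per word,
# so a misread word simply cannot be found.
pdf_text_layer = ["horse", "meeting"]

print(search(image_words, "house"))   # [0]
print("house" in pdf_text_layer)      # False
```

The point of the sketch is that a search for "house" still finds the image even though "horse" was the top guess, whereas a single-reading text layer would miss it.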

Link to comment
  • 1 month later...

I've scanned and ocr'ed in pdf format a load of notes which contained a lot of my handwriting comments along with normal print. However, when I moved them over to evernote and tried searching for some handwritten words in the search box....nothing turned up. Zero.

Am I doing something wrong or what? Really frustrated that none of my handwritten notes can be searched :(

Yes, you are doing it wrong. That's kind of a loophole Evernote is keeping closed. Evernote obviously doesn't want anyone to upload over the limit, because they profit from subscriptions, and converting to PDF is kind of cheating on the size of an image. All you can do now is convert your PDF to images and upload those so that it can do handwriting OCR for you.

Evernote's limit is good enough for daily users, but for heavy users it's not the way to go. Google may have a better option for them, using free OCR on TIFF. No normal person would go beyond the limits anyway, so you will be just fine...

Link to comment

I've scanned and ocr'ed in pdf format a load of notes which contained a lot of my handwriting comments along with normal print. However, when I moved them over to evernote and tried searching for some handwritten words in the search box....nothing turned up. Zero.

Am I doing something wrong or what? Really frustrated that none of my handwritten notes can be searched :(

Try PDFill on a PC to convert the PDF to images...
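
If you would rather script the conversion, here is a rough Python sketch. Note that it swaps in a different tool: it assumes the `pdftoppm` command-line utility from Poppler is installed, rather than PDFill. The function names, the output folder and the 300 dpi default are illustrative assumptions.

```python
import subprocess
from pathlib import Path
from typing import List

def pdftoppm_command(pdf: str, out_prefix: str, dpi: int = 300) -> List[str]:
    """Build a pdftoppm call that writes one JPEG per PDF page."""
    # -jpeg selects JPEG output, -r sets the resolution in dpi.
    return ["pdftoppm", "-jpeg", "-r", str(dpi), pdf, out_prefix]

def pdf_to_jpegs(pdf: str, out_dir: str, dpi: int = 300) -> List[Path]:
    """Convert every page of a PDF to a JPEG and return the files in name order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    prefix = out / Path(pdf).stem
    # Requires Poppler's pdftoppm on the PATH (an assumption of this sketch).
    subprocess.run(pdftoppm_command(pdf, str(prefix), dpi), check=True)
    # pdftoppm appends the page number to the prefix; sorting by name should
    # keep page order, assuming the numbers are zero-padded.
    return sorted(out.glob(f"{prefix.name}-*.jpg"))
```

Calling something like `pdf_to_jpegs("notes.pdf", "notes_pages")` should leave one JPEG per page in the folder, ready to attach to a note in page order.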

Link to comment
  • 2 weeks later...

When I was in fourth grade, I handed in a multipage report on the Civil War in my best cursive.

I got it back with a note from the teacher which said, "I can't read this. Don't ever handwrite anything again."

Best lesson I ever learned. Since then, I've printed everything except my signature.

Evernote does amazingly well with literally thousands of hand-printed pages I've scanned as PDFs.

Why oh why are we still spending time & money to teach kids cursive writing?

Link to comment
  • 4 weeks later...
  • 2 weeks later...

I've just bought my first Moleskine and am also disappointed that not one word has become searchable. The stickers work fine, but my neat capitalised writing is not showing up at all, and I've left it an hour now.

Maybe I need to try to write normally rather than in capitals, but I know my normal writing is not so neat.

Link to comment
  • Level 5*

An hour? Have you synced with the server again to receive any results? Not sure how fast the OCR would normally be, but an hour may be a little soon.

Link to comment
  • 1 year later...

Archived

This topic is now archived and is closed to further replies.
