
(Archived) PDF OCR processing



Hello,

Sorry if this is one of those common questions everyone asks, but I have a little problem with Evernote's OCR and PDF files. I just got my new ScanSnap scanner and signed up for a premium Evernote account. I uploaded quite a few documents last night (about 150) and none of them seem to be searchable yet. I was wondering if I'm maybe doing something wrong?

I have the quality on my scanner set to "better" and the scans seem crystal clear. I have the scan software set so that it doesn't OCR the files, as I thought this was done by Evernote once the files had been uploaded.

Am I just being a little impatient or is something amiss? I stopped scanning last night at midnight GMT.

Thanks

Link to comment

I probably should have mentioned that I scanned my paper invoices and converted them to PDF. Does this mean they aren't OCR'd? I just tried scanning to JPG instead and the OCR works perfectly. I would have preferred to save them as PDF though :D

Link to comment

To keep down the overall size of the PDFs, I let ScanSnap OCR my documents before putting them into Evernote. This will prevent Evernote from creating a second version of the document.

In ScanSnap Manager, under File Option, just click on Searchable PDF (this OCRs during the scan).

Link to comment

Unfortunately, someone (me :-( ) broke the OCR pipeline for PDFs late on Saturday night as part of a database optimization & upgrade. This means that new PDFs added to Premium accounts aren't being processed for text.

We'll fix the code early tomorrow morning (California time), and then we'll retroactively queue all of the new documents for processing. These should all be done processing some time tomorrow.

Sorry for the inconvenience ...

Link to comment

The PDF processing caught up some time last night, so things should be working fine again.

Mr. Letterman got in trouble for doing something a lot more fun than database schema upgrades. :-)

Link to comment

hectyre -

It sounds like you're using our Windows client. Since the Windows OS doesn't include any native support for the PDF format, we needed to license and bundle a third-party PDF rendering library in order to show any PDF preview when you look at your notes. That library doesn't support search highlighting, but we aim to improve the PDF experience in the future (probably after 3.5 is all released and stable, since there's a pile of work to do there).

Thanks

Link to comment
The PDF processing caught up some time last night, so things should be working fine again.

Mr. Letterman got in trouble for doing something a lot more fun than database schema upgrades. :-)

:) True. But I'm guessing a week from now, no one will remember your goof-up.

Link to comment

Yes, I'm using the Windows client, 3.1 for now until 3.5 gets its bugs ironed out, but it works really well for me. Do you want us all to go over to the Foxit forum and pester them for you? Just kidding, keep up the good work :)

Link to comment

:-) Thanks anyway. The Foxit folks have been great. They have a lot of different technologies we could use, but the work and licensing are all on our side, i.e. the ball's in our court.

Link to comment

Archived

This topic is now archived and is closed to further replies.
