
(Archived) PDF processing / TEXT-I-fication



I'm a Premium user but have never really paid much attention to Evernote's ability to convert "image-based" PDFs into text-based "searchable" documents.

But now I've come up with a real need to "get the text data out" of a number of "image" PDFs. I've had mixed experience with OCR software in the past -- it's EXHAUSTING to scan through an OCR'd doc trying to catch all the errors. I'm hoping that Evernote will provide a solution.

Question: is a pdf like this a "possible" for text-I-fication?

Camillus_1954_6_sm.jpg

Link to comment

I'm hoping that handwriting recognition in a PDF will be better, as all my batch scans of handwritten items are multipage and it's easier to store one multipage PDF than hundreds of JPEGs. Is this possible, Evernote? I'm a premium user and didn't notice any changes/differences.

Link to comment
  • Level 5

But now I've come up with a real need to "get the text data out" of a number of "image" PDFs. I've had mixed experience with OCR software in the past -- it's EXHAUSTING to scan through an OCR'd doc trying to catch all the errors. I'm hoping that Evernote will provide a solution.

Question: is a pdf like this a "possible" for text-I-fication?

Well, you are a premium member (fast OCR processing) and you have the document, so give it a try yourself.

Just keep in mind that Evernote does character recognition to help you search quickly for a word. They call the result a searchable document.

It will not actually create a literal word-for-word output of your document.

Link to comment
......

Question: is a pdf like this a "possible" for text-I-fication?

Well, you are a premium member (fast OCR processing) and you have the document, so give it a try yourself.

Just keep in mind that Evernote does character recognition to help you search quickly for a word. They call the result a searchable document.

It will not actually create a literal word-for-word output of your document.

I think I've already initiated a "test" in a sense. Yesterday I created several notes which are (were) image-based PDFs (JPG scans which I converted to PDFs using PDF Creator). I have synced Evernote probably half a dozen times in the past 24 hours. As far as I can tell there has been no change in the make-up of the notes.

The "searchable document" -- are you able to select individual words? copy them?

Link to comment
  • Level 5
I have synced Evernote probably half a dozen times in the past 24 hours. As far as I can tell there has been no change in the make-up of the notes.

You won't see anything different. Evernote creates a second, searchable document on their server that matches your document.

Once the OCR is finished (for premium users this should only take a few minutes), you should be able to search for the words in the document.

The "searchable document" -- are you able to select individual words? copy them?

The searchable document is primarily for Evernote's purposes and is not helpful to you. If you want to look at it, right-click on the document, select "Save Searchable PDF..." and save it to your hard drive. You will see why they call it a searchable document. It is not actually a literal word-for-word output of your document.

You can see the difference at this link.

http://forum.evernote.com/phpbb/viewtopic.php?f=56&t=24599&p=106153#p105876
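
To make the "searchable, but not a transcript" point concrete, here is a rough Python sketch. It assumes the recognition index keeps several candidate readings for each region of the page, each with a confidence weight; that is an assumption for illustration, not something confirmed in this thread. The `Region` class, both functions, and all the sample words and weights are made up.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One rectangle of a scanned page plus its candidate readings (hypothetical)."""
    box: tuple        # (x, y, width, height) in pixels
    candidates: dict  # candidate word -> confidence weight, 0-100


def search(regions, term):
    """Return every region where ANY candidate reading matches the term."""
    term = term.lower()
    return [r for r in regions if term in (c.lower() for c in r.candidates)]


def best_guess_transcript(regions):
    """Join the single highest-weighted candidate of each region."""
    return " ".join(max(r.candidates, key=r.candidates.get) for r in regions)


# Made-up sample data: the recognizer is unsure about every word.
regions = [
    Region((10, 10, 80, 20), {"Camillus": 62, "Camilius": 58}),
    Region((95, 10, 60, 20), {"Cutlery": 71, "Cutlcry": 40}),
    Region((10, 40, 50, 20), {"1954": 55, "1984": 57}),
]

hits = search(regions, "1954")               # found: "1954" is one of the candidates
transcript = best_guess_transcript(regions)  # the top guess for the year is "1984"
```

In this toy data a search for "1954" still finds the right spot on the page, because a match on any candidate counts. Flattening the same index into plain text keeps only the top guess for each region, so the year comes out wrong. That is the gap between a document you can search and one you can copy text from.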


Link to comment

I still find OCR a bit unreliable. I think I would spend more time trying to catch errors than typing a document again. That might be because I have good typing skills, because the documents I handle are too short to bother with OCR, or because I'm anal retentive and obsessive compulsive at the same time. I don't trust OCR just yet. But then again, the OCR here is great. I'm starting to rethink my stand on this issue.

Link to comment
......I think I would spend more time trying to catch errors than typing a document......

That's how I've always felt.

Okay, and now, for whatever reason, the PDFs I have "noted" into EN are NOT BEING SYNCED! ??? I've clicked on "Note History" and am being informed that the notes haven't yet been synced. I'm advised to sync Evernote and "try again" ???!!!

Other notes I've created/clipped SINCE the pdf's in question HAVE been synced!

HELP!

Link to comment

Archived

This topic is now archived and is closed to further replies.
