Kay in t Veen

mac Bulk Process original PDFs for OCR

Hi all,

I'm a week into my Evernote workflow. So far so good: I have about 500 documents in and things are great.

I noticed that the OCR of the ScanSnap 1300i slowed things down for me a lot, so I turned it off to let Evernote handle it. That works fairly well, but I noticed one problem: Evernote's OCR does not make document text selectable, while Adobe's or ScanSnap's OCR does.

What I would like to do is batch process all the originals for background OCR (I just found a better way, so it doesn't slow things down). Is there a way I can bulk change the originals without losing the notes, tags, and the folders that they were in?

Don't know of any way to change what's already in Evernote apart from one at a time. I found the best work process for new scans was:

  • Scan to folder
  • Edit name / merge or edit PDF as necessary
  • continue until bored..
  • Batch OCR folder in Adobe (will reduce file sizes too as it replaces pictures with text)
  • Sort folder on title
  • Drag and drop files to notebook(s) / import folder as required

(I don't use tags overmuch)

That sort of sums up my procedure of choice as well now, but I already did between 400 and 500 multipage scans and I hope I can do something with those. Is there no way to go to the source folder somewhere in my Mac's library, do a batch OCR, and resync or something? Not sure if that will work.

I'd strongly suggest that you check with Support first (see below for the link). I don't know what changing the contents directly would do, but I wouldn't be too hopeful. Also, if you changed a number of files and then re-synced, you'd be pushing your upload limits, which might complicate (and worsen) the issue.
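To get a feel for the upload-limit risk before re-syncing anything, a rough check like the one below could help. It is only a sketch: the function names are made up for illustration, and the remaining allowance is a number you would have to read from your own account, since nothing here talks to Evernote.

```python
from pathlib import Path


def total_upload_bytes(paths):
    """Sum the on-disk size of the files that would be re-uploaded."""
    return sum(Path(p).stat().st_size for p in paths)


def fits_allowance(paths, remaining_bytes):
    """Return True if re-uploading every file stays within the remaining allowance.

    remaining_bytes is an assumption supplied by the caller; it is not
    fetched from Evernote.
    """
    return total_upload_bytes(paths) <= remaining_bytes
```

If the check fails, the re-sync could be spread over more than one upload cycle instead of pushing everything at once.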

Yeah, good idea, will check.

I ask because I found a way: go to ~/Library/Containers/com.evernote.Evernote/Data/Library/Application Support/Evernote/accounts/Evernote/ and search for all PDFs. I could then drag them into Adobe Acrobat and OCR them, which works very well, and they show up fine in Evernote too. The only thing I still need to do is force a resync to the Evernote servers. Of course I need to be careful with the upload limits, but I have only uploaded about 300 MB so far and the limit resets in 20 days, so that's fine with me.
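For anyone who prefers a script to a Finder search, the PDF hunt described above could be sketched roughly like this. The folder path is simply the one quoted in this post and may differ between Evernote versions; find_pdfs is a hypothetical helper name, and the script only lists files without modifying them.

```python
from pathlib import Path

# Path as quoted in the post; treat it as an assumption for your own install.
EVERNOTE_DATA = (
    "~/Library/Containers/com.evernote.Evernote/Data/Library/"
    "Application Support/Evernote/accounts/Evernote/"
)


def find_pdfs(root):
    """Return every PDF below root, sorted by path (extension match ignores case)."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    for pdf in find_pdfs(EVERNOTE_DATA):
        print(pdf)
```

Copying the listed files somewhere safe before running OCR on them would leave a fallback if editing Evernote's data folder directly turns out to cause trouble.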
 

Maybe someone knows how to force a resync?

Nobody has a good idea?

I have a great workflow now. Working with Hazel and PDFpen, everything is fully automated: scan -> folder -> rename -> OCR -> outbox folder -> Evernote -> processed folder.

But I still have no clue whether there is a good way to OCR the old files and force a sync to Evernote.
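For readers without Hazel, the rename -> OCR -> outbox part of that chain could be approximated with a small script. This is only a sketch under assumptions: the date-prefix naming rule is invented for illustration, the ocr argument is a placeholder hook for whatever OCR tool you use (it does not call PDFpen or Acrobat), and the import into Evernote and the move to the processed folder are left out.

```python
import shutil
from datetime import date
from pathlib import Path


def dated_name(pdf, today=None):
    """Hypothetical naming rule: prefix the file name with an ISO date."""
    today = today or date.today()
    return f"{today.isoformat()} {pdf.name}"


def process_inbox(inbox, outbox, ocr=None, today=None):
    """Rename each scanned PDF, run the OCR hook on it, then move it to the outbox."""
    inbox, outbox = Path(inbox), Path(outbox)
    outbox.mkdir(parents=True, exist_ok=True)
    moved = []
    for pdf in sorted(inbox.glob("*.pdf")):
        # Step 1: rename in place inside the scan folder.
        renamed = pdf.with_name(dated_name(pdf, today))
        pdf.rename(renamed)
        # Step 2: OCR hook (placeholder; plug in your own tool here).
        if ocr is not None:
            ocr(renamed)
        # Step 3: hand the finished file over to the outbox folder.
        target = outbox / renamed.name
        shutil.move(str(renamed), str(target))
        moved.append(target)
    return moved
```

Calling process_inbox("Scans", "Outbox") with your own folder names would process whatever PDFs are waiting; passing a function as ocr lets you swap OCR tools without touching the rest.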

What Adobe application did you use to Batch OCR? 

Adobe Acrobat 9.0 - came bundled with the ScanSnap 1500
