mlu

paperless Fujitsu ScanSnap OCRs despite text conversion deselected


I know this is not an Evernote issue as such, but as many of you have a ScanSnap, I wonder if anyone has run into this issue. Despite having deselected "Convert to searchable PDF", the program starts the text recognition process anyway. This happens on both my scanner and my wife's. Please see our set-up under file options, attached:

[Attached screenshot: ScanSnap Manager file options]


Interesting - I have fun with the SSManager window too. Try checking on the 'applications' tab - is that also showing "scan to Evernote"?

You could also try

  • setting up a new profile
  • editing another that you don't use or
  • scan to JPG and convert that to PDF externally


I was going to suggest you update your software but it appears you have the most recent version.

It doesn't look like you saved the profile. Try saving it under a new name, then verify that it saves. I was actually creating three more profiles with mine yesterday :-)


And from a different perspective:

It takes a bit longer, but I always let ScanSnap do the OCR for me.

Why?

1.) Exported PDFs:

ScanSnap:
The PDF document remains OCR'd if I export it from Evernote.

Evernote:
The PDF document loses its OCR if I export it from Evernote.

2.) Consistency:

ScanSnap:
The search results are consistent in Evernote, whether I view them from my desktop client or the Evernote web.

Evernote:
The search results are not consistent because Evernote uses different OCR software depending on the platform.

3.) 100% OCR:

ScanSnap:
OCR works on notes that are stored in my local non-sync'd Evernote notebooks.

Evernote:
Evernote cannot see the notes in my local non-sync'd notebooks, so those PDFs cannot be OCR'd.

4.) No rules:

ScanSnap:
OCRs all my PDFs - no rules, and I know it is done instantly.

Evernote:
Evernote has 5 technical rules to follow and gives no warning if the document fails to meet all the rules.


Jbenson,

I saw this post (or an earlier version of it) before I started with ScanSnap and Evernote, so I'm also OCR'ing with SS. My question is about your point #4: what are these 5 magic rules?


Looks like another rule has been added.

Once the PDF gets to the front of the processing queue, the processor analyzes the file to ensure it qualifies to be processed/recognized. The processor will reject the PDF if any of the following conditions are met:

1.) If the PDF contains more than 100 pages

2.) If the PDF file is more than 25MB

3.) If the PDF does not contain at least one "scanned" page, defined as:

  • A "scanned" page contains at least 1025 pixels of image data
  • A "scanned" page contains no more than 512 characters of regular, searchable text (e.g. this is enough for a text-based fax header or similar). PDF files that have already been processed by a separate OCR system will not satisfy this condition and will be rejected.

4.) If the PDF contains more than one non-scanned page. (I.e. the doc may have one "cover" page without any image data, but if there's more than one, then it's not a real scan and we reject it.)

5.) If the analysis crashes or fails for some technical reason, typically due to a malformed PDF from some crazy source, or if the PDF is password protected (encrypted).

6.) If this analysis process takes more than 30 seconds to complete.

Once the PDF has been deemed valid for processing, the PDF is run through our best-of-breed OCR engine which generates a searchable form of the same PDF.
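To make the rules above easier to reason about, here is a rough Python sketch of the same checks. It is not Evernote's code, only a model of the list as quoted. The `Page` and `PdfSummary` types and the `rejection_reason` function are hypothetical names, "25MB" is read as 25 MiB, any page that fails the "scanned" test is counted as non-scanned, and rule 4 is read as "more than one non-scanned page", which is what its parenthetical describes. Crashes during analysis (part of rule 5) are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional

# Limits quoted in the rules above.
MAX_PAGES = 100
MAX_BYTES = 25 * 1024 * 1024   # assumption: "25MB" read as 25 MiB
MIN_IMAGE_PIXELS = 1025
MAX_TEXT_CHARS = 512
MAX_NON_SCANNED_PAGES = 1
MAX_ANALYSIS_SECONDS = 30.0


@dataclass
class Page:
    image_pixels: int   # amount of image data on the page
    text_chars: int     # characters of regular, searchable text

    def is_scanned(self) -> bool:
        # Rule 3: enough image data and little or no existing text.
        return (self.image_pixels >= MIN_IMAGE_PIXELS
                and self.text_chars <= MAX_TEXT_CHARS)


@dataclass
class PdfSummary:
    size_bytes: int
    pages: List[Page]
    encrypted: bool = False
    analysis_seconds: float = 0.0


def rejection_reason(pdf: PdfSummary) -> Optional[str]:
    """Return why the PDF would be skipped, or None if it qualifies."""
    if len(pdf.pages) > MAX_PAGES:
        return "more than 100 pages"
    if pdf.size_bytes > MAX_BYTES:
        return "larger than 25MB"
    scanned = sum(1 for page in pdf.pages if page.is_scanned())
    if scanned == 0:
        return "no scanned page"
    if len(pdf.pages) - scanned > MAX_NON_SCANNED_PAGES:
        return "more than one non-scanned page"
    if pdf.encrypted:
        return "password protected"
    if pdf.analysis_seconds > MAX_ANALYSIS_SECONDS:
        return "analysis took more than 30 seconds"
    return None


# A plain 3-page scan versus the same scan already OCR'd by the scanner software.
plain_scan = PdfSummary(size_bytes=2_000_000, pages=[Page(2_000_000, 0)] * 3)
already_ocrd = PdfSummary(size_bytes=2_000_000, pages=[Page(2_000_000, 1800)] * 3)
print(rejection_reason(plain_scan))    # None
print(rejection_reason(already_ocrd))  # no scanned page
```

The second example is the case relevant to this thread: a PDF that ScanSnap has already OCR'd carries more than 512 characters of text per page, so under rule 3 it has no "scanned" page and would be skipped.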

https://support.ever...+48&docID=12083


The link provided fails (it ends up going to the KB home page):

https://support.evernote.com/ics/support/KBAnswer.asp?questionID=591&hitOffset=357+250+48&docID=12083

This link works:

https://support.evernote.com/ics/support/KBAnswer.asp?questionID=591

Not really sure what all the problems with the KB direct links are, but that is the only form I have found that consistently works.

....

And when I checked the links after posting, both links work now. DIIK.


I had the same problem, and it seems to be a bug in the ScanSnap software. To fix it, you can:

0. Exit ScanSnap Manager.

1. Locate the preferences file for ScanSnap. It is most likely in ~/Library/Preferences, with a name something like "jp.co.pfu.ScanSnap.V10L10.plist".

2. Convert the .plist file to XML with plutil:

   plutil -convert xml1 <filename>

3. Open the file in a text editor, find the preference called "OCR", and set the value (on the next line) to "0" instead of "1".

4. Start ScanSnap Manager.

This worked for me.
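For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch using the standard `plistlib` module, which reads both binary and XML plists. Treat it as a sketch under assumptions rather than a tested fix: the key name "OCR" comes from the steps above, but where it sits in the file and whether its value is a number or a string are unknown, so the hypothetical helper `disable_ocr` walks the whole structure, keeps the value's type, and writes a `.bak` copy first. Exit ScanSnap Manager before running it.

```python
import plistlib
from pathlib import Path


def _zero_like(old):
    """Return the 'off' value in the same type as the existing value."""
    if isinstance(old, bool):   # check bool first: bool is a subclass of int
        return False
    if isinstance(old, str):
        return "0"
    return 0


def _set_key_everywhere(node, key):
    """Recursively switch off every dict entry named `key`; return how many."""
    changed = 0
    if isinstance(node, dict):
        for name in list(node):
            if name == key:
                node[name] = _zero_like(node[name])
                changed += 1
            else:
                changed += _set_key_everywhere(node[name], key)
    elif isinstance(node, list):
        for item in node:
            changed += _set_key_everywhere(item, key)
    return changed


def disable_ocr(plist_path, key="OCR"):
    """Back up the plist, set every `key` entry to 0, and save it as XML."""
    path = Path(plist_path)
    original = path.read_bytes()
    prefs = plistlib.loads(original)   # format (binary or XML) is auto-detected
    path.with_name(path.name + ".bak").write_bytes(original)
    changed = _set_key_everywhere(prefs, key)
    with path.open("wb") as fh:
        plistlib.dump(prefs, fh, fmt=plistlib.FMT_XML)
    return changed
```

Call it as `disable_ocr(path_to_plist)`, passing the preferences file you located in step 1. It returns the number of "OCR" entries it changed, so a result of 0 means the key was not found under that name and the file is effectively unchanged.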

