Fujitsu Scansnap: possible to add OCR later? How?

fujitsu scansnap ocr

10 replies to this topic

#1 tijsterman

  • Title: Member
  • Group: Members
  • 10 posts

Posted 10 February 2012 - 04:17

Hello all

As there are quite a lot of Fujitsu ScanSnap users on this forum, I hope somebody can answer my question.
I use my ScanSnap S1300 to scan documents into PDFs in Evernote, both with and without OCR. The advantages of adding OCR are evident, but for large piles of paper it takes quite some extra time. Often, I doubt whether I really need OCR for a particular document. It would be very helpful if there was a way to add OCR at a later date, when it turns out that I do want to search the text or copy and paste fragments of it. Does anybody know whether this is possible, and how? Many thanks.

#2 bduncan

  • Title: Alliance Lackey
  • Group: Members
  • 65 posts

Posted 10 February 2012 - 04:28

So given where we are, I'd be remiss if I didn't mention that Evernote Premium will OCR your PDFs for you when you upload them, but I assume what you want is to have the actual PDF OCRed independently of Evernote.

If you are on a Mac, you can drag the PDF onto the ABBYY FineReader icon in the Finder. I am on my iPad so this is from memory, but I believe it is under /Applications/ScanSnap/Scan to Searchable PDF or something like that.

If you are on Windows, I don't think that works, but what you can do is throw your ScanSnap-scanned PDFs into the ScanSnap Organizer software that came with your scanner. It can then OCR them for you.

There are of course a zillion other applications that will do this for you, but I am just pointing out some ways to do it with the software that came with your scanner.
Brooks Duncan, Paperless Geek
http://www.documentsnap.com
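
As one illustration of the "other applications" route, a later OCR pass over a folder of scans could be scripted. The sketch below is not from this thread: it assumes the open-source OCRmyPDF command-line tool is installed, and it only builds the commands rather than running them. The function name and folder arguments are hypothetical. The `--skip-text` option asks OCRmyPDF to leave pages that already contain text untouched.

```python
from pathlib import Path


def build_ocr_commands(src_dir, dst_dir):
    """Return one OCRmyPDF command (as an argument list) per PDF in src_dir.

    Assumption: the `ocrmypdf` executable is on the PATH. Nothing is executed
    here; the caller decides whether to run each command.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    commands = []
    for pdf in sorted(src.iterdir()):
        # Ignore anything that is not a PDF (case-insensitive extension check).
        if pdf.suffix.lower() != ".pdf":
            continue
        commands.append(["ocrmypdf", "--skip-text", str(pdf), str(dst / pdf.name)])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the output folder exists. Writing to a separate folder keeps the original scans intact until the results have been checked.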

#3 tijsterman

  • Title: Member
  • Group: Members
  • 10 posts

Posted 12 February 2012 - 05:19

Thanks for your competent answer. This was helpful to me.

#4 Michael Campbell

  • Title: Member
  • Group: Members
  • 31 posts

Posted 14 February 2012 - 01:01

If you are on Windows, I don't think that works, but what you can do is throw your ScanSnap-scanned PDFs into the ScanSnap Organizer software that came with your scanner. It can then OCR them for you.


This was the magic nugget... thanks.

When I bought my ScanSnap it came with an Adobe product which will OCR also, but I've not yet tried it out fully. I know it can work, but I haven't tried saving the results to see if it embeds the OCR'd info in the pdf or not.

#5 kennychua

  • Title: Member
  • Group: Members
  • 8 posts

Posted 15 March 2012 - 09:18

So, I agree that scanning with ScanSnap and letting it OCR at the same time takes a while. However, after scanning your PDF into a note, you can right-click on the note and open it with Adobe Acrobat. Run the OCR command and close the note, and it will automatically save the OCR'ed version over the regular version in your note.

That's what I do.

#6 GHall

  • Title: Alliance Lackey
  • Group: Members
  • 67 posts

Posted 22 March 2012 - 12:30

So I understand your process, the PDF is scanned into an Evernote note first; and then you right click on the PDF located in the Evernote program and click "open with Adobe Acrobat"?

#7 GHall

  • Title: Alliance Lackey
  • Group: Members
  • 67 posts

Posted 22 March 2012 - 12:37

My workflow includes the use of the following tools:

  • Fujitsu ScanSnap S1500M
  • Adobe Acrobat 9 Pro for Mac (included with scanner)
  • Evernote (free version)

1. Scan to high quality PDF
2. OCR in Acrobat Pro
3. Optimize in Acrobat Pro
4. Save in a single folder on local hard drive titled something like "OCR & Evernote"
5. Copy to Evernote and sync

File sizes are 1/10 of the original size or smaller, and the Acrobat OCR is superior to ABBYY FineReader for Mac and PDF OCR X per my trials. Also, Acrobat 9 Pro for Mac works great on my Mac running Lion (10.7.3). I also like to see the archived PDFs in the original color schemes, so I scan using "auto color detection".

Currently I use the free version of Evernote, though despite my initial reservation, Evernote functions great, so I'll likely upgrade to the "yearly" option. I initially hesitated to install Acrobat 9 Pro on my Mac due to the large amount of negative net chatter that I read. Instead I spent hours trying out various other OCR options. In the end, I installed Acrobat 9 Pro and found it works better than the others that I tried. It's a robust program and comes free with the ScanSnap S1500M.

One of the ways Acrobat 9 Pro is better than others that I've tried is its ability to OCR documents with multiple and inconsistent formatting. For example, some of my utility bills have headers with typical "to" and "from" info, usage charts that run the width of the document, narrow columns on one side of the document, additional tables of varying columns and rows, and then paragraphs. This is all on one standard letter size document. Acrobat 9 Pro OCR'd 99% of the text correctly, including "$", "#", "@", and " " (spaces). I cannot say the same for the other programs. The odd thing is I don't really want to like Adobe Acrobat 9 Pro. I want to prefer a program developed by a smaller, leaner competitor.

#8 kennychua

  • Title: Member
  • Group: Members
  • 8 posts

Posted 22 March 2012 - 01:38

Yep, that's what I do. Which sounds like what you do..... :)

#9 idoc

  • Title: Browncoat
  • Group: Members
  • 314 posts

Posted 21 April 2012 - 05:31

It should be noted that if you run the PDF optimizer in Acrobat, it will automatically OCR for you at the same time that it optimizes. Furthermore, I've discovered that this combined process is a lot faster than having ScanSnap do the OCR and then separately allowing Acrobat to optimize. The resultant file size is much smaller.

I should mention that I do all this in a folder that I call "PDF holding", which I have selected as the default folder in all of my ScanSnap profiles. After I have done all my tweaking, optimizing, page shuffling, etc. in this folder, I save the file as a "reduced size PDF" directly into my EN import folder. This further reduces the size by about 10% or more and places it into the import folder, which allows it to magically appear in EN. Therefore, it is my EN import folder that contains the final version of the PDF, not the "PDF holding" folder, which is simply a transitional station. I generally delete most of the files in the holding folder, but I back up the files in the import folder.

#10 GHall

  • Title: Alliance Lackey
  • Group: Members
  • 67 posts

Posted 22 April 2012 - 11:08


You do something similar to me.

You're right about the OCR/optimization being run in the same process.

I first scan all my documents to a folder I call "ScanSnap Temp". I manually rename every file using an intuitive naming scheme that looks something like this:

date - tag - tag - tag...

The "tag" is really a key word or key words. The date is in this format "yyyy.mm.dd".
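
The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the "yyyy.mm.dd - tag - tag" pattern; the function names are hypothetical, and it assumes that no keyword itself contains " - ".

```python
from datetime import date, datetime

SEPARATOR = " - "


def build_name(day, tags, ext=".pdf"):
    """Build a 'yyyy.mm.dd - tag - tag.pdf' file name from a date and keywords."""
    parts = [day.strftime("%Y.%m.%d")] + [t.strip() for t in tags if t.strip()]
    return SEPARATOR.join(parts) + ext


def parse_name(filename):
    """Split such a file name back into (date, [tags]).

    Raises ValueError if the leading part is not a valid yyyy.mm.dd date.
    """
    stem = filename[:-4] if filename.lower().endswith(".pdf") else filename
    first, *tags = stem.split(SEPARATOR)
    return datetime.strptime(first, "%Y.%m.%d").date(), tags
```

For example, `build_name(date(2012, 4, 22), ["electric", "bill"])` gives `2012.04.22 - electric - bill.pdf`. Because the year comes first, sorting the file names alphabetically also sorts them chronologically.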

After I rename all the files, I then choose the option in Adobe Acrobat 9 Pro to OCR all documents in the folder "ScanSnap Temp". My OCR is automatically set to do the following:

1. OCR each PDF
2. Optimize each PDF, including reduce file size
3. Place completed file in a folder called "Optimized"

Right now I manually drag my files from "Optimized" into EN. I then move the files from the "Optimized" folder into a folder called "OCR & IN EVERNOTE". That's right, I currently keep a separate copy of the PDF in this folder on my hard drive. At some point, I plan to delete this folder and to keep only my data in EN. For now, this is a safety measure until I resolve all issues, etc.
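
The last manual step, moving files out of "Optimized" once they are in EN, could be scripted along these lines. This is a sketch under assumptions, not something Acrobat or Evernote provides: the function name is hypothetical, and the script has no way of knowing whether a file really reached Evernote, so it should only be run after the import.

```python
import shutil
from pathlib import Path


def archive_imported(optimized_dir, archive_dir):
    """Move every PDF from optimized_dir into archive_dir.

    Returns the names of the files that were moved. A file whose name already
    exists in the archive is left in place so nothing is overwritten.
    """
    src, dst = Path(optimized_dir), Path(archive_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for pdf in sorted(src.glob("*.pdf")):
        target = dst / pdf.name
        if target.exists():
            # Leave name clashes for manual review instead of overwriting.
            continue
        shutil.move(str(pdf), str(target))
        moved.append(pdf.name)
    return moved
```

Calling `archive_imported("Optimized", "OCR & IN EVERNOTE")` from the parent folder would mirror the folder names used above.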

I'm also a very young EN user, as I started about seven weeks ago. Though, I did go premium after my first month.

#11 baef47

  • Title: Member
  • Group: Members
  • 1 post

Posted 08 October 2012 - 11:53

Using a ScanSnap S1500M, I have been incorporating the bundled Adobe Acrobat 9 Pro into my workflow. I have been scanning documents, then choosing 'Optimize Scanned PDF'.

Things were travelling along happily until today when I discovered that my documents are being scrambled in the 'Optimize' process.

A post on another forum describes the problem: it is as if pages in a fax have overlapped. Another example: any blank areas at the bottom of pages are filled in by content from the top of the page.

Unlike others on this forum, it seems that 'Optimize Scanned PDF' is not doing OCR for me. Using the OCR text recognition tool causes the same issue.

Without close examination it is difficult to pick up the bizarre damage caused to the digital document.

I now have the difficult task of going back and determining when this damage to my digital archive commenced.
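
One way to shorten that search is a binary search over the archive in chronological order, opening only the documents the search asks about. The sketch below rests on a strong assumption that may not hold here: that once the damage started, every later document is damaged too. If the damage is intermittent, the result only marks one boundary and earlier files would still need checking. The function name and the `is_damaged` callback (in practice, a person looking at the PDF) are hypothetical.

```python
def first_damaged(documents, is_damaged):
    """Return the index of the first damaged document, or len(documents) if none.

    Assumes `documents` is in chronological order and that every document
    after the first damaged one is also damaged.
    """
    lo, hi = 0, len(documents)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_damaged(documents[mid]):
            # The first damaged document is at mid or earlier.
            hi = mid
        else:
            # Everything up to and including mid is assumed fine.
            lo = mid + 1
    return lo
```

Under that assumption, the number of documents that need manual inspection grows with the logarithm of the archive size rather than with the size itself.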

I have used Adobe's Acrobat Uninstaller tool and reinstalled the program.

I am running OS X 10.8.2 Mountain Lion.

Has anybody had similar issues? Any help would be appreciated.




