Fujitsu Scansnap: possible to add OCR later? How?
#1
Posted 10 February 2012 - 04:17
As there are quite a lot of Fujitsu scansnap users on this forum, I hope somebody can answer my question.
I use my ScanSnap S1300 to scan documents as PDFs into Evernote, both with and without OCR. The advantages of adding OCR are evident, but for large piles of paper it takes quite some extra time. Often I doubt whether I really need OCR for a particular document. It would be very helpful if there was a way to add OCR at a later date, when it turns out that I do want to search the text or copy and paste fragments of it. Does anybody know whether this is possible and how? Many thanks.
#2
Posted 10 February 2012 - 04:28
If you are on a Mac, you can drag the PDF onto the ABBYY FineReader icon in the Finder. I am on my iPad so this is from memory, but I believe it is under /Applications/ScanSnap/Scan to Searchable PDF or something like that.
If you are on Windows, I don't think that works, but what you can do is throw your ScanSnap-scanned PDFs into the ScanSnap Organizer software that came with your scanner. It can then OCR them for you.
There are of course a zillion other applications that will do this for you, but I am just pointing out some ways to do it with the software that came with your scanner.
http://www.documentsnap.com
#3
Posted 12 February 2012 - 05:19
Thanks for your competent answer. This was helpful to me.
So given where we are, I'd be remiss if I didn't mention that Evernote Premium will OCR your PDFs for you when you upload them, but I assume what you are wanting is to have the actual PDF OCRed independently of Evernote.
#4
Posted 14 February 2012 - 01:01
If you are on Windows, I don't think that works, but what you can do is throw your ScanSnap-scanned PDFs into the ScanSnap Organizer software that came with your scanner. It can then OCR them for you.
This was the magic nugget... thanks.
When I bought my ScanSnap it came with an Adobe product that will also OCR, but I've not yet tried it out fully. I know it can work, but I haven't tried saving the results to see whether it embeds the OCR'd info in the PDF.
#5
Posted 15 March 2012 - 09:18
That's what I do.
#6
Posted 22 March 2012 - 12:30
So, I agree that scanning with ScanSnap and letting it OCR at the same time takes a while. However, after scanning your PDF into a note, right-click on the note and open it with Adobe Acrobat. Run the OCR command and close the note, and it will automatically save the OCR'ed version over the regular version in your note.
That's what I do.
So that I understand your process: the PDF is scanned into an Evernote note first, and then you right-click on the PDF in the Evernote program and click "open with Adobe Acrobat"?
#7
Posted 22 March 2012 - 12:37
My workflow includes the use of the following tools:
- Fujitsu ScanSnap S1500M
- Adobe Acrobat 9 Pro for Mac (included with scanner)
- Evernote (free version)
1. Scan to high quality PDF
2. OCR in Acrobat Pro
3. Optimize in Acrobat Pro
4. Save in a single folder on local hard drive titled something like "OCR & Evernote"
5. Copy to Evernote and sync
File sizes are 1/10 of the original size or smaller, and in my trials the Acrobat OCR is superior to ABBYY FineReader for Mac and PDF OCR X. Also, Acrobat 9 Pro for Mac works great on my Mac running Lion (10.7.3). I also like to see the archived PDFs in their original color schemes, so I scan using "auto color detection".
Currently I use the free version of Evernote. Despite my initial reservations, Evernote works great, so I'll likely upgrade to the "yearly" option. I initially hesitated to install Acrobat 9 Pro on my Mac because of the large amount of negative net chatter I had read, and instead spent hours trying out various other OCR options. In the end, I installed Acrobat 9 Pro and found it works better than the others I tried. It's a robust program and comes free with the ScanSnap S1500M.
One way Acrobat 9 Pro is better than the others I've tried is its ability to OCR documents with multiple, inconsistent layouts. For example, some of my utility bills have headers with typical "to" and "from" info, usage charts that run the width of the page, narrow columns on one side, additional tables with varying columns and rows, and then paragraphs, all on one standard letter-size page. Acrobat 9 Pro OCR'd 99% of the text correctly, including "$", "#", "@", and " " (spaces). I cannot say the same for the other programs. The odd thing is I don't really want to like Adobe Acrobat 9 Pro. I want to prefer a program developed by a smaller, leaner competitor.
#8
Posted 22 March 2012 - 01:38
#9
Posted 21 April 2012 - 05:31
My workflow includes the use of the following tools:
- Fujitsu ScanSnap S1500M
- Adobe Acrobat 9 Pro for Mac (included with scanner)
- Evernote (free version)
1. Scan to high quality PDF
2. OCR in Acrobat Pro
3. Optimize in Acrobat Pro
4. Save in a single folder on local hard drive titled something like "OCR & Evernote"
5. Copy to Evernote and sync
It should be noted that if you run the PDF optimizer in Acrobat, it will automatically OCR for you at the same time that it optimizes. Furthermore, I've discovered that this combined process is a lot faster than having ScanSnap do the OCR and then separately letting Acrobat optimize, and the resulting file is much smaller. I should mention that I do all this in a folder I call "PDF holding", which I have set as the default folder in all of my ScanSnap profiles. After I have done all my tweaking, optimizing, page shuffling, etc. in this folder, I save the file as a "reduced size PDF" directly into my EN import folder. This further reduces the size by about 10% or more and places the file in the import folder, which makes it magically appear in EN. So it is my EN import folder that contains the final version of the PDF, not the "PDF holding" folder, which is simply a transitional station. I generally delete most of the files in the holding folder, but I back up the files in the import folder.
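For anyone who wants to check figures like "1/10 of the original size" or "about 10% or more" against their own files, here is a minimal Python sketch. The function names and the example numbers are made up for illustration; nothing here comes from Acrobat or ScanSnap, it only compares sizes.

```python
from pathlib import Path


def reduction_percent(original_bytes: int, optimized_bytes: int) -> float:
    """Return how much smaller the optimized file is, as a percentage."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    saved = original_bytes - optimized_bytes
    return round(100.0 * saved / original_bytes, 1)


def compare_files(original: Path, optimized: Path) -> float:
    """Compare two files on disk; the paths are whatever you pass in."""
    return reduction_percent(original.stat().st_size, optimized.stat().st_size)


if __name__ == "__main__":
    # Hypothetical sizes: a 10 MB scan optimized down to 1 MB,
    # then saved again as "reduced size" at 0.9 MB.
    print(reduction_percent(10_000_000, 1_000_000))
    print(reduction_percent(1_000_000, 900_000))
```

Running `compare_files` on a copy of the raw scan and on the file that lands in the import folder gives the overall saving for that document.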
#10
Posted 22 April 2012 - 11:08
You do something similar to me.
You're right about the OCR/optimization being run in the same process.
I first scan all my documents to a folder I call "ScanSnap Temp". I manually rename every file using an intuitive naming scheme that looks something like this:
date - tag - tag - tag...
The "tag" is really a key word or key words. The date is in this format "yyyy.mm.dd".
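The renaming is done by hand here, but the naming scheme itself is easy to sketch in Python. This is only an illustration: the function name `scan_filename` and the example tags are assumptions, and the only things taken from the post are the "yyyy.mm.dd" date format and the " - " separator.

```python
from datetime import date
from typing import Sequence


def scan_filename(scan_date: date, tags: Sequence[str], ext: str = ".pdf") -> str:
    """Build a 'yyyy.mm.dd - tag - tag' style file name."""
    # Drop empty tags and stray whitespace so the separators stay clean.
    cleaned = [t.strip() for t in tags if t.strip()]
    parts = [scan_date.strftime("%Y.%m.%d")] + cleaned
    return " - ".join(parts) + ext


if __name__ == "__main__":
    # Hypothetical example document and key words.
    print(scan_filename(date(2012, 4, 22), ["electric bill", "utilities"]))
```

Putting the date first in year.month.day order means a plain alphabetical sort of the folder is also a chronological sort.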
After I rename all the files, I then choose the option in Adobe Acrobat 9 Pro to OCR all documents in the folder "ScanSnap Temp". My OCR is automatically set to do the following:
1. OCR each PDF
2. Optimize each PDF, including reduce file size
3. Place completed file in a folder called "Optimized"
Right now I manually drag my files from "Optimized" into EN. I then move the files from the "Optimized" folder into a folder called "OCR & IN EVERNOTE". That's right, I currently keep a separate copy of the PDF in this folder on my hard drive. At some point, I plan to delete this folder and to keep only my data in EN. For now, this is a safety measure until I resolve all issues, etc.
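The last move, from "Optimized" into "OCR & IN EVERNOTE", could be scripted. Below is a rough Python sketch under a few assumptions: the function name `archive_imported` is made up, the folder paths are whatever you pass in, and the script does not talk to Acrobat or Evernote at all. It only moves PDFs that you have already dragged into EN.

```python
import shutil
from pathlib import Path
from typing import List


def archive_imported(optimized_dir: Path, archive_dir: Path) -> List[Path]:
    """Move every PDF from optimized_dir into archive_dir; return the new paths."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for pdf in sorted(optimized_dir.glob("*.pdf")):
        target = archive_dir / pdf.name
        if target.exists():
            # Never overwrite an archived copy; leave the file for manual review.
            continue
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever your folders live.
    base = Path.home() / "Documents"
    for path in archive_imported(base / "Optimized", base / "OCR & IN EVERNOTE"):
        print("archived", path.name)
```

Skipping files whose name already exists in the archive is a deliberate choice here, since this folder is meant as a safety copy.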
I'm also a very young EN user, as I started about seven weeks ago, though I did go premium after my first month.
#11
Posted 08 October 2012 - 11:53
Things were travelling along happily until today when I discovered that my documents are being scrambled in the 'Optimize' process.
A post on another forum describes the problem: it is as if pages in a fax have overlapped. Another example is that any blank areas at the bottom of pages are filled in with content from the top of the page.
Unlike others on this forum, it seems that 'Optimize Scanned PDF' is not doing OCR for me. Using the OCR text recognition tool causes the same issue.
Without close examination it is difficult to pick up the bizarre damage caused to the digital document.
I now have the difficult task of going back and determining when this damage to my digital archive commenced.
I have used Adobe's Acrobat Uninstaller tool and reinstalled the program.
I am running OS X 10.8.2 Mountain Lion.
Has anybody had similar issues? Any help would be appreciated.