Fujitsu ScanSnap: possible to add OCR later? How?
#1
Posted 10 February 2012 - 04:17 PM
As there are quite a lot of Fujitsu ScanSnap users on this forum, I hope somebody can answer my question.
I use my ScanSnap S1300 to scan documents to PDF and into Evernote, both with and without OCR. The advantages of adding OCR are evident, but for large piles of paper it takes quite a bit of extra time. Often I doubt whether I really need OCR for a particular document. It would be very helpful if there were a way to add OCR at a later date, when it turns out that I do want to search the text or copy and paste fragments of it. Does anybody know whether this is possible, and how? Many thanks.
#2
Posted 10 February 2012 - 04:28 PM
If you are on a Mac, you can drag the PDF onto the ABBYY FineReader icon in the Finder. I am on my iPad so this is from memory, but I believe it is under /Applications/ScanSnap/Scan to Searchable PDF or something like that.
If you are on Windows, I don't think that works, but what you can do is throw your ScanSnap-scanned PDFs into the ScanSnap Organizer software that came with your scanner. It can then OCR them for you.
There are of course a zillion other applications that will do this for you, but I am just pointing out some ways to do it with the software that came with your scanner.
http://www.documentsnap.com
#3
Posted 12 February 2012 - 05:19 PM
#4
Posted 14 February 2012 - 01:01 PM
bduncan, on 10 February 2012 - 04:28 PM, said:
This was the magic nugget... thanks.
When I bought my ScanSnap it came with an Adobe product that can also do OCR, but I've not yet tried it out fully. I know it can work, but I haven't tried saving the results to see whether it embeds the OCR'd text in the PDF or not.
#5
Posted 15 March 2012 - 09:18 PM
That's what I do.
#6
Posted 22 March 2012 - 12:30 AM
kennychua, on 15 March 2012 - 09:18 PM, said:
That's what I do.
So I understand your process, the PDF is scanned into an Evernote note first; and then you right click on the PDF located in the Evernote program and click "open with Adobe Acrobat"?
#7
Posted 22 March 2012 - 12:37 AM
- Fujitsu ScanSnap S1500M
- Adobe Acrobat 9 Pro for Mac (included with scanner)
- Evernote (free version)
2. OCR in Acrobat Pro
3. Optimize in Acrobat Pro
4. Save in a single folder on local hard drive titled something like "OCR & Evernote"
5. Copy to Evernote and sync
File sizes are 1/10 of the original or smaller, and in my trials the Acrobat OCR is superior to ABBYY FineReader for Mac and PDF OCR X. Also, Acrobat 9 Pro for Mac works great on my Mac running Lion (10.7.3). I also like to see the archived PDFs in their original color schemes, so I scan using "auto color detection".
Currently I use the free version of Evernote, though despite my initial reservation, Evernote functions great, so I'll likely upgrade to the "yearly" option. I initially hesitated to install Acrobat 9 Pro on my Mac due to the large amount of negative net chatter that I read. Instead I spent hours trying out various other OCR options. In the end, I installed Acrobat 9 Pro and found it works better than the others that I tried. It's a robust program and comes free with the ScanSnap S1500M.
One of the ways Acrobat 9 Pro is better than the others I've tried is its ability to OCR documents with multiple, inconsistent layouts. For example, some of my utility bills have headers with typical "to" and "from" info, usage charts that run the width of the document, narrow columns on one side of the document, additional tables of varying columns and rows, and then paragraphs. This is all on one standard letter-size document. Acrobat 9 Pro OCR'd 99% of the text correctly, including "$", "#", "@", and " " (spaces). I cannot say the same for the other programs. The odd thing is I don't really want to like Adobe Acrobat 9 Pro. I want to prefer a program developed by a smaller, leaner competitor.
#8
Posted 22 March 2012 - 01:38 PM
#9
Posted 21 April 2012 - 05:31 AM
It should be noted that if you run the PDF optimizer in Acrobat, it will automatically OCR for you at the same time that it optimizes. Furthermore, I've discovered that this combined process is a lot faster than having ScanSnap do the OCR and then separately letting Acrobat optimize. The resulting file size is much smaller. I should mention that I do all this in a folder that I call "PDF holding", which I have set as the default folder in all of my ScanSnap profiles. After I have done all my tweaking, optimizing, page shuffling, etc. in this folder, I save the file as a "reduced size PDF" directly into my EN import folder. This further reduces the size by about 10% or more and places the file in the import folder, which makes it magically appear in EN. Therefore, it is my EN import folder that contains the final version of the PDF, not the "PDF holding" folder, which is simply a transitional station. I generally delete most of the files in the holding folder, but I back up the files in the import folder.
#10
Posted 22 April 2012 - 11:08 AM
You do something similar to me.
You're right about the OCR/optimization being run in the same process.
I first scan all my documents to a folder I call "ScanSnap Temp". I manually rename every file using an intuitive naming scheme that looks something like this:
date - tag - tag - tag...
The "tag" is really a key word or key words. The date is in this format "yyyy.mm.dd".
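For anyone who would rather script the renaming than type it by hand, here is a minimal Python sketch of that naming scheme. The function name `build_name` and the example keywords are made up for illustration; the only things taken from the description above are the "yyyy.mm.dd" date format and the " - " separator.

```python
from datetime import date


def build_name(doc_date: date, keywords: list[str], ext: str = ".pdf") -> str:
    """Return a file name of the form 'yyyy.mm.dd - keyword - keyword.pdf'."""
    # Trim stray whitespace and skip empty keywords so the name stays tidy.
    cleaned = [k.strip() for k in keywords if k.strip()]
    # Date first, in yyyy.mm.dd form, so files sort chronologically by name.
    parts = [doc_date.strftime("%Y.%m.%d")] + cleaned
    return " - ".join(parts) + ext


print(build_name(date(2012, 4, 22), ["electric bill", "utilities"]))
# prints: 2012.04.22 - electric bill - utilities.pdf
```

Putting the date first in year-month-day order means a plain alphabetical sort of the folder is also a chronological sort.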
After I rename all the files, I then choose the option in Adobe Acrobat 9 Pro to OCR all documents in the folder "ScanSnap Temp". My OCR is automatically set to do the following:
1. OCR each PDF
2. Optimize each PDF, including reduce file size
3. Place completed file in a folder called "Optimized"
Right now I manually drag my files from "Optimized" into EN. I then move the files from the "Optimized" folder into a folder called "OCR & IN EVERNOTE". That's right, I currently keep a separate copy of the PDF in this folder on my hard drive. At some point, I plan to delete this folder and to keep only my data in EN. For now, this is a safety measure until I resolve all issues, etc.
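The manual drag-and-move step could probably be scripted. The sketch below is only an illustration under assumptions: it presumes an Evernote import folder has been set up (as mentioned earlier in the thread), and the function name `file_after_ocr` is hypothetical. It copies each PDF from "Optimized" to the import folder and then moves the original into the local safety folder.

```python
import shutil
from pathlib import Path


def file_after_ocr(optimized: Path, evernote_import: Path, archive: Path) -> list[str]:
    """Copy each optimized PDF to the import folder, then move it to the archive."""
    evernote_import.mkdir(parents=True, exist_ok=True)
    archive.mkdir(parents=True, exist_ok=True)
    handled = []
    for pdf in sorted(optimized.glob("*.pdf")):
        # Copy first, so Evernote's import folder gets its own file.
        shutil.copy2(pdf, evernote_import / pdf.name)
        # Then move the original into the local safety-copy folder.
        shutil.move(str(pdf), str(archive / pdf.name))
        handled.append(pdf.name)
    return handled


# Example call (all three paths are placeholders, adjust to your own setup):
# file_after_ocr(Path("Optimized"), Path("EN import"), Path("OCR & IN EVERNOTE"))
```

Files that are not PDFs are left untouched in "Optimized", and nothing is deleted: every PDF ends up in both the import folder and the safety folder.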
I'm also a very new EN user, as I started about seven weeks ago, though I did go premium after my first month.