PDF or PNG


Anyone know why Evernote scans end up as individual PNG pages? Surely in this day and age, when you scan a document, logic dictates that all the pages remain in the same document, in a format that is designed for text, not images.

That's it.

Link to comment
  • Level 5*
22 minutes ago, jmshrrsn said:

Anyone know why Evernote scans end up as individual png pages?

Hi. Because you need to change the app or a setting somewhere? "Logic" has no part in this. If you're scanning with the mobile app, check the image thumbnail for options to save as document / image etc., but PDF is not (AFAIK) available. Other third-party apps like Adobe Scan and Office mobile will save to PDF files, which you can then attach to notes. If you're using a desk scanner, check the settings there for options to save as images or as one- or multi-file PDFs.

Link to comment
  • Level 5*
2 hours ago, jmshrrsn said:

Anyone know why Evernote scans end up as individual png pages?

I use Evernote's Scannable app with an iPhone/iPad
Option of individual image files or multi-page pdfs

Other scanner apps provide the same option

  • Like 2
Link to comment
  • Level 5

EN organizes the scans in the notes. The note works as a container for the scans. Remember, it is not built as a multipurpose scanner; it is built into EN to support its features.

In addition, EN will only OCR handwriting in picture files (PNG, JPG, GIF). Scanning in this format means handwritten notes converted by the scanner will get indexed.

If you want to generate PDFs as an option, you can install the Scannable app provided for free by EN.

Link to comment

A PDF can contain images and text. If you scan a page and the scanner outputs a PDF, this will just be a container for an image file. It won't contain text.

The only advantage of having a PDF is that it can contain multiple pages (each of which would be a photo of the page).

To read the text in an image, Evernote (and Google notes, for example, though I imagine lots of others do this too) uses a technology called OCR (optical character recognition). Nowadays this can be done accurately and quickly by training a convolutional neural network. Just look at pretty much any tutorial about neural networks to find out more (recognizing characters is a sort of "hello world" exercise in this field).

I've never tried to see if Evernote also runs an OCR over images stored in PDF files. @PinkElephant's answer would suggest that this isn't the case, which is a pity 😕

(Btw. Adobe often does some things resembling magic on images, so it's quite possible that they're very good at extracting text from scans and creating a PDF with text in it. However I'm not sure they'd also generate a font for every image)

Link to comment
  • Level 5

There are help documents about what EN does in the OCR field.

First, for me OCR means I get a text layer and can extract the text. Not so with EN: all the OCR they do is to build the search index. The text is not placed in a text layer. Second, the OCR result is never exported. Something searchable in EN becomes a dull picture file (or a PDF with a dull picture inside) once exported from EN.
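To make the difference between a search index and a text layer concrete, here is a rough sketch in Python. It is not how EN actually implements anything; the note IDs, the sample "recognized" text, and the functions `build_index` and `search` are all made up for illustration. The point is only that an index maps words to notes, so it can answer "which note contains this word?" without the text ever being stored inside the file itself.

```python
import re
from collections import defaultdict


def build_index(ocr_results):
    """Map each recognized word to the set of note IDs that contain it."""
    index = defaultdict(set)
    for note_id, recognized_text in ocr_results.items():
        for word in re.findall(r"\w+", recognized_text.lower()):
            index[word].add(note_id)
    return index


def search(index, query):
    """Return the sorted IDs of notes containing every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    # .get() avoids adding empty entries to the defaultdict for unknown words.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Hypothetical OCR output keyed by note ID (assumed data, not real EN output).
ocr_results = {
    "note-1": "Invoice 2021 electricity",
    "note-2": "Handwritten shopping list: milk, eggs",
}
index = build_index(ocr_results)
print(search(index, "electricity invoice"))  # ['note-1']
```

Exporting the scan from such a system gives you only the picture, because the words live in the index and not in the file.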

Luckily, today there are a lot of programs you can throw the picture file at, and they will extract the text nicely. No need to train an AI yourself.

What does EN do? They OCR (EN style) the text from any PDF, but not handwriting. They OCR any text, including handwriting, from picture files. It is done in a server process, which means it may take a little while before a new scan becomes searchable.

About pdf indexing:   https://help.evernote.com/hc/en-us/articles/208313388

About picture indexing:    https://help.evernote.com/hc/en-us/articles/208314518
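
The rules described in this post can be written down as a small lookup. This is only a simplified restatement of the post, not EN's actual logic; the function name `is_indexed` and the set of file types are assumptions, and the linked help articles remain the authority.

```python
# Picture formats named earlier in this thread (assumed list, not exhaustive).
IMAGE_TYPES = {"png", "jpg", "jpeg", "gif"}


def is_indexed(file_type, handwritten):
    """Say whether text in a scan would become searchable, per this post."""
    kind = file_type.lower().lstrip(".")
    if kind in IMAGE_TYPES:
        # Picture files: any text is indexed, handwriting included.
        return True
    if kind == "pdf":
        # PDFs: text is indexed, but handwriting is not.
        return not handwritten
    # Other file types are not covered by the post.
    return False


print(is_indexed("png", handwritten=True))   # True
print(is_indexed("pdf", handwritten=True))   # False
print(is_indexed("pdf", handwritten=False))  # True
```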

  • Like 1