dlrh

'1 PDF has not been indexed'


I'm a premium user, with a client on the Mac and on iPad.

I'm using Evernote for teaching, and have accumulated a nice trove of articles, many of which are PDF, both the image type and text type.

When I try and do a simple search for a term I know is in one of those PDFs, I come up empty every time.

My organizing is not the best. One note, for example, has a bunch of these PDF files collected together. When I select that note and get the info on it, I see:

29 PDF files have not been indexed

What am I missing here? Is there something I have to do to force them to BE indexed? None are over the size limit. The mix of image type and text type should, I think, allow for indexing all of the content of the PDFs.

Help appreciated.


Normally, they should be indexed pretty quickly for premium members. You did sync, right? :) For completeness' sake, these are Evernote's OCR criteria (copied from an old post by jbenson2):


  • The raw PDF is 25 megabytes or less.
  • The scan contains no more than 100 pages.
  • The raw PDF doesn't already contain "searchable" text that you can select and copy.
  • The PDF isn't encrypted or protected with a passphrase.
  • The PDF is not of a handwritten document.
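
For anyone who wants to check a batch of files against that list, here is a minimal sketch in Python. It only restates the five criteria above; the `PdfInfo` record and the `ocr_blockers` function are hypothetical names made up for illustration, not part of any Evernote API, and you would have to fill in the values yourself.

```python
from dataclasses import dataclass

MAX_SIZE_MB = 25   # size limit from the criteria list
MAX_PAGES = 100    # page limit from the criteria list


@dataclass
class PdfInfo:
    """Hypothetical description of one PDF attachment (not an Evernote type)."""
    size_mb: float
    pages: int
    has_selectable_text: bool
    encrypted: bool
    handwritten: bool


def ocr_blockers(pdf: PdfInfo) -> list[str]:
    """Return the reasons this PDF falls outside the OCR criteria (empty if none)."""
    reasons = []
    if pdf.size_mb > MAX_SIZE_MB:
        reasons.append("larger than 25 MB")
    if pdf.pages > MAX_PAGES:
        reasons.append("more than 100 pages")
    if pdf.has_selectable_text:
        reasons.append("already contains selectable text")
    if pdf.encrypted:
        reasons.append("encrypted or passphrase-protected")
    if pdf.handwritten:
        reasons.append("handwritten document")
    return reasons
```

An empty list means the file meets every criterion for OCR. Note that a PDF with selectable text is listed as a blocker for OCR only; whether such a file is still searched is a separate question that the list does not answer.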

For clarification, do you search for notes and no notes come up, or for text occurrences within notes, and no text is highlighted?

Sometimes with this kind of problem, a complete uninstall, reinstall and clean re-sync of Evernote helps (but first, make sure everything is synced and you have a local backup as well).


In answer to the reply above:

Yes, I have ensured that my uploaded PDFs fit the criteria - they all do. Except... I don't quite understand the implications of 'The raw PDF doesn't already contain "searchable" text that you can select and copy.' That seems to suggest that these PDF files will not be searched?

As to the search notes or text occurrences within notes part - I believe I am doing both. Here's what I do:

- To search my notes, I click on the left hand column area and enter the search term.

- To search for text occurrences within notes, I click on a given note and enter the search term.

When I do this, I get these results, with the following word examples.

Word: "Reconsidered" (I do not use the quotes)

In my notes, I know that I have the following PDF article that IS in text, 'Frame Analysis Reconsidered'. It's from JSTOR. I can open it up, see the title page and the title and if I click on that word, I can then paste it into another document. I think this means it is a word in the text.

When I search on 'Reconsidered' though, I get two, and only two, different documents, neither of which is the above. What's more, the second one of these is a PDF file, it is also text, but if I get the Info on that, it says '1 PDF file has not been indexed'.

So I am totally stumped on this.

Do I open a support request?


I tried to search for indexed text in a pdf today. What I found was that when I searched for text I knew to be in a document ON MY MAC - it didn't find anything. If I logged into the web version, and did the search - it found the document I was expecting. So it seems the indexing isn't getting passed back to the mac during sync.

(This was a pdf I scanned in from this mac, it was synced up to evernote's cloud and was indexed, but the ability to search the pdf text never made it back to my mac.)

As I understand it as a premium member, I should get indexed search of pdf's on the mac, not just on the web - is that correct?


Interestingly - now that I manually synced (under the File menu item) vs. the sync button or the auto sync that evernote always does when I add a note - the pdf's on my mac now are indexed and the search results are the same (mac vs web)


I have noticed that the info for every one of my notes with PDF files says "# PDF files have not been indexed" (where # is the PDF count), even though my uploaded PDFs fit the EN criteria.


I did a search and one time it came up finding it inside a PDF. I tried the same search and now it won't find it in the same pdf. This happens on my mac and on my iPhone.


That's strange. I'm using local OCR myself, so I can't verify whether this is a general problem. Maybe Evernote support can help.


That's strange. I'm using local OCR myself, so I can't verify whether this is a general problem. Maybe Evernote support can help.

Bushwhacker, How are you doing local OCR, Software? Which One?


Indeed, things are a bit odd!

Today, I find that using the desktop client on my Mac (not the web access), I *can* find a term in a PDF file. It's a bit quirky:

- The actual, visible "location" I see on my screen is typically *below* the actual text in the PDF file, but is spot on for any other, text-like file;

- If a PDF is set to 'View As Attachment', I lose the visual pointer completely once I open it up, so setting things to be 'View Inline' is the only way to get useful results.

To make matters more confusing, I find that a compound search gets me really weird results.

If I type, without quotes, this phrase into the search box:

feminist movement

I get a slew of hits, and the returns have either the term 'feminist' or 'movement'.

If I type, with quotes, this phrase into the search box:

"feminist movement"

I get two hits. The first is an html clipped article, and that has two visible hits: "feminist" and "movement" in two DIFFERENT places in the document. The second is a PDF file, and that is a single term (again with the odd visual offset - the actual found term is above what I am shown on the screen).

To be sure I am doing things correctly, I looked at the Evernote page on search grammar [ http://www.evernote.com/about/developer/api/evernote-api.htm#_Toc297053079 ] and found this (some trimmed for clarity here):

C.1.2. Matching literal terms

If no advanced search modifier is found in a search term, it will be matched against the note as a text content search. Words or quoted phrases must exactly match a word or phrase in the note contents, note title, tag name, or recognition index. Words in the content of the note are split by whitespace or punctuation. [snip] E.g.:

  • "San Francisco"
    matches: "The hills of San Francisco"
    does not match: "San Andreas fault near Francisco winery"

From this, I infer that my search for the phrase "feminist movement" should NOT have found one of my instances, but it did.
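
To make that reading of the rule concrete, here is a minimal sketch in Python of how I understand the quoted text: split on whitespace or punctuation, then require a quoted phrase to appear as consecutive words. This is only my interpretation of the documentation excerpt, not Evernote's actual search code; the function names are made up, and ignoring upper/lower case is my own assumption since the excerpt does not mention it.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split on whitespace and punctuation; lower-casing is an assumption."""
    return [w.lower() for w in re.findall(r"\w+", text)]


def matches_word(term: str, text: str) -> bool:
    """A single word must exactly equal one word of the text."""
    return term.lower() in tokenize(text)


def matches_phrase(phrase: str, text: str) -> bool:
    """A quoted phrase must appear as consecutive words, in order."""
    needle, hay = tokenize(phrase), tokenize(text)
    n = len(needle)
    return n > 0 and any(hay[i:i + n] == needle for i in range(len(hay) - n + 1))
```

With these definitions, `matches_phrase("San Francisco", "The hills of San Francisco")` is true and `matches_phrase("San Francisco", "San Andreas fault near Francisco winery")` is false, which agrees with the two examples in the documentation. A note that contains "feminist" and "movement" only in different places would likewise fail the phrase test while passing `matches_word` for each word separately.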

So I am still lost on this. I don't rely on search a lot, but it would be nice to have it consistent.

Any ideas? Anyone?

From this, I infer that my search for the phrase "feminist movement" should NOT have found one of my instances, but it did.

This sounds like a bug, and since you have what sounds like an easily repeatable test case, you should probably file a support inquiry. See the link in my signature.


That's strange. I'm using local OCR myself, so I can't verify whether this is a general problem. Maybe Evernote support can help.

Bushwhacker, How are you doing local OCR, Software? Which One?

Sorry for the late reply. I'm using the OCR facilities that came with my Fujitsu ScanSnap's software. This functionality is sort of embedded in the drivers/software suite, so I can't isolate it.

From this, I infer that my search for the phrase "feminist movement" should NOT have found one of my instances, but it did.

This sounds like a bug, and since you have what sounds like an easily repeatable test case, you should probably file a support inquiry. See the link in my signature.

Well, I did issue a request for support and, sadly, all I got back was a repeat of the PDF specifications, i.e. size, not text, etc.

That kinda leaves me in the position of having to NOT rely on the search function for much at all. I've taken to trying to organize my notes so as to reduce the possible need TO search. This is rather disappointing.

And it still leaves me with the same easily reproduced bug: a quoted phrase simply does NOT produce accurate search results.

So, word to the wise: DO NOT EXPECT THE SEARCH FUNCTION TO YIELD USABLE RESULTS. Plan accordingly.


From the Mac's Mail client, I use the "Send PDF to Evernote" command to quickly get email messages stored in Evernote. This is a great feature and allows me to see important emails in Evernote on my Mac, PC and iOS devices.

Recently, I noticed that the PDFs were not being indexed. For every note that contains a PDF, when I click on the Information button on the right hand side, in the 'Attachment Status' row the message is always '1 PDF has not been indexed'. I thought it strange that this was the case given that I am a premium user.

I wrote to the support people telling them about the issue. I had read elsewhere that the indexing could take some time so I used a note that was created back in January 2012 as my example. Please see attached PDF screen shot.

One brilliant response from their supposed expert was "Please do File>Sync". What sort of solution is that? Does this supposed expert really believe that my database has not been synced since January? Get real and get another job.

Is anyone else having this issue on a Mac? I think the problem is the same on my PC but I don't really know how to get the information about the note in the detail that is displayed on the Mac.

Any help would be greatly appreciated.

Screen1.pdf


One brilliant response from their supposed expert was "Please do File>Sync". What sort of solution is that? Does this supposed expert really believe that my database has not been synced since January? Get real and get another job.

Nothing wrong with the question. If the file is in a non-sync'd notebook, it won't get OCR'd.

Evernote support has to start with basic fundamental issues first, then work up to the more complex issues.

And did you respond to their question?


I think this may be a problem with either the intent of the "Attachment Status" field, or with the actual setting of this field by Evernote.

My best guess is that this field reports ONLY on the status of PDF files that are only image-based, not those PDF files that are text-based.

After you submit an image-based PDF to Evernote, the Evernote server does an OCR of the PDF, and then creates an index of this OCR text. After the OCR/index process is completed, then Evernote updates your Note, changing the "Attachment Status" to "Indexed".

If you attach a text-based PDF, like those from your Mail app, then OCR is not necessary.

Evernote will still index the text in your PDF; it just does not update this "Attachment Status" field.

You can test this by viewing the PDF in "inline" mode, and then entering text in the Search block that is in the PDF.

Evernote should highlight the text in your PDF.
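
To spell out the guess above, here is a tiny model of it in Python. It is purely illustrative: the `Pdf` record, its fields, and both functions are hypothetical, and it describes only what I am guessing the behavior to be, not how Evernote actually sets the field.

```python
from dataclasses import dataclass


@dataclass
class Pdf:
    """Hypothetical attachment record; the field names are illustrative only."""
    has_text_layer: bool     # True for text-based PDFs (e.g. created from Mail)
    ocr_done: bool = False   # set once server-side OCR has finished


def attachment_status(pdf: Pdf) -> str:
    """Status under the guess: it tracks OCR only, so text PDFs never change."""
    return "Indexed" if pdf.ocr_done else "Not indexed"


def is_searchable(pdf: Pdf) -> bool:
    """Under the same guess, text PDFs are searchable without any OCR."""
    return pdf.has_text_layer or pdf.ocr_done
```

If the guess is right, a PDF sent from Mail would behave like `Pdf(has_text_layer=True)`: reported as "Not indexed" while still being searchable, which would explain the confusing message.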

IMO, Evernote needs to change this Attachment Status field:

  1. Rename it to OCR/Index Status
    OR
  2. Set the field immediately to "Indexed" if the PDF is text-based.

Perhaps someone from Evernote can confirm or correct my guess about how this field is set/used.


One brilliant response from their supposed expert was "Please do File>Sync". What sort of solution is that? Does this supposed expert really believe that my database has not been synced since January? Get real and get another job.

Nothing wrong with the question. If the file is in a non-sync'd notebook, it won't get OCR'd.

Evernote support has to start with basic fundamental issues first, then work up to the more complex issues.

And did you respond to their question?

Thanks for the reply. Yes, I did reply and I understand what you are saying. However, given that it takes 24 hours to get a reply, I feel that the support person should have gone on to state other possible solutions.


Thanks for the reply. Yes, I did reply and I understand what you are saying. However, given that it takes 24 hours to get a reply, I feel that the support person should have gone on to state other possible solutions.

I agree that the Evernote CSR could have/should have gone further in their initial response.

The CSR could have said, "Please make sure you have sync'd with the Evernote cloud after you attached the PDF. If you have sync'd then the problem could be . . ."

Since by default Sync is automatic, asking if you have sync'd is like asking have you powered up your computer. ;)

I would think that by now this is a fairly common question, and that Evernote would have a prepared response that covers most causes of failure to index.


Thanks JMichael. I thought that might be the case. I changed the PDF to inline and Voila! I did a search for the word 'Titan' which I know is in the pdf and it highlighted three instances. Thanks very much.

This of course brings up another issue. If I have to search for the word 'Titan' and it appears in multiple PDFs that I have in my notes, I won't find the term unless the PDFs are in inline mode. Is there a way for the search to occur in the PDFs without having to have all of them in inline mode?

For example, every time I send a PDF to Evernote from Mail, I would have to select it and make it inline each and every time, so that I could search the PDFs some time in the future. This I think defeats the usefulness of the search function.

Do you know if this is a feature that is being looked at for inclusion in the future?

Thanks again for your reply. You did much better than tech support!


I don't think you have to show the PDF inline in order for the Search to work.

The Search will still find the Note, but you will not be able to see the found text because it is not inline.

After the Note is found you can then either view the PDF inline, or open it up in Preview (or other PDF reader) and search for the text.


I don't think you have to show the PDF inline in order for the Search to work.

The Search will still find the Note, but you will not be able to see the found text because it is not inline.

After the Note is found you can then either view the PDF inline, or open it up in Preview (or other PDF reader) and search for the text.

I thought that might be the case. So it looks like I have to leave the PDFs inline or depend heavily on tags.

I hope the search function is extended in the future so that it can find words in PDFs that are not inline.

Thanks for all your help JMichael. I now don't have to wait 24 hours to hear back from support.

Cheers, and thanks again.


I don't think you have to show the PDF inline in order for the Search to work.

The Search will still find the Note, but you will not be able to see the found text because it is not inline.

After the Note is found you can then either view the PDF inline, or open it up in Preview (or other PDF reader) and search for the text.

I thought that might be the case. So it looks like I have to leave the PDFs inline or depend heavily on tags.

I hope the search function is extended in the future so that it can find words in PDFs that are not inline.

No, this is incorrect. Please let me clarify my previous remarks.

You do NOT have to view the PDF inline in order for the Search to find the Note that contains the PDF with the text.

The Search *will find* the Note. You will just have to open the PDF (either inline or by external app) to view the found text.


Hi there,

 

If you press the i icon in a note you can see some metadata for that note. In my current client (Evernote 5.4.3 on Mac), the info for every note with a PDF attachment shows the text '1 PDF has not been indexed'.

 

This is just not true.

 

Do more people have this? I think this is a bug.

 

Regards,

 

Bram Heerink


same here... i clip a PDF file and it is never uploaded - metadata says "PDF not indexed" but a preview picture is created.

 

Upload takes some time and upload volume is counting down so i assume the file IS uploaded. But it never appears in my client or the web interface or any other device.

 

reinstalled from scratch, same problem - this IS a bug. But the support team is just telling me my OS is too old (i still have to run SL for some Rosetta Apps :/ )

 

Does anyone have 5.4.2 still available?

 

Mike


same here... i clip a PDF file and it is never uploaded - metadata says "PDF not indexed" but a preview picture is created.

 

Upload takes some time and upload volume is counting down so i assume the file IS uploaded. But it never appears in my client or the web interface or any other device.

 

reinstalled from scratch, same problem - this IS a bug. But the support team is just telling me my OS is too old (i still have to run SL for some Rosetta Apps :/ )

 

Does anyone have 5.4.2 still available?

 

Mike

 

This is not my problem. I do not have a problem with uploading and indexing of my PDF files. But the Note info message about indexed state is just wrong.

 

Regards,

 

bram


Which OS are you on? The "nothing is uploaded" part seems to be limited to Snow Leopard.


While this may appear to be derogatory, it is meant to be constructive. Sometimes you have to shake the tree.

I must say, I very much appreciate the "live chat" feature that is available to Premium users. I think that, without it, I would have abandoned the product out of hopelessness.

Yet it is really a very sad state of affairs, this message thread regarding "PDF has not been indexed". It is an instance of a much, much bigger problem and shows many symptoms of a lack of leadership by the company.

First, I cannot, for the life of me, understand why Evernote does not carefully examine the questions and attempted answers in message threads such as this one, that is, threads which concern users being unable to get Evernote to behave as it is advertised to do, or as users expect that it should, and why Evernote fails to compile the solution into a new FAQ and provide a clear link to that FAQ.

Second, you have users, taking time out of their undoubtedly busy lives, trying with unfailing politeness and courtesy to adequately describe a problem they are having, and submitting it in the desperate hope that they will obtain a quick and definitive answer to what to them may seem a question that should have been answered and documented already, and to which they should not have to struggle to get an answer.

Third, who among us, experiencing the same problem described, has the time to read such a lengthy message thread, one containing twenty-seven responses, in order to identify the solution to what is apparently a relatively widespread problem?

Fourth, I am confident that I am not the only customer who finds themselves thinking, "I aspire to be a user!" I am becoming tired and discouraged about this product, notwithstanding its remarkable benefits, when I find myself often stuck, unable to get answers. Or am I the only one who finds this software to be superb for capture yet confusing to use consistently and systematically for the compilation of information into actionable knowledge?

Fifth, IMO underlying these phenomena, these message threads about user problems, is a lack of leadership by the company. Someone at the senior level needs to step up and say, we are not merely going to use the discussion forum as a method of inexpensive support, where customers (out of the goodness of their hearts) help each other, and then occasionally an Evernote employee will chime in with the "definitively correct answer". (BTW: Where is the list of all the problems solved?)

Sixth, rather than exploit customers' goodwill in this manner, someone at the senior level needs to step up and say, we are going to use the discussion forums systematically in order to improve the product and its documentation (including but not limited to FAQs) and measurably, continuously improve the features, functions, interface, and solution performance. Or is software development basically still like the wild, wild West?

And what is it with that brand new web interface that is so lean and spare that many of the familiar controls cannot be found, are missing in action? Whose awful dream was that? It has all the signs of a mistaken diagnosis of some interface problems, coupled with a "fashionable" solution. It's as though someone drank too much of the de-skeuomorphic punch. That lean and spare interface has been an utter disaster.

I just don't get it. I can only imagine that the company has entrained itself into a narrow furrow and cannot see over the edges.

Undoubtedly there will be responses that will treat my critique as though I'm failing to read the forum closely enough.

I do not wish to, nor am I able to. Such an answer would fail to recognize that abandoning the product may be a real choice for some people, and easier than having to be my own tech support.

Why should I have to perform textual exegesis? Why,
a) when I have so little time,
b) when I have so many more critical things demanding my attention,
c) when I believe that it is the company itself that should be doing it more systematically, and
d) when their failure to do so reflects bad faith and does not portend well for the product's future?

It does not portend well, unless the ecosystem of other, third-party companies and products that complement Evernote helps Evernote, the company, to catch up with where solutions are really needed.

 

 

 
