(Archived) Exact phrase searches in either web or Mac clients


Posted

I was recently frustrated when trying to find a code snippet that I had saved to Evernote. I knew that the note should contain the exact phrase "End If", and knew that would narrow down the search, but didn't remember much else about it.

I tried surrounding the search phrase in quotation marks, but this didn't work (Mac client), so I repeated the same search in the web client, which produced no better results. I googled around for 10 minutes and found people saying that it should work, but eventually I logged into a Windows PC, tried the search in the Windows client, and IT WORKED.

Just wondering guys, when do you intend to bring this sort of "exact phrase" search capability to either your web or Mac clients? It's a pretty important feature.

Regards, Michael

Posted

I just tested phrase search in Web and Windows client using All Notes search for "american television".

I got the expected, identical results from both clients.

Hmmm. Was the phrase you were searching for in an attached text file?

  • Level 5
Posted

EN is constantly trying to improve cross-platform consistency, but that is easier said than done. EN now has more than 12 different platforms, and it is inevitable that not all of them function the same way.

The search problem you are mentioning in your post IS annoying, I agree. I think EN devs are working on it.

Wern

Posted

Owyn, the problem is that a search with quotation marks should RESTRICT the results - and only show results where the two words occur in the exact order specified, and in immediate proximity to each other.

In the specific case I mentioned - the words "End" and "If" occur in thousands of my notes, but the Evernote PC client is the only client that correctly returns only a handful of notes containing the phrase "End If", rather than thousands of useless results containing the word "end" somewhere and the word "if" somewhere else entirely.

Hey EN devs - care to comment? This is pretty core functionality...

(I don't care if all platforms are not identical, but please give us assurance that at least one platform other than PC will soon be producing correct behaviour for this type of basic search.)

Posted

Hmmm. I did test a quoted (phrase) search. Word search would have produced a lot more hits.

Did a bit more testing using short words. Web client is definitely producing a lot more hits.

Hmmm. Something I read about dictionary pruning in Lucerne... Need to check. Soon...

Posted

Right. Lucene, not Lucerne.

Finally, we perform a fairly elaborate pipeline of transformations on both note text and search expressions in order to normalize the representation of the text for correct comparisons. We have Lucene analysis filters that operate on the sequence of tokens to:
  • Remove apostrophes and other intra-word punctuation
  • Convert upper-case letters to lower case
  • Remove English “stop words” like “the” and “and”
  • Normalize letters with diacritics so that “ñ” becomes “n”
  • Convert “narrow width” Japanese characters to “full width”
  • Reorganize Chinese/Japanese/Korean text into pseudo “words”

Overall, we’ve been happy with the power and flexibility of Lucene. We have found, however, that it has become the most expensive software component in our shard infrastructure. On a busy shard, Lucene makes twice as many IO operations as MySQL, and those operations are less sequential. This means that it’s the top priority for future software optimizations, and also for future hardware tuning to ensure that our shards continue to scale well as we grow.

http://blog.evernote.com/tech/2011/08/25/lucene-explainin/
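To make the quoted pipeline concrete, here is a rough sketch of a few of those steps in plain Python. It is not Lucene and not Evernote's code: the function name `normalize` and the two-word stop list are assumptions for illustration, and the Japanese/CJK steps are left out entirely.

```python
import unicodedata

# Assumption: a tiny stop list holding only the two words the quote names.
STOP_WORDS = {"the", "and"}

def normalize(text):
    """Return illustrative index tokens for `text` (hypothetical helper)."""
    tokens = []
    for raw in text.split():
        # Remove apostrophes and other intra-word punctuation.
        word = "".join(ch for ch in raw if ch.isalnum())
        # Convert upper-case letters to lower case.
        word = word.lower()
        # Strip diacritics: decompose, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", word)
        word = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        # Drop stop words and tokens that were pure punctuation.
        if word and word not in STOP_WORDS:
            tokens.append(word)
    return tokens

print(normalize("The señor's End and Begin"))
```

The same transformation would have to be applied to the search expression as well, which is why a stop word typed inside a query can vanish before any comparison happens.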

  • Level 5*
Posted

Even if it is a limitation of the Lucene technology, shouldn't it work the same across all platforms, or at least the major platforms of EN Web, EN Win, and EN Mac?

Posted

Yeah. This is a definite search inconsistency.

I am opening a support request for a full list of the "English stop words".

Posted

From Evernote Support

For your reference, here is the full list of "stop words" that should be working across all platforms:

"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"

Posted

The Windows client uses SQLite FTS3 (not Lucene) to maintain its index, and FTS3 has no native support for stop words. This is why the results on Windows differ from the server's.
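For anyone who wants to see this behaviour directly, a minimal sketch with Python's built-in `sqlite3` module follows. It assumes the SQLite build bundled with your Python has FTS3 enabled; the table name `notes`, the column `body`, and the sample notes are made up, and this says nothing about the Windows client's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS3 virtual table with the default "simple" tokenizer, which has no stop list.
conn.execute("CREATE VIRTUAL TABLE notes USING fts3(body)")
conn.executemany(
    "INSERT INTO notes(body) VALUES (?)",
    [
        ("If x > 0 Then y = 1 End If",),
        ("In the end, nobody asked if it worked",),
        ("End of quarter report",),
    ],
)

def search(query):
    """Run a full-text MATCH query and return matching note bodies."""
    rows = conn.execute(
        "SELECT body FROM notes WHERE notes MATCH ? ORDER BY rowid", (query,)
    )
    return [body for (body,) in rows]

# Quoted query: "end" must be immediately followed by "if".
phrase_hits = search('"end if"')
# Unquoted query: both words must appear, in any position.
word_hits = search("end if")

print(phrase_hits)
print(word_hits)
```

Because "if" is indexed like any other token here, the quoted query can narrow the results to the one note where the two words are adjacent.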

Posted

Yep. In this case I am more than willing to live with the inconsistency. Unlikely to use such a phrase in a saved search. Very likely to use it in an ad hoc search.

  • Level 5*
Posted

Shouldn't the search for an exact phrase (in quotes) override the stop words?

The OP was searching for "End If".

What about EN Mac, does the database it uses have any stop words?

Posted

Because "if" is a stop word, the phrase is reduced to "end".

If Mac and Web give the same results for a reducible phrase, then Mac must support stop words.
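A small sketch of that reduction, using the stop list Evernote Support posted earlier in this thread. The helper names `index_tokens` and `phrase_match` and the sample notes are hypothetical, and the whitespace tokenizer is a simplification rather than what any Evernote client actually does.

```python
# Stop list as quoted from Evernote Support in this thread.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}

def index_tokens(text, drop_stop_words):
    """Lower-case and split on whitespace; optionally drop stop words."""
    words = text.lower().split()
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words

def phrase_match(note, phrase, drop_stop_words):
    """True if the phrase tokens occur consecutively in the note tokens."""
    note_toks = index_tokens(note, drop_stop_words)
    query = index_tokens(phrase, drop_stop_words)
    if not query:
        return False
    n = len(query)
    return any(note_toks[i:i + n] == query for i in range(len(note_toks) - n + 1))

notes = ["Loop Until done End If", "the end of the story", "if only"]

# With stop words removed, "End If" collapses to ["end"].
with_stops = [n for n in notes if phrase_match(n, "End If", True)]
# With every word indexed, the phrase must match as two adjacent words.
without_stops = [n for n in notes if phrase_match(n, "End If", False)]

print(index_tokens("End If", True))
print(with_stops)
print(without_stops)
```

With the stop list applied, every note containing "end" is a hit, which is consistent with the thousands of results reported on Web and Mac; with every word indexed, only the note holding the adjacent pair matches.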

Archived

This topic is now archived and is closed to further replies.
