(Archived) Exact phrase searches in either web or Mac clients


I was recently frustrated when trying to find a code snippet that I had saved to Evernote. I knew that the note should contain the exact phrase "End If", and knew that would narrow down the search, but didn't remember much else about it.

I tried surrounding the search phrase in quotation marks, but this didn't work (Mac client), so I repeated the same search in the web client, which produced no better results. I googled around for 10 minutes and found people saying that it should work, but eventually I logged into a Windows PC, tried the search in the Windows client, and IT WORKED.

Just wondering guys, when do you intend to bring this sort of "exact phrase" search capability to either your web or Mac clients? It's a pretty important feature.

Regards, Michael

Link to comment

I just tested phrase search in Web and Windows client using All Notes search for "american television".

I got the expected, identical results from both clients.

Hmmm. Was the phrase you were searching for in an attached text file?

Link to comment

EN is constantly trying to improve cross-platform consistency, but that is easier said than done. EN now has more than 12 different platforms, and it is inevitable that not all of them function the same way.

The search problem you are mentioning in your post IS annoying, I agree. I think EN devs are working on it.

Wern

Link to comment

Owyn, the problem is that a search with quotation marks should RESTRICT the results - and only show results where the two words occur in the exact order specified, and in immediate proximity to each other.

In the specific case I mentioned - the words "End" and "If" occur in thousands of my notes, but the Evernote PC client is the only client that correctly returns only a handful of notes containing the phrase "End If", rather than thousands of useless results containing the word "end" somewhere and the word "if" somewhere else entirely.
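To make the difference concrete, here is a minimal Python sketch of the two behaviours described above. It is only an illustration, not how any Evernote client is implemented; the function names and the sample notes are made up for the example.

```python
import re

def tokenize(text):
    """Split text into lower-case word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_all_words(note, query):
    """True if every query word occurs somewhere in the note (word search)."""
    note_tokens = set(tokenize(note))
    return all(word in note_tokens for word in tokenize(query))

def matches_phrase(note, query):
    """True if the query words occur adjacent and in order (phrase search)."""
    note_tokens = tokenize(note)
    phrase = tokenize(query)
    n = len(phrase)
    if n == 0:
        return False
    return any(
        note_tokens[i:i + n] == phrase
        for i in range(len(note_tokens) - n + 1)
    )

# Hypothetical sample notes: one code snippet, one ordinary sentence.
notes = [
    "If x > 0 Then\n    y = 1\nEnd If",
    "If you read to the end of the page, let me know.",
]

for note in notes:
    print(matches_all_words(note, "End If"), matches_phrase(note, "End If"))
```

Both notes pass the word search, but only the code snippet passes the phrase search, which is the restriction a quoted query is expected to apply.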

Hey EN devs - care to comment? This is pretty core functionality...

(I don't care if all platforms are not identical, but please give us assurance that at least one platform other than PC will soon be producing correct behaviour for this type of basic search.)

Link to comment

Hmmm. I did test a quoted (phrase) search. Word search would have produced a lot more hits.

Did a bit more testing using short words. Web client is definitely producing a lot more hits.

Hmmm. Something I read about dictionary pruning in Lucerne... Need to check. Soon...

Link to comment

Right. Lucene, not Lucerne. From the Evernote tech blog:

Finally, we perform a fairly elaborate pipeline of transformations on both note text and search expressions in order to normalize the representation of the text for correct comparisons. We have Lucene analysis filters that operate on the sequence of tokens to:
  • Remove apostrophes and other intra-word punctuation
  • Convert upper-case letters to lower case
  • Remove English “stop words” like “the” and “and”
  • Normalize letters with diacritics so that “ñ” becomes “n”
  • Convert “narrow width” Japanese characters to “full width”
  • Reorganize Chinese/Japanese/Korean text into pseudo “words”
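As a rough illustration of the first four steps in that list (this is not Evernote's code and does not use Lucene), a small Python sketch might look like the following. The stop-word set holds only the two words named above, the function names are invented for the example, and the Japanese and CJK steps are left out.

```python
import re
import unicodedata

# Assumption: only the two stop words named in the list above.
STOP_WORDS = {"the", "and"}

def strip_diacritics(text):
    """Decompose characters and drop combining marks, so "ñ" becomes "n"."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def analyze(text):
    """Normalize text into a list of comparable tokens."""
    # 1. Remove apostrophes that sit inside a word ("don't" -> "dont").
    text = re.sub(r"(?<=\w)['\u2019](?=\w)", "", text)
    # 2. Convert upper-case letters to lower case.
    tokens = re.findall(r"\w+", text.lower())
    # 3. Remove stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # 4. Normalize letters with diacritics.
    return [strip_diacritics(t) for t in tokens]

print(analyze("The Señor's Piñata and Cake"))
```

The same function would be applied to both the note text and the search expression, so that the two are compared in the same normalized form.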

Overall, we’ve been happy with the power and flexibility of Lucene. We have found, however, that it has become the most expensive software component in our shard infrastructure. On a busy shard, Lucene makes twice as many IO operations as MySQL, and those operations are less sequential. This means that it’s the top priority for future software optimizations, and also for future hardware tuning to ensure that our shards continue to scale well as we grow.

http://blog.evernote.com/tech/2011/08/25/lucene-explainin/

Link to comment

Even if it is a limitation of the Lucene technology, shouldn't it work the same across all platforms, or at least the major platforms of EN Web, EN Win, and EN Mac?

Link to comment

Yeah. This is a definite search inconsistency.

I am opening a support request for a full list of the English "stop words".

Link to comment

From Evernote Support

For your reference, here is the full list of "stop words" that should be working across all platforms:

"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"

Link to comment

The Windows client uses SQLite FTS3 (not Lucene) to maintain its index, and FTS3 has no native support for stop words. This is why the results on Windows differ from those on the server.
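A small Python sketch of an FTS3 phrase query, for anyone who wants to try this locally. It assumes the SQLite build behind Python's sqlite3 module has FTS3 enabled; the table name, column name and sample notes are made up, and this is not the Windows client's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-column full-text table using the default tokenizer.
conn.execute("CREATE VIRTUAL TABLE notes USING fts3(body)")
conn.executemany(
    "INSERT INTO notes(body) VALUES (?)",
    [
        ("If x > 0 Then y = 1 End If",),                         # rowid 1
        ("If you read to the end of the page, let me know.",),   # rowid 2
    ],
)

def search(query):
    """Return the rowids of notes matching an FTS3 query string."""
    rows = conn.execute(
        "SELECT rowid FROM notes WHERE body MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]

print(search("end if"))     # word search: both terms anywhere in the note
print(search('"end if"'))   # phrase search: terms adjacent and in order
```

Because the default tokenizer keeps "if" in the index, the quoted query can tell the code snippet apart from the ordinary sentence.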

Link to comment

Shouldn't the search for an exact phrase (in quotes) override the stop words?

The OP was searching for "End If".
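To see why the stop list matters for that query, here is a hedged Python sketch that applies the list Evernote Support posted to a search phrase. It assumes, as this thread suggests but nobody from Evernote has confirmed, that stop words are dropped even inside a quoted phrase; `analyze_query` is a made-up helper, not an Evernote or Lucene function.

```python
# Stop words as posted by Evernote Support earlier in this thread.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
    "that", "the", "their", "then", "there", "these", "they", "this",
    "to", "was", "will", "with",
}

def analyze_query(phrase):
    """Lower-case the phrase and drop stop words (assumed behaviour)."""
    return [word for word in phrase.lower().split() if word not in STOP_WORDS]

print(analyze_query("End If"))
print(analyze_query("american television"))
```

Under that assumption "End If" is reduced to the single term "end", which would match every note containing that word, while "american television" keeps both terms and can still be matched as a phrase.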

What about EN Mac, does the database it uses have any stop words?

Link to comment

Archived

This topic is now archived and is closed to further replies.
