Improving search function

galaxywarrior · August 5, 2018

Hello there, I've been using Evernote for two years and I really enjoy the app! Makes studying and reviewing notes a lot more convenient. Thank you!

I would like to make a suggestion about the search function. Most of the time I only remember the title of the note, so if I want to search for a note, I usually type the title in the search box. However, I may have to scroll through 20 notes to find the note that I'm looking for because every note that includes the term pops up in the search. This process is a little frustrating. It would be amazing if you could program Evernote so that when I search for a term, the note title that includes the search term shows up at the very top.

Thanks again.

gazumped · August 6, 2018

Hi. Try using "intitle:<keyword>" (without quotes) to see results from titles only. If you have multiple search terms, use "intitle:<keyword1> intitle:<keyword2>"

How to use Evernote's advanced search syntax
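
As a small illustration of the intitle: syntax suggested above, the sketch below assembles a title-only query from a list of keywords. The helper name `title_query` is hypothetical and is not part of any Evernote API; it only builds text that could be pasted into the search box.

```python
def title_query(keywords):
    """Build a search string that restricts every keyword to note titles."""
    # Each keyword gets its own intitle: prefix; terms are separated by spaces.
    # Empty strings are skipped so they do not produce a bare "intitle:".
    return " ".join(f"intitle:{word}" for word in keywords if word)


print(title_query(["biology", "midterm"]))
```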

jefito · August 6, 2018

19 hours ago, galaxywarrior said:

It would be amazing if you could program Evernote so that when I search for a term, the note title that includes the search term shows up at the very top.

This is generally not how Evernote sorts any result list. It finds the notes that match your search, and then applies the current sort criteria (e.g. by note title, created time, updated time, note author, size, etc.). It's a single sort only; there are no separate buckets for breaking out subclasses of the result set. You can, as @gazumped pointed out, break this particular search out on your own using 'intitle:'.
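
To make the difference concrete, here is a minimal sketch contrasting a single sort with the "title matches first" ordering that was requested. The `Note` class, the sample fields, and the plain substring matching are all assumptions made for illustration; this is not how Evernote itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str
    updated: int  # hypothetical timestamp, larger means more recent


def search(notes, term):
    """Find matching notes (simplified to a case-insensitive substring test)."""
    term = term.lower()
    return [n for n in notes if term in n.title.lower() or term in n.body.lower()]


def single_sort(results):
    """One global sort criterion applied to the whole result set."""
    return sorted(results, key=lambda n: n.updated, reverse=True)


def title_first(results, term):
    """Two buckets: title matches first, then everything else, newest first in each."""
    term = term.lower()
    # False sorts before True, so notes whose title contains the term come first.
    return sorted(results, key=lambda n: (term not in n.title.lower(), -n.updated))
```

With the single sort, a note that only mentions the term in its body can land above a note with the term in its title; the second function adds the extra bucket described in the original suggestion.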

Elliot On Evernote · August 7, 2018

Thanks for posting the question. I also love Evernote but have generally had a hard time with searching for notes. The search results are always displayed in a less relevant order than I'm expecting. I kind of wish it were more similar to Google search. I think I need to get better at Evernote search because even if everything is possible, it doesn't seem as intuitive.

gazumped · August 7, 2018

13 hours ago, Elliot On Evernote said:

even if everything is possible, it doesn't seem as intuitive.

Have to agree with you there - but things work well if you're prepared to learn the right terms to use. I got trained in how to search databases long before Evernote came along, and the way things work here seems pretty standard and works well for me.

jefito · August 7, 2018

13 minutes ago, gazumped said:

I got trained in how to search databases long before Evernote came along, and the way things work here seems pretty standard and works well for me.

Except in database query languages like SQL, you can specify the search and the sort in the same query. For Evernote, sorting is not specifiable in the search language, and it's largely a global setting.
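
For comparison, a small sketch of what "search and sort in the same query" looks like in SQL, run here through Python's built-in sqlite3 module. The `notes` table, its columns, and the sample rows are invented for the example and have nothing to do with Evernote's actual storage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (title TEXT, body TEXT, updated INTEGER)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?, ?)",
    [
        ("Groceries", "buy tax forms", 3),
        ("Tax 2018", "receipts", 1),
        ("Holiday", "photos", 4),
        ("Misc", "tax", 2),
    ],
)


def find_notes(term):
    """Filter (WHERE) and order (ORDER BY) in one statement."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM notes "
        "WHERE title LIKE ? OR body LIKE ? "
        "ORDER BY updated DESC",
        (pattern, pattern),
    )
    return [title for (title,) in rows]


print(find_notes("tax"))
```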

DTLow · August 7, 2018

14 hours ago, Elliot On Evernote said:

The search results are always displayed in a less relevant order than I'm expecting.

The Mac platform actually has a "relevant" sequence, based on some algorithm that may or may not match what you or I are expecting.
Actually I may have different expectations than you.

I'm more comfortable with a defined sequence.

Don Dz · August 7, 2018

15 hours ago, Elliot On Evernote said:

I think I need to get better at Evernote search because even if everything is possible, it doesn't seem as intuitive.

For me at least, the most important thing I ever learned about Evernote should have been the very first thing mentioned in the help info, since search is the main selling point:

Evernote does not search for strings of characters in random locations; they must always be at the beginning of a word, otherwise they will not be found (inside of notes at least; tags have no such limitation, weird).
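
A rough sketch of that behavior, for anyone who wants to see the difference: the first function only matches at the start of a word, the second matches anywhere. Splitting words with the regular expression `\w+` is an assumption made for the example; Evernote's real tokenizer may split text differently.

```python
import re


def words(text):
    """Split text into lowercase words (assumed tokenization, for illustration)."""
    return re.findall(r"\w+", text.lower())


def prefix_match(text, query):
    """True if some word in the text starts with the query."""
    q = query.lower()
    return any(w.startswith(q) for w in words(text))


def infix_match(text, query):
    """True if the query appears anywhere in the text, even mid-word."""
    return query.lower() in text.lower()


note = "Evernote search tips"
print(prefix_match(note, "ever"), prefix_match(note, "note"), infix_match(note, "note"))
```

Here "ever" is found because it begins the word "Evernote", while "note" is only found by the infix version.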

I did not realize this till well invested in the system. Had to do a lot of re-editing to make sure my notes could be found, based on keywords and tags in particular.

Now I do not move a note out of my Inboxes till this issue is properly clarified, on every new note, annoying but necessary.

rezecib · August 7, 2018

@galaxywarrior search relevance is something we're actively working on. As pointed out by others, the Mac client has relevance, but it should be coming to the beta web client too. It will be a little fancier than just prioritizing titles, but that's definitely a part of it.

2 hours ago, Don Dz said:

Evernote does not search for strings of characters in random locations; they must always be at the beginning of a word, otherwise they will not be found (inside of notes at least; tags have no such limitation, weird).

As far as I know, any case where you can do infix search (like "*term*") is handled client-side. So like the client has already acquired the list of tags you have and may do a local infix search through them to find the tag, and then actually search by the tag's guid.

I would like to be able to support infix search, but when you're dealing with a large corpus of data (and searching across all note content in an account can definitely count there), infix search has pretty serious scalability issues. Like it's usually fast to do an arbitrary regex on a single document, but 10,000 of them and it starts to be super slow.

In a scalable search system (i.e. one with an index), the way a search works is:

1. Take the user input and break it up into tokens, joined by some operator (in our case, AND)
2. If a token isn't set to a particular field (like "title", "tag", etc), set it to the "all" field
3. If a token has wildcards, check the term dictionary for that field to find terms that match, and expand the token into a search for each of those joined by "OR"
4. Now find the list of documents matching each of those terms by looking in each term's index
5. Join the list of documents and rank them (such as by term frequency)
6. Return the ranked list of documents
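
The steps above can be sketched in a few lines. This is a toy model under stated assumptions, not Evernote's server code: the `TinyIndex` class, the `FIELD_ALIASES` mapping, whitespace tokenization, and the summed term-frequency score are all invented for illustration, and the wildcard expansion scans the dictionary linearly for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from a query prefix to an indexed field.
FIELD_ALIASES = {"intitle": "title"}


class TinyIndex:
    def __init__(self):
        # field -> term -> Counter({note_id: term frequency})
        self.postings = defaultdict(lambda: defaultdict(Counter))

    def add(self, note_id, title, body):
        """Index a note into a "title" field and a catch-all "all" field."""
        for field, text in (("title", title), ("all", f"{title} {body}")):
            for term in text.lower().split():
                self.postings[field][term][note_id] += 1

    def search(self, query):
        ranked = None
        for token in query.lower().split():            # break input into tokens
            prefix, _, term = token.rpartition(":")
            field = FIELD_ALIASES.get(prefix, "all")   # default to the "all" field
            dictionary = self.postings[field]
            if term.endswith("*"):                     # expand a trailing wildcard
                stem = term[:-1]
                expanded = [t for t in dictionary if t.startswith(stem)]
            else:
                expanded = [term] if term in dictionary else []
            hits = Counter()                           # OR over the expanded terms
            for t in expanded:
                hits.update(dictionary[t])
            if ranked is None:                         # AND across tokens
                ranked = hits
            else:
                both = ranked.keys() & hits.keys()
                ranked = Counter({n: ranked[n] + hits[n] for n in both})
        ranked = ranked or Counter()
        # Rank by summed term frequency, highest first; ties broken by note id.
        return [n for n, _ in sorted(ranked.items(), key=lambda kv: (-kv[1], kv[0]))]
```

For example, after indexing a few notes, `search("tax* forms")` returns only the notes that contain both a word starting with "tax" and the word "forms".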

For the wildcard-expansion step, a final wildcard is pretty efficient-- you can binary-search to find the region of terms that will match (or find that there are none), and then just scan from there until you've got the whole clump of them, because they will all be together in sorted order. A leading wildcard, however, doesn't guarantee that they'll be together, so you have to scan the whole dictionary. (I have some crazy ideas about how to make this more efficient, but suffice it to say that it's annoying/expensive enough that you'd really need solid grounds to believe that people would use it.) If you have both wildcards, then... that's an even more complicated problem to solve.
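
A minimal sketch of that asymmetry, assuming the term dictionary is just a sorted Python list (a simplification; the function names are made up for the example). The trailing-wildcard case uses `bisect_left` to jump to the first candidate and stops as soon as the prefix no longer matches; the leading-wildcard case has no choice but to look at every term.

```python
from bisect import bisect_left


def expand_trailing(sorted_terms, prefix):
    """Terms matching prefix*: binary search to the start, then scan the clump."""
    i = bisect_left(sorted_terms, prefix)
    matches = []
    # All terms sharing the prefix are adjacent in sorted order.
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def expand_leading(sorted_terms, suffix):
    """Terms matching *suffix: matches are scattered, so scan everything."""
    return [t for t in sorted_terms if t.endswith(suffix)]


terms = sorted(["ever", "evernote", "note", "notebook", "keynote", "search"])
print(expand_trailing(terms, "note"))
print(expand_leading(terms, "note"))
```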

Because final wildcards are so easy, and it's pretty common to want that sort of expansion, we add those automatically. So searching "hello" will be automatically converted to "hello*".
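
As a rough illustration of that rewrite (the function name is hypothetical, and real queries with quoted phrases or operators would need more care than whitespace splitting):

```python
def add_trailing_wildcards(query):
    """Append * to every whitespace-separated token that does not already end in one."""
    return " ".join(t if t.endswith("*") else t + "*" for t in query.split())


print(add_trailing_wildcards("hello"))
```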

jefito · August 8, 2018

2 hours ago, rezecib said:

search relevance is something we're actively working on. As pointed out by others, the Mac client has relevance, but it should be coming to the beta web client too. It will be a little fancier than just prioritizing titles, but that's definitely a part of it.

Wondering whether there will be a way to tune this per user, or if it's all going to be automagical...

Nice explanation on the leading wildcard situation. Thanks for that. Can I take it as a given that each note is indexed by breaking it into tokens and storing them in a sorted list (presumably with location information for each instance); binary search doesn't make sense otherwise. Maybe join the individual note indexes into larger aggregations? Not exactly my area of expertise. In any case, it still would be handy to have leading *, and I'd guess that there are a lot of people who wouldn't mind some slowdown to have it. Or maybe a special search term (e.g., "full:") to indicate that that's desired? I suppose the presence of an infix term might do, though that would have a surprise factor for an unsuspecting user...

Don Dz · August 8, 2018

1 hour ago, rezecib said:

I would like to be able to support infix search, but when you're dealing with a large corpus of data (and searching across all note content in an account can definitely count there), infix search has pretty serious scalability issues. Like it's usually fast to do an arbitrary regex on a single document, but 10,000 of them and it starts to be super slow.

I definitely understand, from time to time I have observed some operations where Evernote gets stuck, then eventually I realize it's not frozen, just doing something massive.

But couldn't infix search be offered as a separate command, with a warning like "this operation can take a long time, consider trying regular search", or something like that?

I would certainly be willing to take my chances just to be sure I did not misplace my data. I can afford to wait as long as it takes for a long search from time to time, and I am sure others feel the same way.

rezecib · August 8, 2018

17 hours ago, jefito said:

Can I take it as a given that each note is indexed by breaking it into tokens and storing them in a sorted list (presumably with location information for each instance); binary search doesn't make sense otherwise.

I glossed over a little of the complexity there-- technically the terms dictionary is stored as a prefix-tree of blocks, where blocks are a fixed size. So it can navigate the prefix tree a little faster than a binary search, but terms that share that prefix will all be in the same block.

But there are actually two separate structures at play there, there's the term dictionary and then there are inverted indexes for each term. But yeah, a note is indexed by setting up fields, the main text ones being tokenized and then inserted into the dictionary and put into the inverted indexes for those terms.
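
A simplified sketch of the two structures, under loud assumptions: the term dictionary here is a plain character-by-character prefix tree rather than a tree of fixed-size blocks, the "inverted index" for each term is just a set of note ids, and the class and function names are invented for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.postings = None  # set of note ids if a complete term ends here


class TermDictionary:
    def __init__(self):
        self.root = TrieNode()

    def add(self, term, note_id):
        """Insert a term and record the note in that term's postings."""
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        if node.postings is None:
            node.postings = set()
        node.postings.add(note_id)

    def with_prefix(self, prefix):
        """Return {term: postings} for every term that starts with prefix."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return {}
        found = {}
        stack = [(prefix, node)]
        while stack:  # every term under this node shares the prefix
            term, current = stack.pop()
            if current.postings is not None:
                found[term] = current.postings
            for ch, child in current.children.items():
                stack.append((term + ch, child))
        return found


def index_note(dictionary, note_id, text):
    """Tokenize a text field (whitespace split, assumed) and index each term."""
    for term in text.lower().split():
        dictionary.add(term, note_id)
```

Walking down the tree by prefix reaches all matching terms without touching unrelated ones, which is the property that makes trailing wildcards cheap.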

17 hours ago, Don Dz said:

But couldn't infix search be offered as a separate command, with a warning like "this operation can take a long time, consider trying regular search", or something like that?

That makes sense for a local search, but becomes a bit more dubious for server-side search (that's what I work on), because expensive queries don't just affect you. So the natural thing that occurs is "well, let's prevent you from doing it too much"-- this touches on rate limiting and fairness, which are surprisingly complicated in a distributed system (they're not unsolvable, but they're tricky enough that you can mess them up pretty easily). Local searches are a bit of a hairy beast as well, but that's because the different clients have separate local search implementations, much to my frustration. So... in both cases yes, it could be done, but it's not as easy as it should be.
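
To give a flavor of the "prevent you from doing it too much" idea, here is a single-process token bucket sketch. It is a generic textbook technique, not a description of Evernote's servers, and it deliberately ignores the distributed part that makes the real problem hard; the class name, parameters, and injectable clock are assumptions for the example.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second` tokens."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the behavior can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        """Spend `cost` tokens if available; expensive queries can cost more."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

An expensive infix search could be charged a higher `cost` than a regular search, so a user can run one occasionally without starving everyone else.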

Don Dz · August 8, 2018

4 minutes ago, rezecib said:

That makes sense for a local search, but becomes a bit more dubious for server-side search

I understand, local search is all I was contemplating.

jefito · August 8, 2018

49 minutes ago, rezecib said:

I glossed over a little of the complexity there-- technically the terms dictionary is stored as a prefix-tree of blocks, where blocks are a fixed size. So it can navigate the prefix tree a little faster than a binary search, but terms that share that prefix will all be in the same block.

But there are actually two separate structures at play there, there's the term dictionary and then there are inverted indexes for each term. But yeah, a note is indexed by setting up fields, the main text ones being tokenized and then inserted into the dictionary and put into the inverted indexes for those terms.

No problem simplifying things here. I'm sure your boss would prefer that you not spend a lot of time writing books in the forums.

But yeah, I was kinda aware that there was something like that under the hood, because of some lag after typing several characters, while the search results catch up, in the Windows client (which doesn't seem to be so bad these days, maybe due to better hardware, database on the SSD, or one of those mysterious settings or something). That and it makes sense. A quick, relatively memory spare, lookup to partition the notes, then roll in the heavier guns on the smaller result set. I've implemented stuff like that before, in the small, for tokenization in parsers where the lookup was just the first character, but it's not a big stretch to expand that to say, 4 characters.
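
A small sketch of that partition-then-refine idea, with invented names: terms are bucketed by their first `width` characters, a lookup jumps straight to one bucket, and only that smaller candidate list gets the full comparison.

```python
from collections import defaultdict


def build_buckets(terms, width=1):
    """Partition terms by their first `width` characters."""
    buckets = defaultdict(list)
    for term in terms:
        buckets[term[:width]].append(term)
    return buckets


def lookup(buckets, query, width=1):
    """Cheap bucket lookup first, then a full prefix check on the candidates."""
    if len(query) < width:
        # Query is shorter than the bucket key, so every bucket is a candidate.
        candidates = [t for bucket in buckets.values() for t in bucket]
    else:
        candidates = buckets.get(query[:width], [])
    return [t for t in candidates if t.startswith(query)]


terms = ["search", "sort", "select", "tag", "title"]
print(lookup(build_buckets(terms), "se"))
```

Widening the key from one character to four makes each bucket smaller at the cost of more buckets.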

Anyhow, I appreciate the insights. Thanks for taking the time...
