Leigh Riffel

Search Algorithm Suggestion

11 June 2012 - 05:18 PM

I suggest that the Evernote search algorithm be tweaked in three ways.
  • Weight a note higher when its title contains all of the search terms, so that notes whose titles contain none of the terms do not come first.
  • Discount term counts by the length of the note, so that longer notes do not dominate the search results.
  • Weight a document higher when the search terms appear close together, so that documents with the terms scattered throughout do not supersede the more likely candidates.
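To make the three suggestions concrete, here is a rough sketch of how they could combine into one score. This is only an illustration, not how Evernote's search actually works: the function names (`tokenize`, `min_window`, `score`), the boost values, and the way the factors are multiplied together are all my own assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_window(tokens, terms):
    """Return the length (in tokens) of the shortest span that contains
    every search term at least once, or None if a term is missing."""
    needed = set(terms)
    last_seen = {}
    best = None
    for i, tok in enumerate(tokens):
        if tok in needed:
            last_seen[tok] = i
            if len(last_seen) == len(needed):
                # Shortest span ending at i that covers all terms.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def score(title, body, query, title_boost=2.0, proximity_boost=1.0):
    """Hypothetical relevance score; the boost values are arbitrary."""
    terms = set(tokenize(query))
    title_tokens = tokenize(title)
    tokens = title_tokens + tokenize(body)
    if not terms or not tokens:
        return 0.0

    # Suggestion 2: divide hits by note length so long notes do not win
    # on raw counts alone.
    hits = sum(1 for tok in tokens if tok in terms)
    density = hits / len(tokens)

    # Suggestion 1: boost notes whose title contains every search term.
    title_factor = title_boost if terms.issubset(title_tokens) else 1.0

    # Suggestion 3: reward notes where the terms sit close together.
    # Adjacent terms give the largest bonus; scattered terms almost none.
    window = min_window(tokens, terms)
    proximity = 0.0 if window is None else proximity_boost * len(terms) / window

    return density * title_factor * (1.0 + proximity)
```

With a score like this, a short note titled "Change Password" would rank above a long note that merely mentions "change" and "password" somewhere far apart, which is the behavior I am asking for.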

These suggestions are based on my day-to-day use of Evernote, but here is a specific example. When I search for "Change Password" (without quotes) in Evernote, these are some of the results, in the order returned.

1. An 833 page PDF that does not contain "Change Password", but contains "Change" 338x and "Password" 55x.
2. A 116 line, 8005 character note that does not contain "Change Password", but contains "Change" 7x and "Password" 3x.
3. A 56 line, 1919 character note that does not contain "Change Password", but contains "Change" 2x and "Password" 1x.
4. A 1334 line, 126,619 character note that does not contain "Change Password", but contains "Change" 16x and "Password" 7x.
5. A 1651 line, 100,080 character note that does not contain "Change Password", but contains "Change" 14x and "Password" 31x.
6. A 1616 line, 95,046 character note that does not contain "Change Password", but contains "Change" 18x and "Password" 13x.
7. A 9 line, 445 character note titled "Change Password" and containing "Change" 5x and "Password" 4x.
...
23. A 15 line, 564 character note titled "Change Database Computer Password" and containing "Change" 7x and "Password" 6x.

In my mind, #7 and #23 should both come before #2 through #6. They are the only results with both search terms in the title and, although they are short, they mention the search terms far more often relative to their length. Their search terms also appear much closer together than in the other documents.
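The length argument can be checked against the numbers listed above. The short script below is just a sanity check, not a proposed ranking: it takes the character counts and term counts exactly as reported for results 2 through 7 and 23 (the PDF is left out because only its page count is known) and sorts them by hits per 1,000 characters, a measure I picked for illustration.

```python
# (result number, characters, "Change" count, "Password" count),
# copied from the result list in this post.
notes = [
    (2, 8005, 7, 3),
    (3, 1919, 2, 1),
    (4, 126619, 16, 7),
    (5, 100080, 14, 31),
    (6, 95046, 18, 13),
    (7, 445, 5, 4),
    (23, 564, 7, 6),
]


def hits_per_1000_chars(chars, change, password):
    """Search-term occurrences per 1,000 characters of note text."""
    return 1000 * (change + password) / chars


# Highest density first.
by_density = sorted(notes, key=lambda n: hits_per_1000_chars(*n[1:]), reverse=True)

for number, chars, change, password in by_density:
    density = hits_per_1000_chars(chars, change, password)
    print(f"#{number}: {density:.2f} hits per 1,000 characters")
```

Sorted this way, #23 and #7 come out on top at roughly 23 and 20 hits per 1,000 characters, more than ten times the density of the next result (#3).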