krisf, posted April 11, 2019: Hello, since search results are poisoned by false positives caused by crappy OCR, I want to disable OCR in order to get more accurate searches. Is there a way to do this? Thanks
DTLow (Level 5*), posted April 11, 2019: OCR is there for text search purposes. I find it useful to limit my searches to the note title using the intitle: operator, e.g. intitle:aaaaaaaa
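As an illustration of the title-only search mentioned above, here is a minimal Python sketch that builds such a search string. The helper name `intitle` is hypothetical, and wrapping multi-word terms in double quotes is an assumption about how phrases are passed to the operator; only the `intitle:` operator itself comes from the thread.

```python
def intitle(term: str) -> str:
    """Build an Evernote search term that is restricted to note titles.

    Hypothetical helper for illustration; it only formats a string and
    does not talk to Evernote.
    """
    term = term.strip()
    if not term:
        raise ValueError("search term must not be empty")
    # Assumption: a multi-word phrase is wrapped in double quotes.
    if any(ch.isspace() for ch in term):
        return f'intitle:"{term}"'
    return f"intitle:{term}"


if __name__ == "__main__":
    print(intitle("invoice"))
    print(intitle("tax return"))
```

The resulting string would be pasted into the Evernote search box; text recognised by OCR inside attachments is not part of the title, so it should not contribute matches to a title-only search.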
krisf (Author), posted April 11, 2019: Of course, but one cannot put all relevant keywords in the title.
PinkElephant (Level 5), posted April 11, 2019: Which OCR do you regard as crappy: the one created before a file (PDF?) was loaded into EN, or the OCR done by EN on its servers? These are two completely separate ways of OCRing, and if the problems arise in one process, it makes no sense to fix the other. To my knowledge, if a document was OCRed before it was loaded into EN, it will not be OCRed again on the servers. So if you do your own OCR before you create the note in EN, maybe switch that OCR off and let EN OCR these files on the server. You can create two PDFs from the same stack of paper, one OCRed locally and one by EN, do this for 10-20 documents you typically scan, and check which works better. In my experience, if the scan is "clean" in terms of quality and language setting, the server-based OCR does a very good job. You need a pretty good local OCR platform to match it.
krisf (Author), posted April 12, 2019: The thing is, OCR will never be 100% free of errors, so I just don't want to include it in my index. But apparently this is not possible in Evernote yet. Oh well.
Dave-in-Decatur (Level 5), posted April 12, 2019: The only way to do this, I think, would be to drop back from Premium to Basic, since OCR is a Premium feature. But then of course you would lose Premium features that you might want to keep.
gazumped (Level 5*), posted April 15, 2019: On 4/11/2019 at 3:52 PM, krisf said: "One cannot put all relevant key words in the title." I operate on the basis of 'smart' titles and searches. The titles include the date, type, source, and (some) keywords, and I refine search results by editing titles, adding tags and saving 'exact' searches (that include the results I need, and only the results I need) where necessary. I've worked in various industries that use BIG databases, and the abiding lesson is: there's no 'perfect' index. No matter how carefully you manage the content, a combination of entry errors, omissions and the mulish variety of content means that you'll always have too many or too few results. Using a database means being familiar with the search grammar so you can ask very specific questions, and tweaking the content frequently so you get good answers. I do have searches that start out general and then exclude keywords, iteration by iteration, until I get to an acceptable level of accuracy, but then I add a new tag to those notes so that I can find them in future. (My tag list lives in Workflowy as well as Evernote so I can view and review it in a more user-friendly format than Evernote currently allows.) My data may be unusually specific, but I have getting on for 46k notes, and finding all my stuff is still relatively easy.
krisf (Author), posted April 15, 2019: 'smart titles' imply that you know in advance what search words you are gonna use. When I'm composing a note, I don't want to think about how I want that note to be findable. That's something the search engine should take care of for me. You need to use 'smart titles' because the Evernote search engine is not smart.
gazumped (Level 5*), posted April 15, 2019: 42 minutes ago, krisf said: "'smart titles' imply that you know in advance what search words you are gonna use." I don't know what sort of content you're saving, but mine is emails / correspondence / project stuff / photo locations / tech tips / receipts, so pretty varied. I search for things like receipts and project stuff, so 'everything within <these dates> that has "receipt" in the title' will get me a broad cross section; that, plus 'Macdonalds in the title', gets me my junk food exposure. 'Everything tagged <projectnumber>' gets me a list of content I can pick from, or narrow down further. YMMV, and I'm just sayin', but Evernote's search works well for me. The searches above obviously look rather different in Evernote grammar; I'm just rendering the intent of each one.
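For readers wondering what those intents might look like in Evernote's search grammar, here is a rough Python sketch. It assumes the `intitle:`, `tag:` and `created:YYYYMMDD` operators and the leading `-` for negation; the function names, the date range and the tag `P1234` are hypothetical and do not come from the post.

```python
from datetime import date
from typing import Optional


def created_between(start: date, end: date) -> str:
    """Match notes created on or after `start` and before `end`.

    Assumption: `created:D` matches notes created on or after D, so
    negating it with `-` gives the upper bound.
    """
    return f"created:{start:%Y%m%d} -created:{end:%Y%m%d}"


def receipts_query(start: date, end: date, vendor: Optional[str] = None) -> str:
    """'Everything within <these dates> that has "receipt" in the title'."""
    parts = ["intitle:receipt", created_between(start, end)]
    if vendor:
        # Narrow further to titles that also mention the vendor.
        parts.append(f"intitle:{vendor}")
    return " ".join(parts)


def project_query(project_tag: str) -> str:
    """'Everything tagged <projectnumber>'."""
    return f"tag:{project_tag}"


if __name__ == "__main__":
    q1 = date(2019, 1, 1), date(2019, 4, 1)  # hypothetical date range
    print(receipts_query(*q1))
    print(receipts_query(*q1, vendor="Macdonalds"))
    print(project_query("P1234"))  # hypothetical project tag
```

Each printed string is meant to be typed into the search box or stored as a saved search, which matches the "saving 'exact' searches" habit described earlier in the thread.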
SMCB, posted April 15, 2019: That's a great idea. Thanks so much for your help!
CalS (Level 5*), posted April 15, 2019: On 4/11/2019 at 1:58 AM, krisf said: "I want to disable OCR in order to get more accurate searches. Is there a way to do this?" Per the above, not specifically. As a workaround, if there is a particularly problematic document type, you can minus that type out of your search results. For example, -resource:image/png will eliminate any notes with PNG images from the search results (for JPEGs, use the MIME type image/jpeg). These two document types, if any, are the ones that will fail OCR in my use case. You can use a text expander to hotkey it into your search. Of course, this also eliminates notes that have the search term in the note text if they contain an image of that type.
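The exclusion workaround above can be sketched in Python as a small helper that appends `-resource:` terms to an existing search. The function name `without_resource_types` and the example search word `insurance` are hypothetical; the only grammar assumed is the `-resource:<MIME type>` form quoted in the post.

```python
from typing import Iterable


def without_resource_types(query: str, mime_types: Iterable[str]) -> str:
    """Append one `-resource:<mime>` exclusion per MIME type to `query`.

    Notes that contain an attachment of an excluded type are dropped
    from the results entirely, including notes whose own text matches.
    """
    exclusions = " ".join(f"-resource:{mime}" for mime in mime_types)
    return f"{query.strip()} {exclusions}".strip()


if __name__ == "__main__":
    # Exclude notes carrying PNG or JPEG images (image/jpeg is the
    # standard MIME type for JPEG files).
    print(without_resource_types("insurance", ["image/png", "image/jpeg"]))
```

This plays the same role as the text-expander shortcut: the exclusion suffix is generated once and reused with whatever search words are needed.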
krisf (Author), posted April 16, 2019: 13 hours ago, CalS said: "For example -resource:image/png will eliminate any notes with png images from the search results" The -resource:image/* is indeed a usable workaround. Thanks for all your replies, everyone. Cheers
CalS (Level 5*), posted April 16, 2019: You are welcome.
This topic is now archived and is closed to further replies.