
Scanning in documents and database size


tavor


  • Level 5*

I currently have ~500 notes and the .exb file size is ~115mb. 

 

I've used the document camera for the occasional document scan. I'm contemplating pushing a bit more into going paperless and using a scanner to get a lot more paperwork into EN.

 

The one thing holding me back is the potential issues associated with database size - program responsiveness, slow syncing, etc.

 

For those who do a lot of scanning of documents, do you find it bloats the exb file and does that cause you any issues?

 

Any tips on economical scanning - i.e., scanning (with associated EN OCR benefit) but minimizing the associated EN file size bloat?

Link to comment

I do know that after a certain point, Evernote becomes less and less responsive. IDK where that point is. I also don't know if it's the number of notes or the database size, but I'd guess it's a combination of both. FWIW, here's my current setup. My working/duet account currently has ~3,000 notes and functions well on my Windows desktop, iPad 2 and iPhone 4.

http://discussion.evernote.com/topic/53078-free-account-in-addition-to-premium-account/

Grumpymonkey has researched this issue more than I have, so he may be able to provide more information.

Link to comment
  • Level 5

I hit the "wall" when I upgraded my Evernote Windows client from version 4.7 to 5.0 in December. I was approaching 30,000 notes (6GB exb file). Many scalability issues arose - slow performance, crashing, not responding. So I bought a second premium Evernote account and started transferring older notes, but ran into a nasty unsolvable problem: my tags (hierarchical parent / child format) had their names changed and were promoted to the parent level. I am now moving my important notes (mainly scanned PDF documents) over to OneDrive.

Link to comment
  • Level 5*

17,000 notes and 12GB exb size and no major issues here - things do run faster when you have a limited number of notes,  but once it becomes a problem there's housecleaning you can do to keep things within bounds.  Free accounts are (sorry) freely available so you can always hive off some notes into archive or library storage if you need to,  and I'm a bit of a PC efficiency geek so my hard drive gets housecleaning every so often to remove bloat and other irrelevancies which may help.

 

I scan to folder and OCR my own files with Adobe 9.0 and have found settings which seem to do wonders for file size. I believe it's to do with the fact that the scans are essentially pictures,  but once OCR'd the 300dpi pics are replaced wherever possible by ASCII characters,  which occupy much less space.  A 6MB PDF scan will typically wind up at 1MB or below after OCR.

 

Personally I follow the paper-hating theory that if it folds, stick it in the scanner; if it's too big or too stiff, take a picture. And if you possibly can after that, bin the original. I've converted a one-room research and reference library into a portable hard drive (hence the 17k notes) and I'm currently fighting off these bits of paper that 1) are too obscure or complicated to just dump or 2) people out there keep on sending me, darnit!

 

Best advice about scanning / referencing / OCR etc is (to misquote the great GTD Guru,  Yoda)  "There is no try,  there is no plan.  Start doing!"

 

You're probably going to find a few 'better' ways to file your stuff once you've done the first 100 scans. Take a deep breath and rename / retag / renotebook them as required. You might have to do that a couple of times. But you'll quickly get a rhythm going that works. Then you'll begin to see that mythical 'empty desk space' that we all dream about...

Link to comment
  • Level 5*

The one thing holding me back is the potential issues associated with database size - program responsiveness, slow syncing, etc.

 

This is clearly a problem that Evernote MUST solve if they really expect to be the "100 Year Company" that they claim to be. 

 

Evernote markets their product as your "external brain" to store all your stuff for life in.

So, clearly Evernote MUST be scalable or it will ultimately fail.  If I can't store all my stuff in one Evernote database, or at least all available for Search from one Evernote Account/Client, and Search EVERYTHING quickly, then it will cease to be a viable solution for me, and I suspect many others.

 

As more and more long-time Evernote users like JBenson2 run into the same limitations/roadblocks, Evernote will receive such bad press that people will no longer want to begin a long journey using Evernote.

Link to comment
  • Level 5*

I don't think I have an ideal solution, but it works for me.

http://www.christopher-mayo.com/?p=127

 

To meet the needs of my use case, there are a couple of things Evernote needs to do:

 

(1) Provide us with the ability to have encrypted notebooks.

(2) Provide us with the ability to have selective sync (as we have on mobile)

 

#1 addresses the problem of having your stuff in multiple locations (private/sensitive stuff in a local notebook or outside of Evernote, and everything else in Evernote). This might seem unrelated, but from my perspective, if I have to have one important document outside of Evernote, I might as well have all of them (many thousands) outside of it. Sure, I could encrypt each of them one by one, but I won't, because there isn't much point (see #). I talk about my concerns and solutions a bit more here.

http://www.christopher-mayo.com/?p=1605

 

#2 addresses the specific issues you raised. I think Evernote has improved a lot in recent months (on the Mac), but there is still room for improvement. I am occasionally left in the lurch by mysterious beachballs (a Mac thing that tells you it is thinking). Processing can still take a lot of time for major changes like a tag name that affects hundreds or thousands of notes. However, things are getting better. 

 

Frankly speaking, if we had selective sync, I probably would never experience lag again, because I have very little available space on my MacBook Air and I'd keep it all in the cloud. The app is ready for some large databases now, but on the backend we still need to get to the point where we can have selective sync.

 

My needs may simply not fit in with Evernote's plans, and that is cool with me, because I think Evernote obviously offers a lot to people even without #1 and #2. I don't expect them to cater to everyone's use case. Personally, I think these features are critical ones that are long overdue...but, that is just my personal (and very selfish) opinion :)

Link to comment
  • 2 months later...

I thought I'd add my 2 cents. I found this thread because I ran across an article about manually backing up your Evernote database. (Not a bad idea. I highly doubt Evernote would ever have a catastrophic failure, and premium users can roll back to previous note versions, thus protecting me from becoming my own worst enemy.) I noticed that my database file (EXB) is 13 GB with just over 4000 notes. So I got curious as to how other users compared in usage. From reading several threads I think I'm on the higher end, but I would bet there are some extraordinary cases out there. After reading a couple of threads, I have a couple of observations:

 

- I have had zero performance issues. Search is remarkably fast. My computer is going on 5 years old. Nothing crazy

- I have a feeling I could cut my database down by 1/4 or even cut it in half if I went through all my PDFs and let Acrobat 9 work its magic. I'm blown away by what it can do with file size with little or no discernible difference in quality. (The "Optimize scanned PDF" option is some sort of sorcery.)

- Evernote should let users check an option to do a "super optimization" of our docs.  It could be a premium user feature and allow rollback to old versions if the results were unsatisfactory, but judging from what Acrobat can do, I think that would be rare. They could probably even do some cool stuff with photos. In my line of work, we take thousands upon thousands of before and after photos of work completed. Our photos just need to be web quality...very small.  Most cameras are overkill.  It would be nice to load'em up in EN and have it compress them. I guess if EN is keeping the "rollback" version it wouldn't help them save space, but it would help a lot of users save space and bandwidth, especially if selective syncing on the desktop becomes an option. 

Link to comment
  • Level 5*

Thanks for commenting.  Nice to see a good experience - I guess them that's happy tend not to bother sharing (present company,  obviously,  excepted),  while those who aren't need to vent more.  I'd only take you up on one point - the "not a bad idea" backup.  I have this insanely paranoid view that my data is mine,  darnit.  (You can imagine the Gollum voice here if you'd like..  "my precious data...")  No-one is going to take totally good care of it except me.   So I have backups.  In case I find a corrupted but vital note in my current setup that turns out to have been synced around all the devices I use,  and not retrievable by stepping the content back,  because the actual content is bad in some way.  It's never happened,  and it might never.  But it takes a few minutes to copy my EXB file to an external drive and I'm proof against one more imaginary disaster...

 

Of course if you back up your desktop, you probably back up the data file automatically - but still one more eensy weensy 13GB backup...

 

Entirely your choice,  and I may be over-reacting just a tad.  But million to one chances do happen!
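
For anyone who wants to script that manual copy, here is a minimal sketch. It is only an illustration, not an Evernote feature: the function name `backup_exb` and both paths in the usage comment are made up, and it is probably safest to quit Evernote before copying so the file isn't being written to mid-copy.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_exb(exb_path: Path, backup_dir: Path) -> Path:
    """Copy the database file into backup_dir under a timestamped name.

    Hypothetical helper: it simply copies one file, nothing Evernote-specific.
    """
    if not exb_path.is_file():
        raise FileNotFoundError(f"No database file at {exb_path}")

    backup_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the name keeps older backups instead of overwriting them.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{exb_path.stem}-{stamp}{exb_path.suffix}"

    shutil.copy2(exb_path, target)  # copy2 also preserves file timestamps

    # Cheap sanity check: the copy should be the same size as the original.
    if target.stat().st_size != exb_path.stat().st_size:
        raise OSError(f"Backup {target} looks incomplete")
    return target


# Example call (both paths are placeholders, adjust to your own setup):
# backup_exb(Path("C:/path/to/your-account.exb"), Path("E:/evernote-backups"))
```

Since each run keeps its own timestamped copy, old backups of a 13GB file will pile up quickly and need clearing out by hand.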

Link to comment
  • Level 5*

To the OP, I have ~25,000 notes in a ~14GB database with 8 notebooks. I have an SSD and 8GB of memory on my PC, which is a couple of years old. I only notice a lag, 5 seconds or so, when I select a context which has a large number of notes, either all notes or the biggest notebook. Notebooks with fewer than 4000 notes don't have a lag at all. And the searches once the context is presented don't have a lag either. Don't know that you can apply this to all users but it is my experience. Also, not scientific, but the lag appeared with V5, which is probably 5000 notes ago.
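
To put the numbers quoted in this thread side by side, here is a rough back-of-the-envelope calculation of average database size per note. The note counts and sizes are the approximate figures people posted above; treating 1 GB as 1000 MB and the labels for each report are my own simplifications.

```python
# (label, number of notes, database size in MB) as reported in this thread.
# All figures are approximate; 1 GB is taken as 1000 MB for simplicity.
REPORTS = [
    ("original post (500 notes)", 500, 115),
    ("approaching 30,000 notes", 30_000, 6_000),
    ("17,000 notes", 17_000, 12_000),
    ("just over 4000 notes", 4_000, 13_000),
    ("~25,000 notes", 25_000, 14_000),
]


def mb_per_note(notes: int, size_mb: float) -> float:
    """Average database size per note, in MB."""
    return size_mb / notes


def summarize(reports):
    """Return (label, average MB per note rounded to 2 places) per report."""
    return [(label, round(mb_per_note(n, mb), 2)) for label, n, mb in reports]


for label, avg in summarize(REPORTS):
    print(f"{label}: ~{avg} MB per note")
```

The averages range from roughly 0.2 MB to over 3 MB per note, which suggests that what goes into the notes (large scans vs. mostly text) matters at least as much as the raw note count.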

Link to comment

Archived

This topic is now archived and is closed to further replies.
