(Archived) Disruption of Evernote Service


Is anyone experiencing a disruption of Evernote? I've intermittently gotten the "Shh the elephant is asleep" message saying that maintenance is being performed.

My local clients won't sync. My iPhone won't sync. It's working at the moment, but it's been on and off (mostly off) all morning. I've tried from different platforms, locations, and networks.

Feedback please?

Thanks

Link to comment

This morning I was having connection trouble with both the web interface ("the Elephant is sleeping") and my local client (wouldn't sync). Sometime after noon U.S. Eastern time, I was able to access the web interface, and then sync shortly thereafter. The Service Status page said it was up every time I checked, though, so I don't know how reliable that is.

Link to comment

I've only been able to sync twice out of a number of attempts today.

The log shows:

Client name: Evernote Mac/55604 (en-US); MacOS/10.5.8;

WebKit runtime: 531.9.0

WebCore runtime: 531.9.0

Safari clipper plugin version is 55537

NSExceptionHandler has recorded the following exception:

RSSParseFailed -- The XML parser could not parse the RSS data.

This is followed by a stack trace.

I do see the advertisements updating though!

Link to comment

We apologize for the inconvenience that today's outage caused you. Here are the details of today's problem:

Evernote stores all of the data for groups of users on independent servers that we refer to as "shards". Each shard is made up of two separate boxes that have internal redundancy for all data (mirrored on pairs of disks). Your data is also continually replicated from the primary box in each shard to its peer, so that we always have at least 4 separate copies of your current data, supplemented by nightly backups and weekly offsite data storage.
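As a rough illustration of where the "at least 4 separate copies" figure comes from, here is a minimal sketch that models one shard as two boxes, each mirroring data on a pair of disks. The class and field names are hypothetical and are not taken from Evernote's systems; nightly backups and offsite storage are left out.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """One server in a shard; its data is mirrored on a pair of disks."""
    name: str
    mirrored_disks: int = 2  # from the post: "mirrored on pairs of disks"


@dataclass
class Shard:
    """A shard as described above: a primary box plus a replicated peer."""
    shard_id: int
    primary: Box
    secondary: Box

    def live_copies(self) -> int:
        # Each disk in each box holds a full copy of the shard's data.
        return self.primary.mirrored_disks + self.secondary.mirrored_disks


shard = Shard(shard_id=3, primary=Box("primary"), secondary=Box("secondary"))
print(shard.live_copies())  # 4
```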

We maintain an automated monitoring system for these servers that continually checks them for problems and reports urgent problems to our 24/7 System Operations team via pagers within 60 seconds.

We're currently running 13 different shards for our 1.5 million registered users. Last night, at 1:53am PST, shard #3 (yours) experienced a drive subsystem failure that locked up all disk activity to its storage. Accounts on the other 12 shards were unaffected. This would be a fairly routine problem for an operation at our scale if not for a combination of two other factors:

(1) The secondary box in the shard pair did not automatically take over when the primary failed. We're investigating the reason for this failure, but it appears that the primary server was healthy enough to say "I'm OK", but not healthy enough to provide correct service to our users.

(2) More importantly, our monitoring system did not send out pages to our operations staff when this problem started. This was indirectly caused by a software update we performed to the monitoring system on Wednesday. Even though we tested the monitoring upgrade for several hours on Wednesday, and it correctly notified us of problems on Thursday, it stopped performing the required checks some time on Thursday night. We are investigating this error, and reverting to the last working system until we can resolve the problem.
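The two factors above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of Evernote's actual failover or monitoring code: all names are hypothetical, and the 60-second staleness threshold is borrowed from the paging window mentioned earlier purely as an example value. The first part shows why a check that only asks "are you up?" can miss a server whose disk I/O is locked; the second shows a simple staleness test that would flag a monitor that has silently stopped running its checks.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    process_up: bool  # the server still answers "I'm OK"
    disk_io_ok: bool  # a real read/write against storage succeeds


def shallow_check(server: ServerState) -> bool:
    # Only confirms the process responds; says nothing about storage.
    return server.process_up


def deep_check(server: ServerState) -> bool:
    # Also requires that the server can actually serve data.
    return server.process_up and server.disk_io_ok


def should_fail_over(primary: ServerState, check) -> bool:
    # The secondary takes over only when the chosen check fails.
    return not check(primary)


def monitor_is_stale(last_check_at: float, now: float,
                     max_age_seconds: float = 60.0) -> bool:
    # True when the monitor has not completed a check recently enough.
    # max_age_seconds is an assumed value, not one stated in the post.
    return (now - last_check_at) > max_age_seconds


# A primary in the state described in (1): responsive, but storage is locked.
locked_up = ServerState(process_up=True, disk_io_ok=False)
print(should_fail_over(locked_up, shallow_check))  # False
print(should_fail_over(locked_up, deep_check))     # True

# A monitor in the state described in (2): last check was 300 seconds ago.
print(monitor_is_stale(last_check_at=0.0, now=300.0))  # True
```

In practice the staleness test would have to run somewhere independent of the monitor itself, since a stalled monitor cannot report its own silence.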

We apologize again for the inconvenience. We have worked hard over the last year to increase the redundancy of all parts of the Evernote service infrastructure, and to improve our systems monitoring and notification, raising Evernote's overall availability to around 99.7%. We intend to keep improving our processes and infrastructure to make sure that Evernote is something that you can always rely on.
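To put the 99.7% figure in concrete terms, the short calculation below converts an availability percentage into the downtime it permits. The function name is made up for illustration, and the post does not say over what period the figure is measured, so a 365-day year and a 30-day month are used here as assumptions.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def allowed_downtime_hours(availability: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    # Downtime is the fraction of the period during which the service is not available.
    return (1.0 - availability) * period_hours


print(round(allowed_downtime_hours(0.997), 2))           # 26.28 hours per year
print(round(allowed_downtime_hours(0.997, 30 * 24), 2))  # 2.16 hours per 30-day month
```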

Link to comment

Archived

This topic is now archived and is closed to further replies.
