Friday, October 26, 2012

And... we're back!

Friday, Oct 26th, 2pm:
EssayTagger.com is back up and seems to be responding normally. See my earlier post and its live updates during the morning's downtime.

What happened?
Google App Engine suffered a worldwide outage around 9:30am (Central time) Friday, after which Google slowly restored services. The outage knocked out major sites like Dropbox, Instagram, Khan Academy, and anyone else running on Google's infrastructure. During most of this outage, EssayTagger.com was either inaccessible or excruciatingly slow to load. The site reached stability around 2pm.

Is this normal?
Nope. An outage of this scale is unprecedented. Most tech folks view Google's infrastructure as being as robust and as close to invulnerable as you can get, and their track record had borne that out until this morning.

Are you going to drop Google App Engine now?
For the moment, no. This was an aberration. The reality of website hosting is that downtime happens, no matter which infrastructure you're running on. And, to be honest, I have much more faith in Google's engineers than I do in anyone else -- including myself. Yes, their system failed this morning, but in the brief 11 months of EssayTagger's life, Google App Engine has been remarkably stable and much more reliable than anything else out there.

Read Google's mea culpa, their analysis of what happened, and the new preventative measures they've put in place:

"We know you rely on App Engine to create applications that are easy to develop and manage without having to worry about downtime. App Engine is not supposed to go down, and our engineers work diligently to ensure that it doesn’t. [...] We know that hundreds of thousands of developers rely on App Engine to provide a stable, scalable infrastructure for their applications, and we will continue to improve our systems and processes to live up to this expectation."