Gmail Down – The answer to Gmail's downtime.

In the wake of the recent disruption to Twitter, Gmail suffered a similar fate today when the service went down. Unlike Twitter, Gmail is a key means of communication and the primary email service for millions of users globally. For those users, today has been a day of disruption, with many unable to reach Gmail through its web interface.

So what happened?

A complete shutdown of Gmail should be avoidable: Google has thousands of mail servers that replicate email and take over the workload of any server that runs into problems. It also has request routers that direct each request for email to the appropriate server. Today, however, Google was performing routine maintenance on the regular mail servers, which overloaded the request routers and brought the web interface down. Google underestimated the load these changes would place on the routers when it took a relatively small number of servers offline for upgrades.

The official Google Blog explained:

“about 12:30 pm Pacific a few of the request routers became overloaded and in effect told the rest of the system “stop sending us traffic, we’re too slow!”. This transferred the load onto the remaining request routers, causing a few more of them to also become overloaded, and within minutes nearly all of the request routers were overloaded. As a result, people couldn’t access Gmail via the web interface because their requests couldn’t be routed to a Gmail server. IMAP/POP access and mail processing continued to work normally because these requests don’t use the same routers.”
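The chain reaction Google describes can be illustrated with a toy model. The sketch below is only an illustration and not Google's actual routing logic: the function name `simulate_cascade`, the even split of load across routers, and the rule that one router drops out per round are all assumptions made for this example.

```python
def simulate_cascade(num_routers, capacity, total_load):
    """Toy model of a cascading overload (hypothetical, not Google's system).

    Load is split evenly across the active routers. While the share per
    router exceeds its capacity, one router drops out and the same total
    load is spread over the routers that remain.
    Returns the number of active routers after each round.
    """
    active = num_routers
    history = [active]
    while active > 0 and total_load / active > capacity:
        active -= 1  # an overloaded router stops accepting traffic
        history.append(active)
    return history


# Within capacity: every router stays up.
stable = simulate_cascade(num_routers=10, capacity=100, total_load=900)

# Slightly over capacity: each dropout raises the load on the rest,
# so the failure spreads until no routers remain.
collapsed = simulate_cascade(num_routers=10, capacity=100, total_load=1050)

print(stable)
print(collapsed)
```

In this model a small overshoot is enough to take out every router, because shedding a router never reduces the total load; it only concentrates it on the survivors.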

The shutdown caused an online panic, with Twitter recording 20,000 “Gmail Down” results in only the first few minutes after the outage was discovered.

Back to normal, for now?

Full service has now resumed, and Google has increased its request router capacity well beyond peak demand. Although 2 hours of downtime is not the end of the world, it does highlight that Gmail may have weaknesses. With its user base expanding, could further underestimates cause a repeat of today? Google is now under pressure to ensure that such events do not happen again.
