
Google fully explains what caused Monday’s multi-service outage

Google started the week with a big outage that took down Gmail, Drive, and all other Workspace apps. As promised, Google has now published a detailed explanation of the outage and the steps it will take to prevent future incidents.

At a high level, the issue stems from ongoing work to update Google’s account authentication system. While that effort was underway, previous components were “left in place.” Keeping those older components led to an erroneous report that usage was at 0, but Google had instituted a grace period to delay the impact.

That grace period expired, and automated systems then responded to the error as if it were real. Since usage appeared to be at 0, capacity for the identity management system was scaled down. Safety checks were in place, but they were not designed to cover this specific problem.
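
As a rough illustration of this failure mode, the Python sketch below shows an automated quota adjustment that refuses to act on a usage reading of 0 and caps how far a single adjustment can cut capacity. It is not Google's system: the function name `next_quota`, the `headroom` and `max_drop` parameters, and their default values are all hypothetical, chosen only to show the kind of safety check that would cover this case.

```python
from dataclasses import dataclass


@dataclass
class QuotaDecision:
    new_quota: int   # quota to use after this evaluation
    applied: bool    # False when the reading was rejected as implausible
    reason: str


def next_quota(current_quota: int, reported_usage: int,
               headroom: float = 1.5, max_drop: float = 0.5) -> QuotaDecision:
    """Propose a quota from reported usage (hypothetical policy, for illustration)."""
    # A service that currently holds quota but reports zero usage is more
    # likely to have a broken usage report than to have no traffic at all.
    if reported_usage == 0 and current_quota > 0:
        return QuotaDecision(current_quota, False,
                             "usage reported as 0; treated as a reporting fault")

    proposed = int(reported_usage * headroom)
    floor = int(current_quota * (1 - max_drop))

    # Never cut capacity by more than max_drop in a single step.
    if proposed < floor:
        return QuotaDecision(floor, True, "reduction capped")
    return QuotaDecision(proposed, True, "ok")


if __name__ == "__main__":
    print(next_quota(1000, 0))
    print(next_quota(1000, 100))
    print(next_quota(1000, 800))
```

With these assumed defaults, a zero reading leaves the quota unchanged and flags the reading, while a sharp but non-zero drop in usage is applied only gradually.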

The issue began affecting users at 3:47 a.m. PT, and engineers were alerted a minute later. “Workspace apps were down for the duration of the incident” since they rely on the impacted infrastructure to make sure you’re logged in, authenticated, and authorized to see content, like emails and documents.

At 04:08 the root cause and a potential fix were identified, which led to disabling the quota enforcement in one datacenter at 04:22. This quickly improved the situation, and at 04:27 the same mitigation was applied to all datacenters, which returned error rates to normal levels by 04:33.

The company laid out plans to review, improve, and evaluate its systems to prevent similar issues. Google ended its outage explanation with an apology:

“We would like to apologize for the scope of impact that this incident had on our customers and their businesses. We take any incident that affects the availability and reliability of our customers extremely seriously, particularly incidents which span multiple regions.”

The full technical explanation is available here.



Author: Abner Li
Source: 9TO5Google
