Facepalm: One of Google Cloud's worst nightmares came true in early May, when an embarrassing snafu completely erased a customer's account and data backups. The unlucky victim was Australian pension fund UniSuper, which manages a staggering $135 billion in assets for over 600,000 members. The pension fund was essentially frozen for two weeks, unable to fully operate while it scrambled to recover from third-party backups.
The incident started on May 2 when UniSuper suddenly lost access to all of its data and services hosted on Google Cloud, including backups. Soon after, a joint statement by the two companies admitted that an "inadvertent misconfiguration" resulted in the deletion, but details were slim. UniSuper was only able to come back online on May 15 after completing a full restoration.
This week, Google detailed exactly what went wrong. Someone at the company accidentally left a parameter blank while provisioning UniSuper's private cloud services using an internal tool. That seemingly small mistake had the catastrophic consequence of marking UniSuper's account for automatic deletion after a fixed term.
Google has put up a TL;DR on the matter:
"During the initial deployment of a Google Cloud VMware Engine (GCVE) Private Cloud for the customer using an internal tool, there was an inadvertent misconfiguration of the GCVE service by Google operators due to leaving a parameter blank. This had the unintended and then unknown consequence of defaulting the customer's GCVE Private Cloud to a fixed term, with automatic deletion at the end of that period. The incident trigger and the downstream system behavior have both been corrected to ensure that this cannot happen again."
Following the blunder, Google notes that the "customer and Google teams worked 24x7 over several days to recover the customer's GCVE Private Cloud, restore the network and security configurations, restore its applications, and recover data to restore full operations."
Google also admitted that no "customer notification" was triggered, since the deletion was carried out through its own internal tools rather than at the customer's request. The whole incident must have come as a shock to UniSuper.
That said, there was conflicting information about whether UniSuper's backups stored in Google Cloud Storage were actually deleted or not, as Ars Technica points out. Initially, UniSuper claimed it had to rely on third-party backups because its Google backups were gone too. But Google's blog states the cloud backups were unaffected and "instrumental" in the restoration.
To its credit, Google has promised broad "remediation" steps to ensure this can never happen again. It has nuked the problematic internal tool and moved that functionality to customer-controlled interfaces. It has also scrubbed its databases and confirmed no other Google Cloud accounts are improperly configured for deletion.
The company reiterated that robust deletion safeguards exist, including soft deletes, advance notifications, and human approval checks.
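For readers curious what those safeguards amount to in practice, here's a small, hypothetical Python sketch of a soft-delete flow gated by a grace period and human approvals. The grace period, approval count, and all names are illustrative assumptions, not Google's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=30)   # assumed grace period before anything is purged
REQUIRED_APPROVALS = 2              # human-in-the-loop sign-offs before purging

@dataclass
class Resource:
    name: str
    # Soft delete: the resource is only flagged, not destroyed, for a grace period.
    deleted_at: datetime | None = None
    approvals: set[str] = field(default_factory=set)

def soft_delete(resource: Resource) -> None:
    """Flag the resource for deletion and start the grace-period clock."""
    resource.deleted_at = datetime.utcnow()

def approve_purge(resource: Resource, operator: str) -> None:
    """Record an explicit human sign-off; purging requires several of these."""
    resource.approvals.add(operator)

def can_purge(resource: Resource, now: datetime) -> bool:
    """Only destroy data once the grace period has elapsed AND humans have approved."""
    return (
        resource.deleted_at is not None
        and now - resource.deleted_at >= GRACE_PERIOD
        and len(resource.approvals) >= REQUIRED_APPROVALS
    )

if __name__ == "__main__":
    vm = Resource(name="customer-private-cloud")
    soft_delete(vm)
    print(can_purge(vm, datetime.utcnow()))                 # False: too soon, no approvals
    approve_purge(vm, "operator-a")
    approve_purge(vm, "operator-b")
    print(can_purge(vm, datetime.utcnow() + GRACE_PERIOD))  # True: grace period over, approved
```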
It's certainly an alarming event for millions of cloud customers, but Google has emphasized this was an "isolated incident" affecting a single customer. The company insists there are no systemic issues that put other Google Cloud customers at risk of spontaneous data vaporization.