The Danger Sidekick Microsoft Fiasco: Don't Blame the Cloud (UPDATED)

UPDATE:
If you make enough noise, you may get your data back.  That's the message from Microsoft's Roz Ho: 

On behalf of Microsoft, I want to apologize for the recent problems with the Sidekick service and give you an update on the steps we have taken to resolve these problems.

We are pleased to report that we have recovered most, if not all, customer data for those Sidekick customers whose data was affected by the recent outage. We plan to begin restoring users' personal data as soon as possible, starting with personal contacts, after we have validated the data and our restoration plan. We will then continue to work around the clock to restore data to all affected users, including calendar, notes, tasks, photographs and high scores, as quickly as possible.

Fortunately, Sidekick users should now start seeing their data being restored.  Although I still believe this is a systems management failure, TechCrunch reports that "Microsoft has made changes to improve the overall stability of the Sidekick service and initiated a more resilient backup process to ensure that the integrity of their DB is maintained."

Now why didn't they just do that in the first place?


Danger, Inc.
By now we have heard all the excuses and rumors.  T-Mobile and Danger, the Microsoft-owned subsidiary that makes the Sidekick, announced that they lost all user data. That means that any contacts, photos, calendars, or to-do lists that haven't been locally backed up are gone.

Of course the common refrain we hear is that this is all the fault of cloud computing.

InformationWeek: "code red cloud disaster."

ZDNet: "one of the biggest cloud computing disasters so far."

CNet: "threatens to put a dark cloud over the company's broader 'software plus services' strategy."

Barron's: "you have to wonder if the high hopes about cloud computing just suffered a mammoth setback."

Still others are blaming Microsoft. 

Writes Daniel Ionescu: "...the unfortunate coincidence is Microsoft's launch of Windows Mobile 6.5 devices last week, which in association with this weekend's Sidekick data loss could translate into reduced customer trust from potential Windows Phone buyers."

John Paczkowski: "Microsoft hasn't yet said what caused the failure, though some speculate it was a bungled storage area network upgrade performed without backup."

Dan Nosowitz: "It's been more than two weeks without data for Sidekick users, and T-Mobile finally bit the bullet and announced that it probably isn't coming back."

Michael Hickins: "Microsoft's Sidekick/Danger issue doesn't reflect badly on cloud storage, it reflects on cloud storage done badly."

And Jason Kincaid: "This goes beyond FAIL, face-palm, or any of the other Internet memes we've come to associate with incompetence. The fact that T-Mobile and/or Microsoft Danger don't have a redundant backup is simply inexcusable, especially given the fact that the Sidekick is totally reliant on the cloud because it doesn't store its data locally."

So clearly, we have a large-scale "trust failure" on the part of Danger and its parent company, Microsoft.

As we've said before in this blog, large providers are not necessarily trusted providers.  But there is something else I'd like to emphasize: this was not a failure of the cloud.  Rather, it was a management failure.  Danger failed to execute common-sense server maintenance practices. Microsoft failed to ensure that sound systems management, backup and recovery policies were being followed, and T-Mobile failed to check that its service delivery partners had the appropriate capabilities.

Again, it was NOT a failure of the cloud.

Unfortunately, the damage has been done to the reputation of the cloud.

For smaller service providers, this is an opportunity to take on Microsoft.

The issue here is not that the cloud failed; it is that the cloud provider failed.  We work with our service provider partners to architect appropriate services. It is OK to have a service that is priced low for average levels of service and priced at a premium for bullet-proof service delivery. But let's agree that service level is a different matter from data loss.  No one expects their data to be lost, particularly by a service provider.  At a minimum, any cloud storage service should keep a second, recoverable copy at an off-site location.  So this is simple: the only way your data is lost is if two geographically remote data centers are wiped out.  That assumes, however, that your testing and recovery capabilities absolutely eliminate data corruption problems, and usually that is where things go wrong. The problem is rarely a lack of multiple copies in multiple locations. Rather, it is a system failure that stems from the system architecture or from underlying data corruption.
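The point about corruption can be made concrete. Below is a minimal sketch, in Python, of one way to check that an off-site copy actually matches the primary copy by comparing SHA-256 checksums file by file. It assumes both copies are reachable as ordinary directory trees; the function names `sha256_of` and `compare_copies` are hypothetical, chosen only for this illustration, and do not describe how Danger, Microsoft, or any particular provider stores or verifies data.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(primary: Path, replica: Path) -> dict[str, list[str]]:
    """Compare every file under `primary` with its counterpart under `replica`.

    Returns relative paths grouped as 'missing' (absent from the replica)
    or 'mismatched' (present in both, but with different contents).
    Both directory layouts are assumptions made for this sketch.
    """
    report: dict[str, list[str]] = {"missing": [], "mismatched": []}
    for source in sorted(p for p in primary.rglob("*") if p.is_file()):
        relative = source.relative_to(primary)
        copy = replica / relative
        if not copy.is_file():
            report["missing"].append(relative.as_posix())
        elif sha256_of(source) != sha256_of(copy):
            report["mismatched"].append(relative.as_posix())
    return report
```

A check like this only shows that the two copies agree. If corrupted data has already been replicated, both copies will agree and both will be wrong, so it does not replace an actual trial restore.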

No service provider is ever going to place themselves in the position of ultimate total responsibility for every circumstance that could lead to data loss.

So what does the user do, and on whom do they rely? At a minimum, the user must understand the data backup policy and commitment of the service provider.  Test them and make them prove they can recover.  And do business with someone who values your business and earns it every day. Do business with someone whom you can actually reach, and who will enter into a constructive dialog with you to earn and retain your business.

About this Entry

This page contains a single entry by Steve Lesem published on October 14, 2009 11:56 AM.