Cloud backup: What does 11 9s durabilities really mean? - Techies Updates

Wednesday, July 18, 2018

Cloud vendors need trust. So they claim 99.999999999 percent durability. Huh? Why not 15 or 17 9s? Because after nine 9s, it really doesn't matter. Here's why.


The cloud is full of marketing hype. Cloud vendors need trust. So they claim 99.999999999 percent durability. Huh? Why not 15 or 17 9s? Here's the deal.

The good folks at Backblaze published a post today on the truth behind cloud durability claims. If you want the gory details, go read the whole thing.

If not, keep reading.

FORGET THE MATH

The math behind durability numbers is impressive. But the math depends on the assumptions behind it.

The assumptions are key. And they are:

Average rebuild time, which isn't what you think it is.
Annualized drive failure rate, or AFR. Also not what you think.
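To see how much those two assumptions drive the headline number, here is a rough sketch of one simple durability model. It is not Backblaze's actual calculation. It assumes drives fail independently, that an object is split into 20 shards and survives the loss of any 4 of them, and that data is lost only if more shards than that fail inside a single rebuild window. The function name, the 20-shard layout, the 1 percent AFR, and the 7-day rebuild time are all made up for illustration.

```python
import math


def annual_durability_nines(afr, rebuild_days, total_shards=20, tolerated_failures=4):
    """Estimate the 'number of nines' of annual durability under a simple model.

    afr: annualized drive failure rate, e.g. 0.01 for 1 percent (assumed value).
    rebuild_days: average time to rebuild a lost shard (assumed value).
    """
    # Chance that one particular drive fails during a single rebuild window.
    p = afr * rebuild_days / 365.0

    # Binomial probability that MORE than the tolerated number of shards
    # fail within the same window, which is when data is actually lost.
    loss_in_window = sum(
        math.comb(total_shards, k) * p**k * (1.0 - p) ** (total_shards - k)
        for k in range(tolerated_failures + 1, total_shards + 1)
    )

    # A year contains this many back-to-back rebuild windows.
    windows_per_year = 365.0 / rebuild_days

    # 1 - (1 - loss)^windows, computed with log1p/expm1 so that very small
    # probabilities are not rounded away.
    annual_loss = -math.expm1(windows_per_year * math.log1p(-loss_in_window))
    return -math.log10(annual_loss)


if __name__ == "__main__":
    for days in (7.0, 3.5):
        nines = annual_durability_nines(afr=0.01, rebuild_days=days)
        print(f"rebuild in {days} days -> about {nines:.1f} nines")
```

With these invented inputs the model already lands past twelve nines, and halving the rebuild time adds more than one full nine. Small changes in the assumptions move the answer by orders of magnitude, so the exact count of 9s says more about the inputs than about your data.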
All very scientific, except they're beside the point.

WHY?

Modern cloud storage is a tech wonder. Active cloud storage typically has three copies of your data. Partly for availability, but also for performance, since a 7200 RPM disk needs more than 8 ms for a full rotation (about 4 ms of rotational latency on average), and with three copies to read from you cut that wait significantly.
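The rotational numbers are easy to check. The sketch below is an idealized model, not a description of any vendor's read path: it assumes the copies spin independently and that a read is served by whichever copy reaches the right sector first, in which case the expected wait is one rotation divided by (copies + 1). Both function names are hypothetical.

```python
def full_rotation_ms(rpm):
    """Time for one complete platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def expected_rotational_latency_ms(rpm, copies=1):
    """Expected wait for the sector to come around, in milliseconds.

    Assumes each copy's platter position is independent and uniformly random,
    and that the fastest copy answers. The expected minimum of `copies`
    uniform waits over one rotation T is T / (copies + 1).
    """
    return full_rotation_ms(rpm) / (copies + 1)


if __name__ == "__main__":
    print(f"full rotation at 7200 RPM: {full_rotation_ms(7200):.2f} ms")
    print(f"average wait, 1 copy:      {expected_rotational_latency_ms(7200, 1):.2f} ms")
    print(f"average wait, 3 copies:    {expected_rotational_latency_ms(7200, 3):.2f} ms")
```

Under that assumption, one copy averages half a rotation and three copies average a quarter of one, so the expected rotational wait is cut in half.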

Backup storage -- what Backblaze offers -- can't afford three copies, so they use fancy erasure codes to protect data. These systems break data into shards, plus some mathematically intense parity shards, which typically lets data survive four drive failures with only a modest amount of extra storage.
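Why erasure codes are so much cheaper than three copies comes down to simple arithmetic. The sketch below compares raw bytes consumed per byte of user data. The 16 data + 4 parity layout is an assumption chosen to match "survive four drive failures"; it is not a claim about any particular vendor's configuration, and the function names are hypothetical.

```python
def replication_overhead(copies):
    """Raw bytes stored per byte of user data with full replication."""
    return float(copies)


def erasure_overhead(data_shards, parity_shards):
    """Raw bytes stored per byte of user data with an erasure code.

    The object is split into `data_shards` pieces and `parity_shards` extra
    pieces are computed, so any `parity_shards` pieces can be lost.
    """
    return (data_shards + parity_shards) / data_shards


if __name__ == "__main__":
    # Three copies survive two lost drives; 16+4 shards survive four.
    print(f"3 copies:     {replication_overhead(3):.2f}x raw storage")
    print(f"16+4 erasure: {erasure_overhead(16, 4):.2f}x raw storage")
```

Three copies cost 3x raw storage and tolerate two lost drives. The assumed 16+4 layout costs 1.25x and tolerates four.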

Vendors don't wait until four drives fail to take corrective action. The shards are spread over the infrastructure so no single power supply, switch, or rack can take out more than one shard. And rebuilds are usually highly parallel, so the failures are repaired a lot faster than a home user would expect.

THE POINT?

As the Backblaze post notes, 11 9s durability is irrelevant because:

Somewhere around the 8th nine, we start moving from practical to purely academic. Why? Because at these probability levels, it's far more likely that:

An armed conflict takes out data center(s).
Earthquakes / floods / pests / or other events known as "Acts of God" destroy multiple data centers.
There's a prolonged billing problem and your account data is deleted.
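To put "these probability levels" in perspective, here is a quick back-of-the-envelope conversion from nines to expected losses. It treats N nines as an annual loss probability of 10^-N per object, which is the usual reading of such claims but is still an assumption. The one-million-object count and the function name are hypothetical.

```python
def years_per_expected_loss(nines, objects_stored):
    """Average number of years between single-object losses.

    Treats `nines` of durability as an annual per-object loss
    probability of 10 ** -nines (an assumed interpretation).
    """
    annual_loss_probability = 10.0 ** -nines
    expected_losses_per_year = objects_stored * annual_loss_probability
    return 1.0 / expected_losses_per_year


if __name__ == "__main__":
    for nines in (8, 11):
        years = years_per_expected_loss(nines, objects_stored=1_000_000)
        print(f"{nines} nines, 1,000,000 objects: one loss every ~{years:,.0f} years")
```

At eight nines, a million stored objects means roughly one lost object per century. At eleven nines it is one per hundred thousand years, a time scale on which the events listed above are far bigger risks.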

ASTEROIDS V BILLING?

As it notes:

You change your credit card provider. The credit card on file is invalid when the vendor tries to bill it. Your email service provider thinks billing emails are SPAM. You don't see the emails coming from your vendor saying there is a problem. You do not answer phone calls from numbers you do not recognize. Customer Support is trying to call you from a blocked number, they are trying to leave voicemails but the mailbox is full.

Thereby hangs a tale -- with an unhappy ending.

DESIGNING FOR FAILURE

Behind the marketing, though, are serious engineering teams that sweat the details, from architecture to root cause analysis of unexpected events. We've come a long way from the oversold RAID arrays of the 1990s, whose data loss rates were far above what the optimistic assumptions predicted.

THE STORAGE BITS TAKE

Is a full voicemail box more likely to result in data loss than an asteroid? Since no asteroid strike has taken out a data center and people have lost data due to billing issues, yeah, clean out your voicemail.

The important point is that for backup purposes -- which means you only need access occasionally -- any provider with credible numbers is good enough. So then the decision points come down to price and privacy.

The former is easy to figure. The latter? Well, the client software should encrypt your data before it leaves your system with a suitably strong algorithm like AES-256. Once the data is sharded at the backup site, it is pretty much unreadable unless the entire data center -- and your password -- are hacked. No single drive will have useful recoverable data.

Active data -- a website, or database -- that needs constant access is another matter. There are a lot more moving parts, and nobody claims 11 9s availability.

But that's a post for another time.

Courteous comments welcome, of course. I've been a satisfied customer of Backblaze for several years. Other than that, I have no commercial relationship with them.



