Facebook’s SSD findings: Failure, fatigue and the data center

Post by Robin Harris (thank you)

SSDs revolutionized data storage, even though we know little about how well they work. Now researchers at Facebook and Carnegie Mellon share millions of hours of SSD experience.

Millions of SSDs are bought every year. It’s easy to be impressed by fast boots and app starts. But what about 24/7 data center operations? What are the common problems that admins should be concerned about?

Read on here

FCoE is dead! (For real)

Good post by Enrico Signoretti (thank you)

How many times have you heard this statement? Tape is dead! Mainframe is dead! And so on and so forth… It turned out not to be true. Most of the time it was just a way of saying that a newer technology was seeing strong adoption, so strong as to eclipse the older one in the eyes of the masses. But in the case of FCoE, it is slightly different.

Read on here (also look at the comments section!)

VSAN Evaluation – How to use the sizing tool – Part 1

Post by Rafael Kabesa (thank you)

Sizing a vSphere environment for VMware Virtual SAN is not an easy task, and if this is what you look like when you approach it, you should continue reading. Nevertheless, even if you have already sized for VSAN in the past, there are some nuances that you might want to pay attention to.

Read on here

Using RAID-5 Means the Sky is Falling!

Good post by Olin Coles (thank you)

Why disk URE rate does not guarantee rebuild failure.

Today’s appointment brought me out to a small but reliable business, where I’m finishing the hard drive upgrades for their cold storage backup system. It was an early morning drive into the city, with enough ice on the roads to contribute towards the more than 30,000 fatality accidents that occur each year. The backup appliance I’m servicing has received 6TB desktop hard disks to replace an old set with a fraction of the capacity, so rebuilding the array has taken considerable time.

Read on here

Data Protection: All Starts with an Architecture

Post by Edward Haletky (thank you)

At The Virtualization Practice, we have systems running in the cloud as well as on-premises. We run a 100% virtualized environment, with plenty of data protection, backup, and recovery options. These are all stitched together using one architecture: an architecture developed through painful personal experiences. We just had an interesting failure—nothing catastrophic, but it could have been, without the proper mindset and architecture around data protection. Data protection these days does not just mean backup and recovery, but also prevention and redundancy.

Read on here

Why Big Disk Drives Require Data Integrity Checking

Good post by Stephen Foskett (thank you)

Hard disk drives keep getting bigger, meaning capacity just keeps getting cheaper. But storage capacity is like money: The more you have, the more you use. And this growth in capacity means that data is at risk from a very old nemesis: Unrecoverable Read Errors (URE).
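To make the capacity argument concrete, here is a minimal Python sketch of the naive URE arithmetic that discussions like this usually start from. It is not taken from the linked post. The function name and the default rate of one error per 10^14 bits read are assumptions for illustration (check your drive's datasheet for its actual figure), and the model treats that figure as an independent per-bit probability, which is a simplification.

```python
import math

def p_at_least_one_ure(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Naive chance of hitting at least one URE while reading bytes_read bytes.

    Assumption: every bit fails independently with probability ure_per_bit.
    The default of 1e-14 is an illustrative spec-sheet style figure.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))

TB = 1e12  # decimal terabyte, as used on drive labels

# More capacity read end to end means more exposure under this model.
for capacity_tb in (1, 6, 12, 24):
    p = p_at_least_one_ure(capacity_tb * TB)
    print(f"{capacity_tb:>2} TB read: naive P(at least one URE) = {p:.1%}")
```

Under these assumptions the probability climbs quickly with the amount of data read, which is the intuition behind the concern. Real drives do not necessarily behave like this simple model, so treat the output as a rough upper-level illustration rather than a prediction.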

Read on here

Recalculating Odds of RAID5 URE Failure

Good post by Matt Simmons (thank you)

Alright, my normal RAID-5 caveats stand here. Pretty much every RAID level other than 0 is better than a single parity RAID, until RAID goes away. If you care about your data and speed, go with RAID-10. If you’re cheap, go with RAID-6. If you’re cheap and you’re on antique hardware, or if you just like arguing about bits, keep reading about RAID-5.

Read on here

VMware Virtual SAN Operations: Replacing Disk Devices

Good post by Rawlinson Rivera (thank you)

In my previous Virtual SAN operations article, “VMware Virtual SAN Operations: Disk Group Management” I covered the configuration and management of the Virtual SAN disk groups, and in particular I described the recommended operating procedures for managing Virtual SAN disk groups.

In this article, I will take a similar approach and cover the recommended operating procedures for replacing flash and magnetic disk devices. In Virtual SAN, drives can be replaced for two reasons: failures and upgrades. Regardless of the reason, whenever a disk device needs to be replaced, it is important to follow the correct decommissioning procedures.

Read on here

HDD warming: global data threat?

Interesting Read from Robin Harris (thank you)

If there’s one thing I hate, it’s unsettled science. For instance: the effect of temperature on disk drives. Shorten their life or not? Most studies say no – including a new one – but Microsoft researchers disagree. Can’t we all just get along?

The folks at Backblaze published a detailed blog post on observed effects of temperature on disk drives. Like most studies, they didn’t find one: “After looking at data on over 34,000 drives, I found that overall there is no correlation between temperature and failure rate.”

Read on here

What happens in a VSAN cluster in the case of an SSD failure?

Post by Duncan Epping (thank you)

The question that keeps coming up over and over again at VMUG events, on my blog and the various forums is: What happens in a VSAN cluster in the case of an SSD failure? I answered the question in one of my blog posts around failure scenarios a while back, but figured I would write it down in a separate post considering people keep asking for it.

Read on here