We have been running the Cloud Academy roundtables in several European countries. I’d like to share some of the more interesting questions, debates and insights around a number of topics, starting today with RAIC—Redundant Arrays of Inexpensive Cloud Services. Other topics will include:
- A TV industry analogy: Competition for the IT department
- Cloud Shortcuts: Can the Cloud make (internal) IT more agile?
- Service Level Management and the Cloud
- Cloud R&R – Retained responsibilities for IT
- Elastic Services: Everybody wants to be a manager
Redundant Arrays of Inexpensive Cloud Services
Today’s post discusses whether we can ensure performance and availability of public cloud services. I’m not sure we can. Public cloud services are a bit like the weather: we are lucky if we can predict it, but we cannot manage or change it because we don’t control the underlying elements. The same holds true when trying to “manage” public cloud services.
So what do we do? Give up on public cloud services altogether? No, that would be throwing out the baby with the bathwater. Instead, we can follow a method we have been using in IT for a long time. If we cannot count on a certain item to be always available, we make sure we have a failover option.
The best example comes from storage. At a certain moment, people realized that even the most expensive disks failed now and then. So they developed a strategy in which the failure of an individual disk is not so important. The result was RAID, a redundant array of inexpensive disks that, transparently to the user, served the requested data from other disks in the array when one of the disks failed. In typical IT fashion, we used the name RAID 0 for a configuration with no redundancy at all (striping only), while levels such as RAID 1 (mirroring) and RAID 5 or 6 (parity) add redundancy. The benefit of the redundant levels is that predicted availability increases significantly by adding only marginally more capacity.
How do we apply a similar “redundant array” approach to cloud services? The idea of contracting for two email services or two CRM systems is counter-intuitive for most IT folks, since for years we strove to standardize on one of each. And the reality is that if half the company uses one email system and the other half another, 50% of the people are still down if one fails. So instead of looking at email in isolation, we should look at all the employee communication options. These may include email, instant messaging, VoIP, even social media functions similar to Facebook or Twitter. If they are based on different technologies and sourced from different vendors, the chance of them all being down at the same time is extremely small.
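To put a rough number on that intuition, here is a minimal sketch of the arithmetic. It assumes the services fail independently of each other, which is the whole point of sourcing them from different vendors and technologies but is never perfectly true in practice. The function name `combined_availability` and the 99% figures are illustrative assumptions, not measurements of any real service.

```python
from math import prod
from typing import Sequence


def combined_availability(availabilities: Sequence[float]) -> float:
    """Probability that at least one service is up.

    Assumes failures are independent (an idealization): the array is
    only "down" when every single service is down at the same time.
    """
    if not availabilities:
        raise ValueError("need at least one service")
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError(f"availability must be between 0 and 1, got {a}")
    # Chance that all services are down at once = product of each downtime share.
    all_down = prod(1.0 - a for a in availabilities)
    return 1.0 - all_down


if __name__ == "__main__":
    # Hypothetical figures: email, instant messaging and VoIP at 99% each.
    channels = [0.99, 0.99, 0.99]
    print(f"single channel: {channels[0]:.4%}")
    print(f"any of three:   {combined_availability(channels):.6%}")
```

Under these assumptions, three channels that are each down 1% of the time are all down together only about one millionth of the time. If the services share a dependency (the same network link, the same identity provider), the real figure will be worse.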
Using public cloud services is another step in giving up control of the underlying components. Years ago, when companies bought their first computers, they were expected to program these themselves in Assembler. Later, they bought higher-level language compilers, followed by complete off-the-shelf software packages, followed now by infrastructure and software as a service. With each step, IT has lost some control, but in exchange we are no longer required to do all the work.
We do, however, have to make conscious decisions about when to cede control. This differs by industry, type of application and possible risk. Using public cloud services in many cases already makes sense today. But when using them, we need some way to monitor availability and outcome so that we can make smart, pragmatic tradeoffs and take precautions for when the services are not available.
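As a rough illustration of what such monitoring could look like, the sketch below walks a prioritized list of communication channels and picks the first one that responds, and separately computes the share of successful checks from a history of results. Everything here is hypothetical: the names `first_available` and `observed_availability` and the probe functions are stand-ins, and a real probe would call whatever health check each vendor actually offers.

```python
from typing import Callable, Optional, Sequence, Tuple

# A channel is a name plus a probe that returns True when the service is up.
Channel = Tuple[str, Callable[[], bool]]


def first_available(channels: Sequence[Channel]) -> Optional[str]:
    """Return the name of the first channel whose probe succeeds, else None."""
    for name, is_up in channels:
        try:
            if is_up():
                return name
        except Exception:
            # A probe that raises (timeout, refused connection) counts as down.
            continue
    return None


def observed_availability(results: Sequence[bool]) -> float:
    """Share of successful checks in a history of probe results."""
    if not results:
        raise ValueError("no probe results recorded")
    return sum(results) / len(results)


if __name__ == "__main__":
    def email_probe() -> bool:  # hypothetical stand-in for a real health check
        raise ConnectionError("simulated outage")

    def chat_probe() -> bool:  # hypothetical stand-in for a real health check
        return True

    print(first_available([("email", email_probe), ("chat", chat_probe)]))
    print(observed_availability([True, True, False, True]))
```

The ordering of the list encodes the tradeoff: the preferred channel comes first, and the fallbacks are only used when it is not reachable.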