In every cloud survey, security consistently comes out as an inhibitor to cloud adoption. Even though this has been the case for several years, many feel that it is a temporary barrier which will be resolved once cloud offerings get more secure, mature, certified, and thus accepted. But is this indeed the case or do we need another approach to overcome this barrier?
During a recent cloud event, two speakers from a large accounting and EDP auditing firm took the stage to discuss the risks of cloud computing. While one speaker dissected the risks for both consumers and providers of cloud services, the other discussed the various certifications and audit schemes that are available in each area. They acknowledged that the currently available certifications do not cover all risks, but their envisioned remedy was even more comprehensive certifications and audits. Now, this may come as no surprise given the speakers’ backgrounds, but more “paperwork” simply won’t address what IT pros are really worried about. Let me try to explain my thinking, including how the recent WikiLeaks events influenced it.
Security is often cited as a concern with regard to cloud adoption. My view is that the apprehension stems more from a fear of losing control (not being able to restore service when needed) than from a fear of losing data. Fear of losing data can be addressed by cloud providers through implementing security solutions as described in various posts on the CA security management blog, but fear of losing control cannot.
The big difference between traditional IT and cloud computing is that cloud computing is delivered “as a service.” With traditional IT we bought a computer and some software. If it did not work, we could fix it ourselves (sometimes a firm kick would suffice). No matter what happened (good or bad), we were the masters of our own destiny. And even with traditional outsourcing, we often told the outsourcer “what to do,” and in many cases “how to do it.” If push came to shove and the outsourcer really screwed up, we could, at least in theory, still say “Move over, let me do it myself.”
When something is delivered as a service, there is no equipment to kick and we can no longer say “Move over, I’ll do it myself.” We likely won’t even be allowed to enter the room where the equipment is located or get access to the underlying code and data. With your biggest customer (or your boss, or the boss of your boss) on the phone screaming at you, that is not a position many people want to find themselves in. And believe me, showing all the certificates and audit reports that your vendor accumulated and shared with you will not quiet them down, even assuming that the vendor at that moment is doing its best to fix the problem. But what if the vendor has made a conscious decision to discontinue rendering the service, as seems to be the case with WikiLeaks?
Now you may feel your organization would never do something that would warrant or even cause such behavior by your vendor. But what if a judge ordered your vendor to discontinue the service? This can happen and has happened, sometimes because of really small legal technicalities or unintended incidents like a server sending spam or an employee collecting illegal content on a company server. Google and other mail providers have been ordered to cease mail services to both consumers and businesses, and have complied. Sure, you can go to court and appeal, but will that be quick enough?
For each “as a service” offering we will need to evaluate what is reasonable risk and what to do to remedy the unreasonable risks. What is reasonable will very much depend on the type of industry. In the following examples we look at two scenarios: the service not working (outage) and the data being stolen. Some incidents the business may hardly notice, others can be severely inconvenient, and still others could jeopardize overall business continuity (not being able to invoice or missing a deadline on a project with severe penalty clauses).
- Email: If email is down but phones, instant messaging, text messaging and maybe the occasional fax are still available, then a few days’ outage may be reasonable (for some companies), provided we get all of our email back at the end of the outage, regardless of whether we moved to a new provider or the old one finally got it fixed or switched us on again. With regard to theft: nobody likes their personal conversations discussed in public (see again the WikiLeaks example), so measures like encryption, digital signing, using SSL and working with reputable (OK, let’s call them certified) vendors are in order.
- CRM: This system tells us what our sales team has been up to. Before we implemented CRM (fairly recently in many cases) we had limited insight into sales activities, so a week of outage seems tolerable (again, depends on your industry). With regard to theft, these are often records about people, so legal and privacy requirements apply, not to mention that you may not want this data to show up at your direct competitor.
- Invoicing, order intake, reservation management: Very much depends on the industry, but for some industries a single hour of outage at the wrong moment can already mean bankruptcy. In this case, you probably want to have a hot-swappable system, preferably at two different “as a service” vendors.
- Project management: Depends; if you are a system integrator with penalty clauses or an innovator rushing towards a product launch, it may be critical.
- Bookkeeping: Depends (before end of month closing?).
I could go on and on, but I’m sure you get the point. For each service that you would consider moving into the cloud, you have to determine the importance, criticality and impact of disruptions (I am sure you do this all the time for all your services ;-)). This exercise may actually save you lots of money. Most services are not under-provisioned but over-provisioned. When in doubt, IT tends to move services to the more secure, more reliable, more failover-equipped platform. A famous example is the company that was running its internal employee entertainment Tour de France betting system on a hot-swappable, dual-everything, nonstop system.
Next, for each service you must determine what a reasonable recovery period is, and how to implement it. It could be simple source code escrow (with the right to keep using the code) and a failover contract with a nearby infrastructure provider. Or it may require having a fully up-to-date system image ready to provision within an hour. For other scenarios, you may be running two instances of your service or application in parallel at two separate service providers on different grids, different networks and in different jurisdictions. And for some you may not bother. It’s like insurance: most people insure their house against fire (as they could not overcome the financial impact if it burned down) but many do not insure their phones or cars against theft or damage (as they can afford to buy a new one if needed without going bankrupt, even though it may be “severely inconvenient”). There is also such a thing as being too cautious. I remember at my first employer, the bookkeeping department of the local plant would travel separately to the annual company outing (two by train and two by car), even though we had 12 factories located within a hundred miles, each with four bookkeepers. I am sure we would have closed the books somehow in case of a travel mishap.
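The exercise described above (estimate how long the business can live without each service, then pick a matching recovery approach) can be sketched in a few lines of Python. This is only an illustration: the `Service` class, the `recovery_approach` function, the hour thresholds and the example figures are all assumptions made up for this sketch, and what counts as tolerable will differ per industry.

```python
from dataclasses import dataclass

HOURS_PER_DAY = 24


@dataclass
class Service:
    name: str
    # Assumption: the business has estimated how long it can do without this service.
    tolerable_outage_hours: float


def recovery_approach(service: Service) -> str:
    """Map a tolerable outage to one of the recovery options discussed in the text.

    The thresholds are illustrative, not a recommendation.
    """
    hours = service.tolerable_outage_hours
    if hours <= 1:
        return "run in parallel at two separate providers"
    if hours <= HOURS_PER_DAY:
        return "keep an up-to-date system image ready to provision"
    if hours <= 7 * HOURS_PER_DAY:
        return "source code escrow plus a failover contract"
    return "accept the risk"


# Hypothetical portfolio; the hour values are invented for the example.
services = [
    Service("order intake", 1),
    Service("email", 3 * HOURS_PER_DAY),
    Service("CRM", 7 * HOURS_PER_DAY),
    Service("Tour de France betting pool", 30 * HOURS_PER_DAY),
]

for s in services:
    print(f"{s.name}: {recovery_approach(s)}")
```

The value is less in the code than in being forced to write down a number per service: once the tolerable outage is explicit, it becomes obvious which services are over-provisioned and which ones really need a plan B.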
Hopefully most of the services currently running in the cloud (CRM comes to mind) fall into the “severely inconvenient” category. If they are business critical, you hope the companies have a plan B that allows them to move these jobs quickly to another cloud if the need arises. To be able to do so easily, we will need two things: standards that enable more portability than we have today, and automation tools that allow us to do this “semi-auto-magically.” Our accountant friends may claim you also need certifications on both the primary and the backup vendors, but I am sure these will remain in the desk drawer when push comes to shove.
A final thought on assuring your services in the cloud. On the insurance front we see that many people do not insure their house against natural events such as earthquakes, first because it is often not possible or affordable, but also because, as my father used to say, “if heaven drops down, we will all be wearing a blue hat.” Imagine a video-on-demand provider being the only one still running after an earthquake: how much good would it do them? In other words, it is all about being pragmatic.
P.S. During my economics studies, at some point we had to decide whether to major in accounting or in IT. Guess what the more pragmatically inclined folks chose? 😉