Friday, December 16, 2011

Understanding Security in Cloud Storage

The Cloud
Cloud is the new black. With so much buzz around the cloud, it is hard to distinguish the parts that are meaningful and relevant to business customers. Cloud has become synonymous with anything that runs on the Web. Generally speaking, for an offering to be considered cloud, it must be available over the Internet and capable of supporting large numbers of users simultaneously without significant changes to its architecture.
The cloud promises radical simplifications and cost savings in IT. Leveraging their technical expertise and economies of scale, several technology powerhouses, including Amazon, Rackspace, Google and Microsoft, have deployed a wide array of cloud offerings.
When it comes to security, it is useful to differentiate among the different cloud systems: Software as a Service, cloud compute and cloud storage. Each system poses its own set of benefits and security issues.
Software as a service (SaaS), represented by applications like Salesforce.com, Google Docs, QuickBooks Online and others, involves full software applications that run as a service in the cloud. Tens of thousands of companies share the common infrastructure of Salesforce.com. These companies maintain control of sensitive customer information through a combination of secure credentials and secure connections to Salesforce.com. Companies that use Salesforce tolerate the risk of their data not being encrypted at Salesforce.com. Because SaaS runs in the cloud, the data from customers must be visible to the applications in the cloud (either not encrypted or decrypted by the SaaS code). The main benefit of SaaS is that it removes the complexity of configuring and maintaining software in-house. The success of Salesforce.com and others demonstrates that many companies have set aside security concerns in exchange for the sheer utility and cost savings of not having to run their software in-house.
Cloud compute allows customers to run their own applications in the cloud. Amazon's Elastic Compute Cloud, or EC2, represents this type of system. Customers upload their applications and data to the cloud, where the vast compute resources of EC2 can be applied to the data. Virtualization provides a practical vehicle to transfer compute environments and share physical compute resources in the cloud. This approach has been used successfully by financial institutions and the life sciences to run compute-heavy models. It is expensive to run data centers full of servers just to evaluate complex mathematical models, so the idea of sharing a compute infrastructure with other customers makes good economic sense. In a compute cloud the data can be anonymized; however, it cannot currently be encrypted. That is, it is possible to obfuscate the data in such a way that it is difficult for anyone to see what the data means; however, in order to have a computer in the cloud operate on a data set, with today's technology, the data set must be visible to that computer (i.e., not encrypted).
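One way to picture the distinction between anonymized and encrypted data is the sketch below. It is only an illustration, not a description of how any particular provider or customer does it: the record layout, the `pseudonymize` and `anonymize_records` helpers, and the key name are all hypothetical. Identifying fields are replaced with keyed tokens before upload, while the numeric values the cloud must compute on stay in the clear.

```python
import hashlib
import hmac
import secrets

# Hypothetical key, generated and kept on the customer's premises.
PSEUDONYM_KEY = secrets.token_bytes(32)

def pseudonymize(identifier: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Map an identifier to a stable, opaque token using HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def anonymize_records(records, key: bytes = PSEUDONYM_KEY):
    """Replace the 'customer' field with a token; numeric fields stay readable."""
    return [
        {"customer": pseudonymize(r["customer"], key), "balance": r["balance"]}
        for r in records
    ]

# Hypothetical sample data.
records = [
    {"customer": "Acme Corp", "balance": 1200.0},
    {"customer": "Acme Corp", "balance": 300.0},
    {"customer": "Globex", "balance": 50.0},
]

# Only this anonymized copy would be uploaded to the compute cloud.
anonymized = anonymize_records(records)

# The cloud can still group and sum by token without learning the names.
total = sum(r["balance"] for r in anonymized)
```

The same name always maps to the same token, so the cloud can still aggregate per customer, but the balances themselves remain visible to the machine doing the work. That is the limitation described above.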
Cloud storage allows customers to move the bulk of their data to the cloud. Microsoft's Windows Azure storage services and Amazon's Simple Storage Service (S3) are good examples.

Security Concerns in the Cloud
Armed with knowledge of the different types of cloud offerings (SaaS, compute and storage), we can now examine the major concerns that are keeping businesses from putting sensitive information in the cloud.
Data Leakage
Many businesses that would benefit significantly from using the cloud are holding back because of data leakage fears. The cloud is a multi-tenant environment, where resources are shared. It is also an outside party, with the potential to access a customer's data. Sharing hardware and placing data in the hands of a vendor seem, intuitively, to be risky. Whether accidental, or due to a malicious hacker attack, data leakage would be a major security violation.
While data leakage remains an unsolved issue in SaaS and cloud compute, encryption offers a sensible strategy to ensure data opacity in cloud storage. Data should be encrypted from the start so that the possibility of the cloud storage provider being somehow compromised poses no additional risk to the encrypted data.
With cloud storage, all data and metadata should be encrypted at the edge before it leaves your data center. The user of the storage system must be in control of not only the data, but also the keys used to secure that data. From a security perspective, this approach is essentially equivalent to keeping your data secured at your premises. It is never acceptable to encrypt data at an intermediary site before transmission to the cloud, as this allows the intermediary site to read the data. Furthermore, any encryption scheme must not rely on secrecy (other than that of the actual key), obscurity, or trust.
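The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: `cloud_bucket` is a hypothetical stand-in for a provider's object store, all function and key names are invented for the example, and because Python's standard library ships no AES, the cipher here is a stand-in built from HMAC-SHA256 purely to keep the sketch runnable. A real deployment would use a vetted authenticated cipher, such as AES-GCM from a maintained library.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream from HMAC-SHA256(nonce || counter). Illustration only."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        msg = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, msg, hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]

def encrypt_at_edge(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate; returns nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt_at_edge(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    """Verify the tag first, then decrypt; raises ValueError on tampering."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified or the wrong key was used")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

# Keys are generated and kept inside the customer's data center.
enc_key = secrets.token_bytes(32)
mac_key = secrets.token_bytes(32)
name_key = secrets.token_bytes(32)

cloud_bucket = {}  # hypothetical stand-in for the provider's storage

document = b"Q4 payroll: confidential"
# Metadata (here, the object name) is made opaque as well.
stored_name = hmac.new(name_key, b"payroll-q4.txt", hashlib.sha256).hexdigest()
cloud_bucket[stored_name] = encrypt_at_edge(enc_key, mac_key, document)

# Only the key holder can recover the original bytes.
recovered = decrypt_at_edge(enc_key, mac_key, cloud_bucket[stored_name])
```

Nothing in the scheme is secret except the keys: the algorithm is in plain view, and the provider only ever holds an opaque name and an opaque blob.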
Customer Identification
Cloud credentials identify customers to cloud providers. This identification is a key line of defense for SaaS.