We have the Amazon S3 APIs, Eucalyptus APIs, Rackspace Cloud Files APIs, Mezeo APIs, Nirvanix APIs, Simple Cloud API, along with the standards proposed by the Storage Networking Industry Association (SNIA) Cloud Storage Technical Work Group, and more.
So what should you do or think about all this? What impact do these Cloud Storage APIs have on your decision-making? Just how important are they, and what's next?
Here's some information to aid your understanding of this emerging and important technology. Let's begin by answering two basic questions:
What is a Cloud Storage Application Programming Interface (API)?
A Cloud Storage Application Programming Interface (API) is a method for accessing and using a cloud storage system. The most common of these are based on REST (REpresentational State Transfer), although there are others based on SOAP (Simple Object Access Protocol). All of them involve making requests for service over the Internet.
What is REST?
REST is a concept introduced in the doctoral dissertation of Roy Fielding, and is widely recognized as an approach to "quality" scalable API design. The actual API design and capabilities depend heavily on the capabilities of the underlying Cloud Storage system.
One of the most important REST properties is that it is a "stateless" architecture. This means that everything needed to complete the request to the storage cloud is contained in the request, so that a session between the requestor and the storage cloud is not required. Why is this important? The Internet has high latency: its response time is unpredictable, and it is generally not particularly fast when compared to a local area network (LAN). Once you get a request, there is no guarantee that you can ask a "qualifying question" of the requestor in a reasonable time period. So, REST is an approach that has very high affinity to the way the Internet works. Traditional file storage access methods that use NFS (Network File System) or CIFS (Common Internet File System) do not work well over the Internet, because of latency.
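As a minimal sketch of what "stateless" means in practice, the Python snippet below builds (but does not send) a single upload request. The endpoint, account name, secret, and the use of HTTP Basic authentication are all assumptions made for illustration; real storage clouds each define their own URLs and signing schemes.

```python
import base64
import urllib.request

# Hypothetical endpoint and credentials, used only for illustration.
ENDPOINT = "https://storage.example.com"
ACCOUNT = "demo-account"
SECRET = "demo-secret"


def build_put_request(container: str, name: str, data: bytes) -> urllib.request.Request:
    """Build one self-contained upload request.

    Nothing is remembered between calls: the target, the credentials,
    and the payload all travel inside this single request.
    """
    credentials = base64.b64encode(f"{ACCOUNT}:{SECRET}".encode()).decode()
    return urllib.request.Request(
        url=f"{ENDPOINT}/{container}/{name}",
        data=data,
        method="PUT",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/octet-stream",
        },
    )


# The request is only constructed here, never sent over the network.
request = build_put_request("photos", "beach.jpg", b"...image bytes...")
print(request.get_method(), request.full_url)
for header, value in request.header_items():
    print(f"{header}: {value}")
```

Because every request carries its own address, credentials, and data, the storage cloud never has to come back and ask a "qualifying question", and any server in the cloud can handle any request.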
One other thing we should clear up: Cloud Storage is for files, which some refer to as objects, and others call unstructured data. Think about the "files" stored on your PC, like pictures, spreadsheets and documents. These have an extraordinary variability, thus "unstructured". The other kind of data is "block" or "structured" data. Think database data: data that feeds transactional systems that require a certain "guaranteed" or low-latency performance. Cloud Storage is not for this use case. IDC estimates that approximately 70% of the machine stored data in the world is unstructured, and this is also the fastest growing data type.
So, Cloud Storage is storage for files that is easily accessed via the Internet. This does not mean you cannot access Cloud Storage on a private network or LAN, which may also provide access to a storage cloud by other approaches, like NFS or CIFS. It does mean that the primary and preferred access is by a REST API. (You will also see the terms RESTful, RESTlike, or RESTstyle, which are geekspeak for how closely the API conforms to the REST approach.)
Today, there are multiple definitions for Cloud Storage, and the one I prefer is "File Storage accessed through Web Services APIs over a network". This captures the key attributes that distinguish cloud storage from other types of file storage. Other key qualities of a storage cloud are:
- multi-tenant support (use by more than one unrelated user)
- geo location and geo replication
- seamless and real time provisioning of accounts
- availability of "practically" unlimited amounts of storage "on-demand"
- "pay for use", which means that your payment is for actual storage used, over some time frame, usually a month.
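To make the "pay for use" item concrete, here is a small worked sketch in Python. The billing rule (average gigabytes stored across the month, multiplied by a per-gigabyte monthly rate) and the rate itself are assumptions chosen for illustration; each provider defines its own metering and pricing.

```python
def monthly_charge(daily_gb_stored, rate_per_gb_month):
    """Charge for one month, assuming billing on average GB stored.

    daily_gb_stored: one usage sample per day of the month (hypothetical metering).
    rate_per_gb_month: hypothetical price per GB per month.
    """
    average_gb = sum(daily_gb_stored) / len(daily_gb_stored)
    return round(average_gb * rate_per_gb_month, 2)


# 100 GB for the first half of a 30-day month, 300 GB for the second half.
usage = [100] * 15 + [300] * 15
print(monthly_charge(usage, 0.15))
```

Under these assumptions the customer pays for an average of 200 GB, not for the 300 GB peak and not for any pre-purchased capacity.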
There are many who are still arguing about what I have defined above, but what I've said is generally accepted by the industry. If it is a vendor doing the arguing, I would suggest you check under their hood. Usually you will find that they do not offer whichever of the above features they are trying to argue out of the definition.
Also, traditional storage vendors continue to proclaim the importance of local network access (like NFS, CIFS or iSCSI) so that applications that today can only use the older protocols can reach Cloud Storage. This requires that the application making the request be on the same local network (think same data center) as the storage cloud. Their reason for this view is that they are only just beginning to see application demand for storage cloud access via REST APIs, versus their traditional business model, which serves an enterprise user with their own data center.
This is why Cloud Storage has generally emerged first as a service offering in the IT Service Provider space (also known as the Web Hosting industry). In this space, there is no doubting the importance and future of REST API access to storage clouds; it is only viewed as an adoption speed issue. Note that within the data center, access to storage using an HTTP based protocol is not necessarily any slower than one of the more traditional protocols. API access has been labeled as a slower form of access than NFS and CIFS. This view is largely due to the fact that it "may" be accessed over the Internet. In most cases, it is the network that adds the latency, not the means of access. Make no mistake, traditional storage vendors see this coming, and they will make offerings available in the near future.
REST APIs are language neutral and therefore can be leveraged, very easily, by developers using any development language they choose. Resources within the system may be acted on through a URL. So, an API is not a "programming language"; it is the way a programming language is used to access a storage cloud. This basic understanding of APIs is required to follow the dreaded "vendor lock in" and upcoming "cloud lock in" discussions and the issues that surround these assertions.
REST APIs are also about changing the state of a resource through representations of that resource. They are not about calling web service methods in a functional sense. The key differences between different Cloud Storage APIs are the URLs defining the resources and the format of the representations.
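The idea of acting on a resource through its URL can be sketched with a toy, in-memory stand-in for a storage cloud, written here in Python. The class name, URL paths, and choice of status codes are illustrative assumptions and do not describe any particular vendor's API.

```python
from typing import Dict, Optional, Tuple


class ToyStorageCloud:
    """In-memory stand-in for a storage cloud (illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def handle(self, method: str, path: str, body: Optional[bytes] = None) -> Tuple[int, bytes]:
        """Apply one HTTP-style request to the resource named by `path`."""
        if method == "PUT":
            # The body is a representation that becomes the resource's new state.
            created = path not in self._objects
            self._objects[path] = body or b""
            return (201 if created else 200), b""
        if method == "GET":
            # Return the current representation of the resource, if it exists.
            if path in self._objects:
                return 200, self._objects[path]
            return 404, b""
        if method == "DELETE":
            if self._objects.pop(path, None) is not None:
                return 204, b""
            return 404, b""
        return 405, b""  # method not supported by this toy


cloud = ToyStorageCloud()
print(cloud.handle("PUT", "/photos/beach.jpg", b"version 1"))
print(cloud.handle("GET", "/photos/beach.jpg"))
print(cloud.handle("DELETE", "/photos/beach.jpg"))
print(cloud.handle("GET", "/photos/beach.jpg"))
```

There is no `uploadFile()` or `removeFile()` method here: the same few verbs are applied to whatever the URL names. Two real Cloud Storage APIs could share this shape and still differ in how the paths are laid out and in the format of the bodies and metadata they exchange.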
The Cloud Storage space is very young and everyone has their opinions on how things should be represented and accessed. Efforts are underway by organizations like SNIA, with their Cloud Data Management Interface (CDMI), to standardize both the resource structure and the representations. However, standards are not developed overnight and customers are demanding programmatic access to Cloud Storage now.
Current Cloud Storage vendors have produced a basic set of APIs that are accomplishing fairly similar things, and other APIs that expose the underlying unique functionality of the Cloud Storage platform supplying the storage cloud. You should expect that, over time, most storage clouds will provide the basic functions in somewhat similar ways, and further that additional advanced functions will be adopted and expected to be in every storage cloud offering.
Finally, you should look for a taxonomy of APIs that includes basic file functions, advanced functions, Provisioning APIs, Billing APIs, and Management APIs. Storage clouds that become successful will offer all these capabilities, to increase the efficiency of their use.

Several efforts have been made to simplify the transition between vendors by providing an abstraction layer on top of the vendors' APIs. In this approach, a program library is created for use in the application that needs cloud storage access, and this library translates (for the given programming language) a single API into the API that is specific to a Cloud Storage offering. So, the application that uses this library writes its API calls once, and achieves portability between the storage clouds that the library supports.
This approach has been largely programming language specific and may take advantage of the language it was designed for. Good examples of this are jClouds, an open source cloud storage abstraction library written in Java, and Simple Cloud API, a collaboration of vendors including Microsoft, Rackspace, Nirvanix, IBM and Zend which provides a simplified Cloud Storage interface for PHP developers. While extremely useful for developers, these abstractions tend to expose the lowest common denominator relating to Cloud Storage functionality and may omit critical features, for example only providing namespace object access as opposed to ID access.
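The abstraction-layer idea can be sketched in a few lines of Python. This is not how jClouds or Simple Cloud API are implemented; the class names, the two vendors, and their URL layouts are all hypothetical and exist only to show the translation step.

```python
from abc import ABC, abstractmethod
from typing import Dict


class CloudStore(ABC):
    """The single API the application codes against."""

    @abstractmethod
    def put_request(self, container: str, name: str) -> Dict[str, str]:
        """Describe the vendor-specific request that uploads one object."""


class VendorAStore(CloudStore):
    # Hypothetical vendor that addresses objects by path.
    def put_request(self, container: str, name: str) -> Dict[str, str]:
        return {"method": "PUT", "url": f"https://a.example.com/{container}/{name}"}


class VendorBStore(CloudStore):
    # Hypothetical vendor that puts the container in the host name.
    def put_request(self, container: str, name: str) -> Dict[str, str]:
        return {"method": "PUT", "url": f"https://{container}.b.example.com/objects/{name}"}


def save_report(store: CloudStore) -> Dict[str, str]:
    # Application code: written once, unaware of which vendor is behind `store`.
    return store.put_request("reports", "q1.pdf")


print(save_report(VendorAStore()))
print(save_report(VendorBStore()))
```

Switching storage clouds means passing a different `CloudStore` to `save_report`. The trade-off described above is also visible: the shared interface can only expose operations that every supported vendor offers, so a feature such as access by object ID would have no place in it unless all backends could provide it.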
So, let's discuss lock-in, the term used to express concern that once a vendor has gotten you to exploit their architecture and technology, they will recognize that you are committed to them and cannot easily move away. As a result, they will then raise their prices and take advantage of your locked-in status, keeping their price just below the amount that would encourage conversion away from their technology and towards a more "open" set of capabilities. Let's look at all the "dreaded" examples that have surfaced around cloud storage as reasons to slow its adoption:
1. API lock in, which means your interaction with a storage cloud uses the APIs of that storage cloud, and suggests that you cannot easily move to another provider's cloud with its own, different APIs.
2. Vendor lock in, which means that, because you developed your application against specific APIs, you are condemned to use only a cloud from a specific supplier.
3. Device lock in, meaning that you developed a cloud storage based program utilizing the APIs of that specific cloud, for a specific device (generally a PDA) that has specific functionality. This is double lock in, both the device programming methodology and the API selection.
4. Browser lock in, meaning that programming to specific APIs can also be rendered unique based on the Web browser that is selected.
5. Programming language lock in, which means that you have written your API calls in a language like Python, or Java, or .NET, or whatever.
6. API wrapper lock in, which means that you incorporated libraries into your application that allow it to make generic API calls, which these libraries then translate into the correct API for the desired storage cloud (this is what Simple Cloud API is).
So, as you can see here, utilizing cloud storage could ultimately have you locked in on at least six levels!
With this much opportunity for vendor abuse, why are developers rushing to write Web based applications that utilize cloud storage services via API access? Are they simply uncontrolled, unthinking rebels who will shortly learn the error of their ways? Have they made a fatal error? Or do they know something you don't?
First, learn about Cloud Storage APIs. What they do is make storage programmable, and they abstract storage from the application. They offer advanced functionality (the programmable word) that makes it faster and easier to write the applications that are scalable versus the traditional storage access approaches. When you add these two capabilities to the storage cloud offering of low cost, availability in multiple locations, seamless provisioning, ease of adding additional storage, and the pay for use model, the case for the cloud has become compelling.
Where are we seeing early adoption? At service providers, because they host Web based applications and SaaS (usually Web based) applications, and this is where the developers who recognize the opportunity are focused.
What is coming? The introduction of this technology into the enterprise, complete with the adoption of the RESTful API technology. This will ultimately lead to a level of cooperation between service providers and the enterprise that has long been predicted. Enterprises will move to IT built on an OPEX model, and expect their applications to be provisioned on, and interacting with, service provider clouds via APIs. IT Service Providers are racing to build the clouds to provide for this emerging business opportunity.
So, what about the lock in mentioned above? Sit down with your developers, and they will show you why they don't feel "locked in". They will show you that you can quickly recraft your current API calls, in the programming language of your choice, to utilize the new APIs of the desired cloud. For this reason, Simple Cloud API will likely be a short-term measure. It precedes base case APIs that are extremely similar across vendors, and a market-led process to identify "best practice" APIs for both base case and advanced functions, as well as all the other API-led capabilities mentioned above. In short, vendor lock in is not the problem for this technology that it has been for others. Also, the ingenuity and resourcefulness of all the suppliers, standards groups, and market adoption scenarios will continue to mute your ability to be lock in free.
Your real challenge is not lock-in, but rather how to adopt this new set of capabilities, and solve problems and create opportunities with your IT solutions as rapidly as possible. Standing on the sidelines waiting for this one to resolve will keep you out of a great opportunity, because we still have several meaningful years of rapid change associated with this technology adoption cycle.




