Recently in Cloud Taxonomy Category

We define hybrid cloud storage as utilization of private cloud storage at an enterprise data center, or a private cloud hosted by an IT service provider with some combination of additional IT service provider-based public and/or private cloud storage.  

In a recent post, Cloud Storage for the Enterprise - Part 1:  The Private Cloud, we covered the definition and requirements of cloud storage as an enterprise solution, and as a technology deployed within enterprise-owned data centers (or at least within their co-location racks and cages).  Fundamentally, a private cloud is also a single-tenant cloud (i.e., used by only one entity or related parties within an enterprise or a public sector agency) that sits behind the firewall(s).  An additional solution that many enterprises are contemplating is the hybrid cloud, and we will look at the aspects of that solution in this post.

Before we begin our investigation of hybrid cloud, let's review some of the basics.  The following diagram reviews the differences between public and private clouds:

public_private_clouds.gif
Figure 1.   Comparison of public and private cloud

Many enterprises are beginning their cloud evaluation with a "private cloud."  I extend the definition of private cloud to be a "single tenant" cloud, as some enterprises may choose to use a single tenant cloud hosted at a service provider rather than hosting their cloud within their own data centers.  In the following diagram, we show two private clouds in two data centers, connected via policy-based replication.  This provides the assurance of backup and disaster recovery that many enterprises require.  A third location could easily be added for even higher levels of backup and disaster recovery.

pvate_cloud_entpse.gif
Figure 2.   Private cloud inside an enterprise.
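
To make the idea of policy-based replication between data centers a little more concrete, here is a minimal sketch in Python.  It assumes each data center is just an in-memory dictionary, and the names `Site`, `ReplicationPolicy` and `store_with_policy` are hypothetical, invented for illustration rather than taken from any product discussed here:

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A data center holding objects (in memory, for illustration only)."""
    name: str
    objects: dict = field(default_factory=dict)


@dataclass
class ReplicationPolicy:
    """Hypothetical policy: how many sites must hold a copy of each object."""
    copies: int


def store_with_policy(sites, policy, key, data):
    """Write the object to as many sites as the policy requires."""
    if policy.copies > len(sites):
        raise ValueError("policy asks for more copies than there are sites")
    targets = sites[:policy.copies]
    for site in targets:
        site.objects[key] = data
    return [site.name for site in targets]


sites = [Site("datacenter-east"), Site("datacenter-west"), Site("datacenter-central")]
placed = store_with_policy(sites, ReplicationPolicy(copies=2), "payroll.csv", b"rows")
```

Raising `copies` to 3 in this sketch corresponds to adding the third location mentioned above.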

The growth of storage is driving increased costs, and enterprises are continuously searching for more cost-effective ways to manage this growing data.  The primary difference between hybrid cloud and private cloud is the extension of service provider-oriented low cost cloud storage to the enterprise.  The service provider based cloud may be a private cloud (single tenant) or a public cloud (multi-tenant).  There are several implementations of hybrid cloud, and several examples are included.  The service provider cloud may enable enterprises to leverage the volume efficiencies of the service providers to realize additional savings.

A hybrid cloud provides a way of securely using service provider-based cloud storage in combination with enterprise clouds.  Another implementation could be use of single tenant service provider-based private clouds at multiple locations. 

Some examples of hybrid clouds are offered for your consideration, although not every potential approach is covered herein:

hybd_cloud.gif
Figure 3.  Hybrid cloud variation 1: private cloud inside an enterprise affiliated with a public cloud via a service provider.

hybd_cloud2.gif
Figure 4.  Hybrid cloud variation 2: private cloud inside an enterprise with affiliated private cloud via a service provider.


hybd_cloud3.gif
Figure 5.  Hybrid cloud variation 3: private clouds at a service provider with multiple clouds.

Since the primary motivation for hybrid cloud is economics, let's begin the discussion with an understanding of the economics of cloud storage and then extend that discussion to the hybrid cloud environment. 

The primary cost components of cloud storage include:

1.    Data center occupancy - leased (co-location) or owned and depreciated.
2.    Data center environmental - utilities, cooling, heating, etc.
3.    Storage hardware (leased expense or capital requirements & associated depreciation).
4.    File system and storage management (may be bundled in the storage hardware).
5.    Cloud enablement or platform (discrete or bundled with the storage system).
6.    Systems management and operational overhead.
7.    Backup and disaster recovery.

While it can be argued that the economics at a large scale enterprise are very similar to those at a service provider, listed below are some of the most common reasons enterprises do turn to service providers for their technology solutions:

1.    Capital conservation.
2.    Distraction associated with infrastructure management.
3.    Desire to outsource functions that are required but not associated with core competency (focus dilution).
4.    Poor history of infrastructure management.
5.    Specific issues, for example, out of data center space and not projecting long term needs to add additional data centers, or unable to expand existing data centers and no desire for an additional site.
6.    Redundancy of networks available in data centers that may not be available in the enterprise without assuming additional costs.

Whatever the reason, service providers can solve these problems.  In each of the three hybrid cloud scenarios, there are cost and security tradeoffs that each cloud use case must consider.  For example, in hybrid cloud variation #1, the economics can be quite appealing, but there are significant security concerns.  One approach to mitigating these concerns is to encrypt an object before it is replicated to a public cloud.

Understanding where key functionality is applied in your cloud stack is critical for successful implementation and highly dependent on the cloud and storage subsystem technology, cloud interoperability capabilities, and data use case.  Critical technologies that provide benefits are: de-duplication, compression, encryption for data at rest and data in motion, geo location, geo replication, tagging and search capabilities, and cloud access methods.  I will address underlying cloud technology requirements for the enterprise in my next post.

Cloud Use Case Definitions:

Data Archiving - Storing data for retention management requirements (such requirements may be internally generated, or associated with regulatory and compliance needs).  Archive data must be highly secure, highly reliable over the archive period, and easily searchable.  Archive data is generally encrypted, compressed and stored in a proprietary format.  Access to the data is usually very infrequent, and thus typical enterprises have leveraged slower access, cheaper tape media or redundant NAS to control costs.  Typical data issues associated with archiving are maintaining the archive and eliminating what is known as bit rot, where data becomes corrupt if it is stored on the same media for long periods of time and not accessed.
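
One common way to detect the kind of bit rot described above is to record a checksum when an object is archived and recompute it periodically.  Here is a minimal sketch, assuming an in-memory dictionary stands in for the archive; the function names `archive_object` and `verify_object` are hypothetical:

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def archive_object(archive: dict, name: str, data: bytes) -> None:
    """Store the object together with the checksum taken at archive time."""
    archive[name] = {"data": data, "sha256": checksum(data)}


def verify_object(archive: dict, name: str) -> bool:
    """Recompute the checksum and compare it with the recorded value."""
    entry = archive[name]
    return checksum(entry["data"]) == entry["sha256"]


archive = {}
archive_object(archive, "contract.pdf", b"signed contract bytes")
intact = verify_object(archive, "contract.pdf")

# Simulate corruption on the media by changing one byte of the stored copy.
archive["contract.pdf"]["data"] = b"signed contract bytez"
corruption_detected = not verify_object(archive, "contract.pdf")
```

A check like this only detects corruption; repairing it still depends on having a second good copy, which is where replication comes in.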

Data Backup - Storing data as a replacement copy in the event the original copy is somehow damaged or lost due to user error, system failure, or as a result of a disaster scenario.  Backup data may or may not need to be highly secure or easily searchable, but must be available for quick restore when needed.  This data is also generally encrypted, compressed and stored in a proprietary format.  Access to the data is more frequent than with archive data and can be at any level of the organization.  A single file, user, server, site, or the entire enterprise could potentially need to be restored to proper service, and backup data must support these highly variable access needs.

Data Access - Storing data in its original format for access by users or other applications.  This type of data is frequently accessed and is the superset of the data that comprise backup and archive data.  Access takes precedence over security: the data needs to be easily and quickly searchable and retrievable by users and applications, and thus highly available.  Typical issues with access data are the need for fast accessibility of frequently used data balanced against the overall cost associated with storing all the data.  Enterprises often implement tier strategies to stage data in progressively lower cost media based on frequency of access.
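
The tier strategy mentioned above can be sketched as a simple rule that maps access frequency to a storage tier.  This is only an illustration: the tier names and the read thresholds below are hypothetical values chosen for the example, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    min_reads_per_month: int  # hypothetical threshold for staying on this tier


# Ordered from fastest and most expensive to slowest and cheapest.
TIERS = [
    Tier("fast-disk", 100),
    Tier("capacity-disk", 5),
    Tier("archive", 0),
]


def choose_tier(reads_per_month: int) -> str:
    """Return the first (most expensive) tier whose threshold the object meets."""
    for tier in TIERS:
        if reads_per_month >= tier.min_reads_per_month:
            return tier.name
    return TIERS[-1].name


hot = choose_tier(250)
warm = choose_tier(12)
cold = choose_tier(0)
```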

hybd_cloud_eq.gif
 Figure 6. Hybrid enterprise use case cloud technology requirements.

Hybrid cloud storage, which we have loosely defined as utilization of private cloud storage at an enterprise data center, or a private cloud hosted by an IT service provider with some combination of additional IT service provider-based public and/or private cloud storage, offers an approach that allows use case, economics and security to prevail when selecting the appropriate approach.  Implementation will also be driven by the technological capabilities of the three building blocks of cloud storage: the cloud abstraction layer, the file/object system choice and the storage subsystem hardware.

So, our discussion of hybrid cloud storage has likely demonstrated at least one significant additional aspect, and that is complexity.  Starting with use case definition and security requirements, combined with a clear understanding of the unique issues within each enterprise that affect cost, you can map a clear path to the cloud technology and selection of one or more cloud service providers.  Finally, the trusted service provider continues to be another significant requirement for exploitation of hybrid cloud.

As the industry announcements on Cloud Storage APIs keep coming, the confusion surrounding what they mean keeps growing.

We have the Amazon S3 APIs, Eucalyptus APIs, Rackspace Cloud Files APIs, Mezeo APIs, Nirvanix APIs, Simple Cloud API, along with the standards proposed by the Storage Networking Industry Association (SNIA) Cloud Storage Technical Work Group, and more.

So what should you do or think about all this? What impact do these Cloud Storage APIs have on your decision-making? Just how important are they, and what's next?

Here's some information to aid your understanding of this emerging and important technology.  Let's begin by answering two basic questions: 

What is a Cloud Storage Application Programming Interface (API)?
    
A Cloud Storage Application Programming Interface (API) is a method for access to and utilization of a cloud storage system.  The most common of these are based on REST (REpresentational State Transfer), although there are others that are based on SOAP (Simple Object Access Protocol).  All of these are associated with establishing requests for service via the Internet.

What is REST? 
REST is a concept introduced in the doctoral dissertation of Roy Fielding, and is widely recognized as an approach to "quality" scalable API design.  The actual API design and capabilities are very dependent on the capabilities of the underlying Cloud Storage System.

One of the most important REST capabilities is that it is a "stateless" architecture.  This means that everything needed to complete the request to the storage cloud is contained in the request, so that a session between the requestor and the storage cloud is not required.  Why is this important?  The Internet is highly latent: it has an unpredictable response time, and it is generally not particularly fast when compared to a local area network (LAN).  Once you get a request, there is no guarantee that you can ask a "qualifying question" of the requestor in a reasonable time period.  So, REST is an approach that has very high affinity to the way the Internet works.  Traditional file storage access methods that use NFS (Network File System) or CIFS (Common Internet File System) do not work over the Internet, because of latency.
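
To illustrate what "everything needed is contained in the request" looks like in practice, here is a minimal sketch that builds (but does not send) a self-contained request using Python's standard library.  The host `storage.example.com`, the URL layout and the bearer token are assumptions made up for the example; real storage clouds each define their own URLs and authentication schemes.

```python
import urllib.request


def build_put_request(base_url, container, name, body, token):
    """Build a request that carries the target, credentials and payload together.

    Nothing here depends on an earlier request or an open session, which is
    the essence of a stateless interaction.
    """
    url = f"{base_url}/{container}/{name}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
            "Content-Type": "text/plain",
        },
    )


request = build_put_request(
    "https://storage.example.com/v1",  # hypothetical endpoint
    "reports",
    "q3.txt",
    b"quarterly numbers",
    "hypothetical-token",
)
```

Because the request is complete on its own, any server in the storage cloud could process it without knowing what the client did before.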

One other thing we should clear up:  Cloud Storage is for files, which some refer to as objects, and others call unstructured data.  Think about the "files" stored on your PC, like pictures, spreadsheets and documents.  These have an extraordinary variability, thus "unstructured".  The other kind of data is "block" or "structured" data.  Think database data: data that feeds transactional systems that require a certain "guaranteed" or low-latency performance.  Cloud Storage is not for this use case.  IDC estimates that approximately 70% of the machine stored data in the world is unstructured, and this is also the fastest growing data type.

So, Cloud Storage is storage for files that is easily accessed via the Internet.  This does not mean you cannot access Cloud Storage on a private network or LAN, which may also provide access to a storage cloud by other approaches, like NFS or CIFS.  It does mean that the primary and preferred access is by a REST API.  (Other terms you will see are RESTful, RESTlike and RESTstyle, which are geekspeak for how closely the API conforms to the REST approach.)

Today, there are multiple definitions for Cloud Storage, and the one I prefer is "File Storage accessed through Web Services APIs over a network".  This captures the key attributes that distinguish cloud storage from other types of file storage.  Other key qualities of a storage cloud are:

  • multi-tenant support (use by more than one unrelated user)
  • geo location and geo replication
  • seamless and real time provisioning of accounts
  • availability of "practically" unlimited amounts of storage "on-demand"
  • "pay for use", which means that your payment is for actual storage used, over some time frame, usually a month. 
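
As a worked example of the "pay for use" idea in the list above, here is a small sketch that bills on the average amount stored over a month.  The price per gigabyte and the usage figures are hypothetical, and real providers differ in how they meter usage.

```python
def monthly_charge(daily_gb, price_per_gb_month):
    """Charge for the average number of gigabytes stored across the month."""
    average_gb = sum(daily_gb) / len(daily_gb)
    return round(average_gb * price_per_gb_month, 2)


# Hypothetical usage: 100 GB for the first 15 days, 200 GB for the last 15.
usage = [100.0] * 15 + [200.0] * 15
charge = monthly_charge(usage, 0.10)  # hypothetical price per GB per month
```

Here the average is 150 GB, so the bill reflects what was actually stored rather than the peak capacity that was available.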

There are many who are still arguing about what I have defined above, but what I've said is generally accepted by the industry.  If it is a vendor doing the arguing, I would suggest you check under their hood.  Usually you will find that they do not offer whichever of the above features they are trying to argue out of the definition.

Also, traditional storage vendors continue to proclaim the importance of local network access (like NFS, CIFS or iSCSI) for the purpose of Cloud Storage access by applications that today can only access storage via the older protocols.  This requires that the application making the request be on the same local network (think same data center) as the storage cloud.  Their reason for this view is that they are only just beginning to see application demand for storage cloud access via REST APIs, versus their traditional business model, which serves an enterprise user with their own data center.

This is why Cloud Storage has generally emerged first as a service offering in the IT Service Provider (also known as the Web Hosting Industry) space.  In this space, there is no doubting the importance and future of REST API access to storage clouds; it is only viewed as an adoption speed issue.  Note that within the data center, access to storage using an HTTP-based protocol is not necessarily any slower than one of the more traditional protocols.  API access has been labeled as a slower form of access than NFS and CIFS.  This view is largely due to the fact that it "may" be accessed over the Internet.  In most cases, it is the network that adds the latency, not the means of access.  Make no mistake, traditional storage vendors see this coming, and they will make offerings available in the near future.

REST APIs are language neutral and therefore can be leveraged, very easily, by developers using any development language they choose.  Resources within the system may be acted on through a URL.  So, an API is not a "programming language"; it is the way a programming language is used to access a storage cloud.  This is part of the basic understanding of APIs that is required to follow the dreaded "vendor lock in" and upcoming "cloud lock in" discussions and to understand the issues that surround these assertions.

REST APIs are also about changing the state of a resource through representations of that resource.  They are not about calling web service methods in a functional sense.  The key differences between different Cloud Storage APIs are the URLs defining the resources and the format of the representations.
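
Here is a minimal sketch of that idea: a handful of standard HTTP verbs acting on a resource identified by a URL path, with the body of the request serving as the representation.  It uses an in-memory dictionary in place of a real storage cloud, and the `handle` function and the example path are hypothetical.

```python
def handle(store, method, path, body=None):
    """Apply one HTTP-style verb to the resource named by `path`.

    Returns a (status_code, body) pair, loosely following HTTP conventions.
    """
    if method == "PUT":
        created = path not in store
        store[path] = body  # the representation replaces the resource state
        return (201 if created else 200), None
    if method == "GET":
        if path in store:
            return 200, store[path]
        return 404, None
    if method == "DELETE":
        if path in store:
            del store[path]
            return 204, None
        return 404, None
    return 405, None  # verb not supported in this sketch


store = {}
created = handle(store, "PUT", "/photos/cat.jpg", b"version 1")
replaced = handle(store, "PUT", "/photos/cat.jpg", b"version 2")
fetched = handle(store, "GET", "/photos/cat.jpg")
deleted = handle(store, "DELETE", "/photos/cat.jpg")
missing = handle(store, "GET", "/photos/cat.jpg")
```

Note that there is no `uploadPhoto` or `removePhoto` method anywhere; the verbs stay the same and only the URL and the representation change.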
 
The Cloud Storage space is very young and everyone has their opinions on how things should be represented and accessed. Efforts are underway by organizations like SNIA, with their Cloud Data Management Interface (CDMI), to standardize both the resource structure and the representations. However, standards are not developed overnight and customers are demanding programmatic access to Cloud Storage now.

Current Cloud Storage vendors have produced a basic set of APIs that are accomplishing fairly similar things, and other APIs that expose the underlying unique functionality of the Cloud Storage platform supplying the storage cloud.  You should expect that, over time, most storage clouds will provide the basic functions in somewhat similar ways, and further that additional advanced functions will be adopted and expected to be in every storage cloud offering. 

Finally, you should look for a taxonomy of APIs that includes basic file functions, advanced functions, Provisioning APIs, Billing APIs, and Management APIs.  Storage clouds that become successful will offer all these capabilities, to increase the efficiency of their use.

mezeoapi.gif

 
Several efforts have been made to simplify the transition between vendors by providing an abstraction layer on top of the vendors' APIs.  In this approach, a program library is created for use in the application that needs cloud storage access, and this library translates (for the given programming language) a single API into the API that is specific to a Cloud Storage offering.  So, the application that uses this library writes its API calls once, and achieves portability between the storage clouds that are supported by this approach.
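
The shape of such an abstraction layer can be sketched as follows.  This is an illustration only, not how jClouds or Simple Cloud API are actually implemented: `CloudStore`, `VendorAStore` and `VendorBStore` are hypothetical names, and both "vendors" are simulated in memory.

```python
from abc import ABC, abstractmethod


class CloudStore(ABC):
    """The single interface the application codes against."""

    @abstractmethod
    def put(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...


class VendorAStore(CloudStore):
    """Hypothetical vendor that keys objects by a flat name."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


class VendorBStore(CloudStore):
    """Hypothetical vendor that keys objects by (container, name)."""

    def __init__(self, container):
        self._container = container
        self._blobs = {}

    def put(self, name, data):
        self._blobs[(self._container, name)] = data

    def get(self, name):
        return self._blobs[(self._container, name)]


def save_report(store: CloudStore, text: str) -> bytes:
    """Application code: written once, unaware of which vendor is behind it."""
    store.put("report.txt", text.encode("utf-8"))
    return store.get("report.txt")


from_a = save_report(VendorAStore(), "hello")
from_b = save_report(VendorBStore("documents"), "hello")
```

Moving from one vendor to the other means changing the object passed to `save_report`, not the application code itself.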

This approach has been largely programming language specific and may take advantage of the language it was designed for. Good examples of this are jClouds, an open source cloud storage abstraction library written in Java, and Simple Cloud API, a collaboration of vendors including Microsoft, Rackspace, Nirvanix, IBM and Zend which provides a simplified Cloud Storage interface for PHP developers. While extremely useful for developers, these abstractions tend to expose the lowest common denominator relating to Cloud Storage functionality and may omit critical features, for example only providing namespace object access as opposed to ID access.

So, let's discuss lock-in, the term used to express concern that once a vendor has gotten you to exploit their architecture and technology, they will recognize that you are committed to them and cannot easily move away.  As a result, they will then raise their prices and take advantage of your lock-in status, keeping their price just below the amount that would encourage conversion away from their technology and towards a more "open" set of capabilities.  Let's look at all the "dreaded" examples that have surfaced around cloud storage as reasons to slow its adoption:

1.    API lock in, which means your interaction with a storage cloud uses the APIs of that storage cloud, and suggests that you cannot easily move to another provider's cloud with their own, different APIs.

2.    Vendor lock in, which means that, because of your application development activity with specific APIs, you are condemned to use only a cloud from a specific supplier.

3.    Device lock in, meaning that you developed a cloud storage based program utilizing the APIs of that specific cloud, for a specific device (generally a PDA) that has specific functionality.  This is double lock in, both the device programming methodology and the API selection.

4.    Browser lock in, meaning that programming to specific APIs can also be rendered unique based on the Web browser that is selected.

5.    Programming language lock in, which means that you have written the API calls in a language like Python, or Java, or .NET, or whatever.

6.    API wrapper lock in, which means that you incorporated libraries into your application that allow your application to write generic APIs, which are then translated by these libraries to the correct API for the desired storage cloud (this is what Simple Cloud API is).

So, as you can see here, utilizing cloud storage could ultimately have you locked in on at least six levels! 

With this much opportunity for vendor abuse, why are developers rushing to write Web based applications that utilize cloud storage services via API access?  Are they simply uncontrolled, unthinking rebels who will shortly learn the error of their ways?  Have they made a fatal error?  Or do they know something you don't?

First, learn about Cloud Storage APIs.  What they do is make storage programmable, and they abstract storage from the application.  They offer advanced functionality (the programmable word) that makes it faster and easier to write the applications that are scalable versus the traditional storage access approaches.  When you add these two capabilities to the storage cloud offering of low cost, availability in multiple locations, seamless provisioning, ease of adding additional storage, and the pay for use model, the case for the cloud has become compelling.

Where are we seeing early adoption?  At service providers, because they host Web based applications and SaaS (usually Web based) applications, and this is where the developers who recognize the opportunity are focused.

What is coming: the introduction of this technology into the enterprise, complete with the adoption of the RESTful API technology.  This will ultimately lead to a level of cooperation between service providers and the enterprise that has long been predicted.  Enterprises will move to an IT modeled on an OPEX model, and expect their applications to be provisioned and interacting with service provider clouds, via APIs.  IT Service Providers are racing to build the clouds to provide for this emerging business opportunity.

So, what about the lock in mentioned above?  Sit down with your developers, and they will show you why they don't feel "locked in".  They will show you that you can quickly recraft your current API calls, in the programming language of your choice, to utilize the new APIs of the desired cloud.  For this reason, Simple Cloud API will likely be a short term measure, which precedes base case APIs that are extremely similar, and goes through a market led process to identify "best practice" APIs for both base case and advanced function, as well as all the other API led capabilities mentioned above.  In short, vendor lock in is not the problem for this technology that it has been for others.  Also, the ingenuity and resourcefulness of all the suppliers, standards groups, and market adoption scenarios will continue to mute the risk of lock in.

Your real challenge is not lock-in, but rather how to adopt this new set of capabilities, and solve problems and create opportunities with your IT solutions as rapidly as possible.  Standing on the sidelines waiting for this one to resolve will keep you out of a great opportunity, because we still have several meaningful years of rapid change associated with this technology adoption cycle.

My last post on REST generated some attention.  Since it is an important topic, I wanted to share some additional links for those who are trying to improve their understanding of REST:

- Stefan Tilkov's Intro to REST presentation and When is an API RESTful?
- Dare Obasanjo's Explaining REST to Damien Katz
- Paul Prescod's Second Generation Web Services, REST and the Real World, SOAP, REST and Interoperability
- Tim Bray's The Sun Cloud and REST, as in Take It Easy
- More stuff from Roger Costello
- Ryan Tomayko's How I Explained REST to My Wife
- Roy T. Fielding's Dissertation: Chapter 5

REST reflects the architecture of the Web.  One of its most important characteristics (and there are many) is that it is "stateless".  That means that a REST style command from a requestor to a responder has everything in it the responder needs to know in order to take an action.  No further handshaking is required.  Very efficient and Web like.  Since it is "stateless" it works very well with a "stateless" server architecture, in order to achieve Web scale.  In this way many clients can interact with many servers against a large pool of objects to accomplish many interactions, well, you get the point, Web scale.  That's one reason we use RESTful Web Services API commands to access the Mezeo Cloud Storage Platform servers, which are also stateless architectures, implemented via Linux.  Web scale, one of the requirements of cloud.

REST is also highly efficient, so that interactions between requestors and responders via a network can be done with a minimum of overhead.  If you ever download a 500 gigabyte file via a cable modem based internet connection, you will likely appreciate any efficiency that can be achieved.  Speaking of efficiency, REST also accommodates caching, at both the client and the server, which can dramatically improve the efficiency of your interactions with the "object" (an object could be, for example, a file, like a picture, or a pdf). 
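
The caching point above can be illustrated with a conditional fetch: the client remembers a validator (an "ETag") for each object and only downloads the body again when the object has changed.  The sketch below simulates both sides in memory; `ObjectServer` and `CachingClient` are hypothetical classes, and using a SHA-256 digest as the validator is an assumption made for the example.

```python
import hashlib


class ObjectServer:
    """Simulated storage server that supports conditional fetches."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name, if_none_match=None):
        data = self._objects[name]
        etag = hashlib.sha256(data).hexdigest()
        if if_none_match == etag:
            return 304, etag, None  # unchanged: no body is sent
        return 200, etag, data


class CachingClient:
    """Client that keeps the last validator and body for each object."""

    def __init__(self, server):
        self._server = server
        self._cache = {}
        self.bytes_transferred = 0

    def fetch(self, name):
        cached = self._cache.get(name)
        etag = cached[0] if cached else None
        status, new_etag, body = self._server.get(name, if_none_match=etag)
        if status == 304:
            return cached[1]  # serve the locally cached copy
        self.bytes_transferred += len(body)
        self._cache[name] = (new_etag, body)
        return body


server = ObjectServer()
server.put("photo.jpg", b"x" * 1000)
client = CachingClient(server)
first = client.fetch("photo.jpg")
second = client.fetch("photo.jpg")  # answered from cache after a 304
```

In this sketch the second fetch moves no object data across the "network", which is the kind of saving that matters on a slow connection.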

Developers who utilize RESTful Web Services APIs to create applications appreciate the efficiency and capability of the APIs.  Expect, over time, to see more commonality among base case APIs and other APIs that expose storage cloud specific advanced services.  For example, Mezeo based clouds offer secure sharing, collaboration, notifications, and nested files and folders.  Some clouds may have such a unique set of APIs that others will create translators (wrappers, for the IT guys in the audience) for them, and we will continue to make headway on openness.

RESTful APIs are a critical part of new application development, and represent the delivery of Service Oriented Architecture infrastructure for storage.  Storage is now programmable.  And I bet you thought cloud storage was just a utility computing model applied to storage, for scalability and pay for use.  Both are necessary, but not sufficient for cloud storage.

Most of us in the Cloud Storage industry strongly believe that a key capability of a storage cloud is the REST style Web Services API.  Many of the most popular storage cloud services include or exclusively use REST, including SoftLayer's CloudLayer, Amazon S3, Nirvanix SDN and Rackspace Cloud Files.

Other access methods that are most often associated with Cloud Storage access include CIFS, NFS and WebDAV.  NFS and CIFS are not particularly usable via an Internet connection and are therefore useless in public cloud offerings.  While WebDAV is very useful over an Internet connection, it is similarly limited, in that all three protocols support traditional file operations like store and retrieve, versus the robust set of services that Web Services APIs can deliver.

Amazon introduced S3 with REST style API access only.  Cloud Files from Rackspace also utilizes REST style APIs.  Nirvanix SDN utilizes both REST and SOAP APIs.  Mezeo offers REST APIs.  Various groups are also engaging on the issue of what representations of REST should be common across cloud offerings.  The SNIA (the Storage Networking Industry Association) has assembled a technical Cloud Storage working group for further refinement of REST style implementations for several purposes.

So, what is the purpose of the other, older access protocols?  When deployed with API based Cloud Storage offerings, they provide additional options for legacy applications to expose their objects (files) to the advanced services of the Cloud, and further make these files available to the new API based applications.

Why all the excitement about RESTful APIs?  Cloud Storage is more than a utility business model applied to traditional storage. It is storage that is accessed via Web Services APIs, over a network. Developers utilize these APIs because they are easy to use and they expose significant capabilities and services from the storage cloud, far beyond scalability, performance and pay for use.  As I have said before, scalability and pay for use are as much a business decision about how you sell storage, as they are a technology implementation of storage.  If there were no need for the API based services, the older and well used protocols would persevere.  This is clearly not the case.

I have carefully avoided the use of the word "standard" in association with the REST style or architecture.  Here is an interesting view on that topic from Roger Costello:

REST is not a standard. You will not see the W3C putting out a REST specification. You will not see IBM or Microsoft or Sun selling a REST developer's toolkit. Why? Because REST is just an architectural style. You can't bottle up that style. You can only understand it, and design your Web services in that style. (Analogous to the client-server architectural style. There is no client-server standard.)

Cloud storage service providers understand that a new storage infrastructure has emerged, as an embodiment of Service Oriented Architecture, with a set of services that are delivered via APIs.  Scalability, performance and pay for use are attributes of traditional and cloud storage solutions, but Web services APIs are the distinguishing feature of cloud storage.  Accessing storage via Web services APIs represents a revolutionary change in storage, not a simple generational change. REST APIs are the embodiment of the way the Web works and are necessary to expose storage as a "storage cloud"!

What should you expect in relation to these API issues?

Most of us expect that over time, there will be a base set of specifications that are jointly developed within the marketplace and by various industry organizations, resulting in a well accepted set of representations for REST style Web Services APIs.  At a panel at HostingCon earlier this week, both Emil Sayegh of Rackspace and I confirmed that when the industry gets further clarity on this specification, it will be relatively easy to introduce those APIs into our offerings, and that they can coexist with our current APIs.

REST is a topic that you will continue hearing more about.  You'll most certainly hear more about it from me in future posts.

The concept of Service Oriented Architecture (SOA) has been around for a long time, and some people believe it has not fulfilled its promise.  To the contrary, SOA is well on its way to fulfilling its promise and the rise of cloud computing infrastructure is an important step in this process.  In fact, cloud computing is already beginning to unleash the potential of SOA and much more is on the way.

David Linthicum, Editor-in-Chief of Sys-Con's Virtualization Journal, has it mostly right.  He says:

Let's get this straight: SOA is an architectural pattern, simply put the ability to create an architecture around the notion of many services that are bound together to create and re-create business solutions. Cloud computing is a set of enabling technologies as a potential target platform or technological approach for that architecture...One is the way of doing something, while the other is a potential outcome. SOA doesn't go away. It's not replaced. It's architecture. Cloud computing is a potential outcome of that architecture, thus cloud computing needs architecture, and vice versa.

David's rant was an argument against complaints by certain industry pundits that cloud computing is just an over-hyped reincarnation of SOA. 

I agree with David as far as he goes, but he can take his point further. He is correct to call SOA an architectural pattern.  He is correct to call cloud computing a "target platform."  But the real news in this story is that a target platform is exactly what SOA has been lacking all these years.  All applications must run somewhere; applications need infrastructure. 

SOA is an application architecture; cloud computing is an infrastructure architecture.  It's that simple.  This marriage is long overdue.

SOA applications inherently call upon Web services to request resources, so to run properly SOA applications need an infrastructure architecture that lends itself to SOA.  Cloud processing (dynamic allocation of CPU resources) and cloud storage (Web services API access to storage resources) infrastructure is the most natural target platform for SOA apps because cloud infrastructure is designed to scale in the way implied by the SOA approach to application architecture.

Until recently, where could a SOA app find a venue to stretch its legs?  There weren't many options until the earliest cloud computing service providers deployed large-scale cloud infrastructure.  The SOA world owes Amazon and Rackspace a big thanks for making the infrastructure investment required to launch S3, EC2, CloudSites, CloudFiles, and CloudServers.  As the rest of the hosting market--and the broader IT service provider industry--follows suit, SOA applications will flourish.

So David, you're right.  Not only do cloud computing and SOA "need each other," but together they will ultimately justify all the hype.

Definition: Cloud Storage

Back in March of 2006, Amazon introduced Simple Storage Service (S3) as file storage for their Elastic Compute Cloud (EC2) computing environment. Despite some shortcomings, the pricing flexibility and web scalability offered by S3 made it an instant hit with the software development community.

For the first time, a large pool of storage was available for use, with three significant attributes:  access via Web services APIs over a non-persistent network connection, immediate availability of very large quantities of storage, and pay-for-what-you-use pricing.
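
To make the first of these attributes concrete, here is a minimal Python sketch of what a Web services storage request over a non-persistent connection might look like.  The endpoint `https://storage.example.com`, the bucket and key names, and the `build_put_request` helper are all hypothetical and do not describe any particular vendor's API.  The sketch only builds the request object and never sends it.

```python
from urllib.request import Request

# Hypothetical endpoint, for illustration only; not a real service.
BASE_URL = "https://storage.example.com"


def build_put_request(bucket: str, key: str, data: bytes) -> Request:
    """Build (but do not send) a self-contained HTTP PUT that stores one object."""
    return Request(
        url=f"{BASE_URL}/{bucket}/{key}",
        data=data,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


# Each request names its own target and carries its own payload,
# so nothing has to stay connected or mounted between calls.
request = build_put_request("reports", "q1.csv", b"region,revenue\n")
print(request.get_method(), request.full_url)
```

Because every request is complete in itself, the client needs no long-lived session or attached volume, which is what distinguishes this style of access from a traditional block or file mount.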

For many years, the Internet has been represented as a "cloud".  The term has now been extended to include the web-scale computing capacity that is available via service offerings like Amazon EC2.  As a result, the term "cloud computing" was coined, and it includes file storage, referred to as "cloud storage".

As is the case with most nascent industries and technologies, there is no shortage of definitions.  Some are the result of vendors seeking ways to include their unique features in the discussion. Vendors offering online file sharing claim to have a cloud storage offering; others doing online backup stake their claim as cloud storage vendors; even companies offering enterprise clouds believe (at least in public) that they also have a cloud storage offering! Some readers will find these claims hard to accept and others will find them obvious, depending on the definition they use.  We seem to have as many definitions for cloud storage as there are vendors and users.  However, the marketplace appears to be coalescing around some basic requirements for a cloud storage definition.

In my opinion, notwithstanding the complexity of the technology, the concept of cloud storage is fairly simple and straightforward. Here is how I define it: cloud storage is storage accessed over a network (internal or external) via Web Services APIs.
 
What is interesting about this approach are the benefits it brings to the table.
By exposing storage through a Web Services API, cloud storage enables the application developer or user to connect to an abstracted layer of storage as opposed to the storage device directly. As you can imagine this simplifies integration and development, and facilitates the introduction of many desirable features and options.
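
One way to picture this abstraction is the short Python sketch below.  The `ObjectStore` interface, the `InMemoryStore` stand-in, and the `archive_report` function are hypothetical names chosen for illustration, and a real backend would issue Web services calls rather than write to a dictionary.  The point is only that the application addresses keys and bytes, never a device.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Hypothetical storage abstraction: callers see keys and bytes, not devices."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStore(ObjectStore):
    """Stand-in backend; a real one would issue Web services calls instead."""

    def __init__(self) -> None:
        self._objects = {}  # maps key -> stored bytes

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def archive_report(store: ObjectStore, name: str, body: bytes) -> str:
    """Application code depends only on the ObjectStore interface."""
    key = f"reports/{name}"
    store.put(key, body)
    return key


store = InMemoryStore()
key = archive_report(store, "q1.csv", b"region,revenue\n")
print(key, store.get(key))
```

Swapping `InMemoryStore` for a network-backed implementation would leave `archive_report` unchanged, which is the integration benefit described above.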

Certain elements of cloud storage functionality drive the ability to rapidly scale the amount of storage available to any user.  Other capabilities enable a storage provider to bill for used storage versus allocated and available storage.  It is worth noting that these two commonly accepted features of cloud storage service offerings are as much associated with the service provider's business model as they are with some of the technological capabilities of cloud storage.   For this reason, we do not include these features in our definition.
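
The difference between billing for used storage and billing for allocated storage is easy to see with a little arithmetic.  The Python sketch below uses made-up figures: the price of 15 cents per gigabyte per month and the two capacity numbers are assumptions for illustration, not any provider's actual rates.

```python
# Hypothetical figures, for illustration only.
PRICE_CENTS_PER_GB_MONTH = 15  # assumed price, in cents


def monthly_charge_cents(gigabytes: int) -> int:
    """Return the monthly charge in cents for the given number of gigabytes."""
    return gigabytes * PRICE_CENTS_PER_GB_MONTH


allocated_gb = 1000  # capacity set aside for the customer
used_gb = 120        # capacity the customer actually filled

allocated_bill = monthly_charge_cents(allocated_gb)
used_bill = monthly_charge_cents(used_gb)
print(f"billed on allocation: ${allocated_bill / 100:.2f}")
print(f"billed on usage:      ${used_bill / 100:.2f}")
```

Under these assumed numbers the customer pays for 120 GB rather than 1,000 GB, which shows why usage-based billing is as much a business-model choice as a technical capability.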

Depending upon how it is deployed, cloud storage can be as simple as a place to store and retrieve files (thus the use of "simple" in the Amazon offering), but it can also be designed to provide advanced functionality. These advanced capabilities, available within the storage layer of the infrastructure, will ultimately differentiate cloud storage offerings and drive their consideration for use, but the specific features do not need to drive our definition.  It is important to note that these capabilities have a much more important role than simply differentiating vendor offerings: they will accelerate new applications, make mashups easier, drive increased adoption of this approach to storage, and create new opportunities for the use of computing.  This is the classic, game-changing result of innovation in our industry.  It is exciting, and timely.
 
Cloud storage, like any other emerging technology, is experiencing growing pains. It is immature, it is fragmented, and it lacks standardization. Vendors are promoting their particular technology as the emerging standard. While a standard doesn't exist yet, we are confident that one will emerge soon. We believe that a set of Web Services API based capabilities, accessed via non-persistent connections on public and/or private networks, provides the fundamental frame of reference and definition for cloud storage.  The definition allows for both public service offerings and private (or enterprise) use, and provides a basis for expanding solutions and offerings rather than limiting them.

Thanks for giving your consideration and input to our definition.
