Recently in Cloud Ecosystem Category

Is OpenStack "Off the Rack"?

On July 19, 2010, Rackspace led the announcement of OpenStack, with a goal of creating an open source cloud software solution for use on industry-standard hardware.  The initial releases contemplate solutions for both cloud compute and object storage.  While these are the first two releases, they are separate offerings.  Remember, cloud storage is not just the storage target for cloud computing, it is one potential storage target for cloud computing, and is in and of itself a stand alone cloud offering of programmable storage.

Now, I have purposely used a term from the clothing industry, "off the rack", to spend a moment looking at a framework for evaluating the opportunities this may present.  With dress shirts, you can buy off the rack, semi custom, or custom, each with a unique value proposition based on fit, choice and cost.   Interestingly enough, this may be a good lens through which to consider the possibilities of OpenStack, and in particular, OpenStack Object Storage.

Rackspace has made no secret of its motivations for leading this initiative, and its desire to focus on "fanatical" service as its key differentiator versus the fundamental technology on which the service is based.  Fair enough, and so the question becomes, is the rapidly emerging and immature cloud marketplace already "mature" enough to seek homeostasis?  (Homeostasis is the property of a system, either open or closed, that regulates its internal environment and tends to maintain a stable, constant condition.)  Have enough models and innovations, from startups, academia, open source movements and large tech companies, been tested in the marketplace to the extent that we can already race to the common denominator?  Perhaps now is a good time to start, as long as you are willing to acknowledge that the desired results are a good ways off.

Before we jump off into "Off the Rack" software, a quick look back at open source is helpful.  For more reading on the open source software industry a good introduction is The Cathedral and the Bazaar. Six things are particularly interesting: 

  1. An open source alternative can emerge as a follow on to a successful commercial technology and can become pervasive versus the commercial offerings it succeeded (LINUX versus UNIX is the reference case here).
  2. A second result of this approach can also end up with a big success, although in more of a niche than as a pervasive replacement for the earlier commercial offerings (MySQL versus Oracle, IBM and Microsoft in the relational database space).
  3. An open source effort can also emerge earlier in a technology cycle and come of age as a pervasive solution (Apache Web Server comes to mind here).
  4. Open source generally requires very careful cultivation of the community of developers, with active interest by academia (and partnering with NASA is part of the formula here).  Commercially sponsored open source efforts are becoming more common, although this model has not yet proven to be the typical "breeding ground" for most great open source successes.  Eucalyptus, with its roots at the University of California, Santa Barbara, seems to be a more traditional route.
  5. Open source is not necessarily reflective of rapid commercial opportunities for success.  Eucalyptus is obviously beginning to maneuver towards a repeat of the commercialization model.  OpenStack is taking the approach most favored by other open source successes like Apache.  A couple of good reads here are this article from BusinessWeek and this. See also Derrick Harris' post over at GigaOm.
  6. There are also hundreds of thousands of open source projects that had mixed success or languished altogether. A quick look at SourceForge (an open source project hosting site) shows nearly a quarter million hosted projects. How many of these have languished or had little impact on the market?
So, the first issue is that there will exist for some time to come a real question as to the adoption potential of OpenStack.   I believe that adoption is driven by applicability to need.  In a moment we will address a serious issue which OpenStack Object Storage must overcome to be successful, at best, and at worst, will confine it to a niche market.  My views are very much directed at the Object Storage offering, versus the compute offering, which I believe exists in a different space and as a different type of solution.  With this backdrop, let's have a look at the cloud storage marketplace today, and use the analogy of off the rack, semi custom and custom:

  • Off the Rack:  implement as is, one size fits all, each with unique approaches for performance, scalability, bit integrity, may or may not provide geo services.
  • Semi Custom:  Select from storage types (DAS, SAN, NAS, JBOD), shared or distributed file systems and object systems, mix and match storage for different SLA and cost/usage patterns on the same infrastructure, multiple APIs, meta data and catalog abstracted from storage layer, geo services.
  • Custom:  Generally a service only offering and not available as deployable infrastructure, specifics will vary widely based on service provider offering strategy.

Infrastructure  Type          Comments
--------------  ------------  ----------------------------
Eucalyptus      Off the Rack  Limited S3 APIs
OpenStack       Off the Rack  CloudFiles APIs
Scality         Off the Rack  S3 APIs
Mezeo           Semi Custom   Mezeo ReST APIs and S3 APIs
NetApp          Off the Rack  Bycast APIs, NetApp storage
EMC Atmos       Off the Rack  Atmos ReST APIs, EMC storage

Service                                       Type          Comments
--------------------------------------------  ------------  --------------------------
Amazon S3                                     Custom        S3 APIs
Microsoft Azure                               Custom        Windows centric
Rackspace                                     Off the Rack  Is the basis for OpenStack
Nirvanix                                      Custom        SOAP APIs, multi node
Google                                        Custom        Offers S3 APIs
AT&T Synaptic                                 Off the Rack  Based on EMC Atmos
OpSource, SoftLayer, Layered Tech and others  Custom        Based on Mezeo

As you can see from the summary above, there exist as many views of what constitutes either a cloud storage service or a desirable cloud storage deployable infrastructure as there are service providers and vendors.  Note that a semi custom infrastructure results in a "custom" service as implemented.  "Off the rack" results in very similar services by those who utilize the same infrastructure unless they make their own major additions.  Any offering can be differentiated by service, and the degree and quality of service is critical to customer satisfaction and plays a strong role in value creation.

The OpenStack announcement as it regards Object Store and its approach to cloud storage seems to view cloud storage infrastructure as highly akin to an operating system (or at least a "hypervisor") and more similar to a selection of LINUX or Windows than that of an application or middleware layer.  While I agree that cloud compute is very close to this model, cloud storage is a service oriented architecture, with programmability for new applications that can tolerate Internet latency because of Web Services (like ReST APIs). The industry constantly overlooks this key point as it is consumed with the low cost, pay for use and thin provisioning capabilities of this storage tier.  Solutions for thin provisioning and low cost have been available far longer than cloud storage. Further, pay for use is more of a business decision than a technology. 

In the earliest days of cloud storage, there existed initial confusion that cloud storage was defined by cost, scalability, pay for use, and thin provisioning only and not programmable access (usually via ReST APIs).  ParaScale paid a huge price for not understanding that cloud storage requires Web services (like ReST API) access.  Now, with OpenStack Object Store, we see a follow on case of this same perspective, but with basic APIs for Put, Get and List.   Yes, it provides for Internet access via ReST APIs, but the focus continues to be primarily cost based versus new application enablement based.  It could be argued that the open source approach will provide for the appropriate additions of "advanced services" to be added.  However, even the use of the platform by NASA is more focused on cost of storage than on advanced functionality because NASA stores much more data than almost any institution or enterprise in the world.
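To make "basic APIs for Put, Get and List" concrete, here is a minimal sketch of what that level of functionality amounts to.  It is an in-memory stand-in that I made up for illustration; the class name, method names, paths and status codes are assumptions and are not the OpenStack or Cloud Files API.

```python
from typing import Dict, List


class TinyObjectStore:
    """In-memory stand-in for a storage cloud (hypothetical, not a vendor API)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, path: str, data: bytes) -> int:
        """Rough analogue of HTTP PUT /container/object; returns a status code."""
        self._objects[path] = data
        return 201

    def get(self, path: str) -> bytes:
        """Rough analogue of HTTP GET /container/object; returns the stored bytes."""
        return self._objects[path]

    def list_objects(self, container: str) -> List[str]:
        """Rough analogue of HTTP GET /container; returns sorted object paths."""
        prefix = container.rstrip("/") + "/"
        return sorted(p for p in self._objects if p.startswith(prefix))


store = TinyObjectStore()
store.put("/photos/cat.jpg", b"...jpeg bytes...")
store.put("/photos/dog.jpg", b"...more bytes...")
listing = store.list_objects("/photos")
```

Everything beyond these three calls (tagging, publishing to a URL, sharing, policy) is what I mean by "advanced services", and none of it is implied by the sketch.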

I think Savio Rodrigues states this view very well in his post:

"Select products based on business needs, not license alone: It's also interesting to note that very few enterprises are in NASA's position with regards to size of IT investment and skills in-house. While NASA engineers were ready and willing to contribute new features into the Eucalyptus open source community, few companies have the skills or governance to consider allowing their developers to contribute to open source projects.  Summary trend number 7 from the 2010 Eclipse survey results highlighted this issue.

"To suggest that NASA's buying or IT decision making patterns represents much more than the top 1 percent of IT buyers would be a stretch.

"The overwhelming majority of enterprises would rather pay a vendor to deliver, maintain, support and enhance their private cloud software infrastructure than place that burden on internal IT staff. Whether the enterprise is paying for a closed source commercial product, a commercial product based on an open core product, or a subscription to an open source product, the product selection decision will be made based on business requirements much broader than 'is the product open source or not?'"

Keep in mind that cloud storage is a stand alone service associated with application delivery over the Internet and also associated with low cost, pay for use, scalable storage resources.  Social media applications and many Web based applications exploit these capabilities; for example publishing a file to a URL and significant tagging of files.

This view of cloud storage as nothing more than cost and volume-based ignores its extraordinary importance as a service-oriented architecture for new application enablement.  I believe both views are equally important and need to be equally served.  Will OpenStack, with its pervasive cost focus, be able to drive its community to this additional view of needed contributions of advanced services for cloud storage?  Lydia Leong of Gartner Group provides an interesting view of the open source community issues associated with this in her post:

"At the same time, open sourcing is not necessarily a way to software success. Rackspace has a whole host of new challenges that it will have to meet. First, it must ensure that the roadmap of the new project aligns sufficiently with its own needs, since it has decided that it will use the project's public codebase for its own service. Second, it now has to manage and just as importantly, lead, an open-source community, getting useful commits from outside contributors and managing the commit process. (Rackspace and NASA have formed a board for governance of the project, on which they have multiple seats but are in the minority.) Third, as with all such things, there are potential code-quality issues, the impact of which become significantly magnified when running operations at massive scale."

One last comment on this business of vendor lock in and cloud storage APIs (another focus of the OpenStack announcement).  I would submit that while a specific set of APIs has the potential to create vendor lock in, this is a much smaller problem than what is experienced in other technologies.  If you are really worried about it, you probably have never actually written a ReST API call.  Such calls can be written in many languages, and we have seen cases where applications that run on S3 run unchanged on Mezeo.  Others need very minor modifications, and still others are excited to take advantage of some of the unique Mezeo services.  It just is not a problem, and this is much more related to FUD (fear, uncertainty and doubt) and marketing zealotry than it is associated with technological reality.  The APIs of choice will shake out, and it is far too early to say if it will be S3, OpenStack, CDMI or a combination of all of these, and others, as yet unforeseen.  (At Mezeo, we have never believed there will be one winner, and instead focused on architecture to enable easy and effective delivery of whichever APIs stand the test of time.)

The interesting view that seems to be missing here is that marketplace competition by service providers already serves to drive down the price of cloud storage, so a commoditized stack embraced by most is unlikely to yield extraordinary incremental savings.  At the same time, while the competitive market conspires to drive cloud storage costs ever lower, the need to differentiate, and to deliver solutions as well as programmable storage that enables multiple new and exciting types of applications, will rapidly replace the pure cost and scale focus of current cloud storage offerings.  Sometimes, the "new" application is simply an existing one enabled in the cloud, to produce the same result at a lower cost!  This requires significant cloud storage functionality in order to make this easy and productive.  Amazon continues to prove this with their many additions and capabilities which differentiate their service.  Mezeo sees much the same view on the part of our customers.  The focus is on what cloud storage can do, what problems it will solve, what business opportunities it creates, and what new applications it can enable, and all of these views assume it will be competitively priced.

Cloud storage represents significant opportunities for institutions, the enterprise (see my recent post on the business case for enterprise cloud storage) and for the IT service provider.  Cloud storage is substantially different from cloud compute, and requires that you understand this difference in order to effectively evaluate the impact of this announcement, as well as your next steps.
Cloud storage is already showing signs of Phase Two (see our post on the cloud storage maturity model), as a new set of solutions arrive in the marketplace.  These solutions are referred to as cloud gateways, on ramps, cloud clients, edge devices and other exotic names. 

For ease of discussion, let's use "cloud client" to describe a solution that is on a single user device (workstation, PDA, Tablet) and "cloud gateway" or just "gateway" for a solution that is delivered on a server or router for many users.  Whether they are a client or a gateway, some store a "blob" of data, and some store "chunks" of data that are parts of the original object.  Others store the actual object.  What's the difference and is it important? Should you consider it in your cloud gateway use plans?

What is a blob?  A blob can start as either a single object or a collection of objects, for example, all of the files on a single server, or a VM image.  Then, you do something to it in the client/gateway device that requires it to be brought back through the original client/gateway to be returned to a useful state.  Examples include de-duplication and compression followed by encryption prior to transmission of the object to the cloud (I call this D/C/E).  The result is a "blob" of data, an object that is minimized in size, and must be retrieved by the application that created it in order to be useful again. 

A chunk is part of an object, and the original object must be re-assembled by the gateway that parsed it in the first place. Some gateways store blobs.  Some store the object in chunks.  Finally, some store the actual object with its original file type, intact.  These may be workstation clients, or interface solutions that allow for a CIFS or iSCSI (today, TwinStrata is an example of the iSCSI capability) attached device to store in the cloud.  There are trade-offs and advantages associated with each approach, and your cloud storage use case and objective must be carefully analyzed in order to determine the applicability of the gateway to your business requirement.

Now, let's consider D/C/E.  This provides savings in addition to the savings associated with cloud storage.  D and C give you a smaller object size, so your bandwidth cost is lower, and your overall storage cost is lower.  When there is a change to the stored objects, chunks allow you to send only the changed part of the object, reducing bandwidth and potentially improving performance.  Encrypting, or chunking, or both, may improve security and relieve you of the costs and management associated with other security approaches.
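As a rough illustration of why chunks and compression save bandwidth, here is a minimal sketch.  It assumes fixed-size chunks compared by SHA-256 hash, which is my simplification and not how any particular gateway works; the function names and the tiny chunk size are made up for the example.  Encryption is left out because the Python standard library does not include a cipher.

```python
import hashlib
import zlib
from typing import List

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut an object into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> List[int]:
    """Indexes of chunks in `new` that differ from the same position in `old`."""
    old_hashes = [hashlib.sha256(c).digest() for c in split_chunks(old, size)]
    changed = []
    for i, chunk in enumerate(split_chunks(new, size)):
        if i >= len(old_hashes) or hashlib.sha256(chunk).digest() != old_hashes[i]:
            changed.append(i)
    return changed


old_version = b"AAAABBBBCCCCDDDD"
new_version = b"AAAABBBBXXXXDDDD"
to_send = changed_chunks(old_version, new_version)  # only the third chunk changed

# Compression (the "C" in D/C/E) shrinks repetitive data before it is sent.
payload = b"cloud storage " * 100
compressed = zlib.compress(payload)
```

Note that the compressed payload is only useful again after something decompresses it, which is exactly the "blob" trade-off: the savings come at the price of going back through the software that produced it.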

So, blobs and chunks sound pretty good, providing better security and lower costs.  What's the catch?  First, storage clouds are great places to provide anytime and anywhere access to your data, from multiple devices.  If you have to go back to a gateway to get the original version of the object, that flexibility may be very limited or non-existent.  Clouds are also a great place for sharing and collaboration, which is not in play if the object in the cloud is not in a useful form.  Finally, vendors are not giving gateway solutions away - we must ask what they cost, and are they worth it?

As usual, the answer is, it depends.  What services can I get from the cloud? And what services can I get from the gateway?

An example that is getting a lot of attention is file server replacement, or even better, file server displacement.  I get less excited about replacing a file server with another server that is a policy driven cache, because I still have this layer of technology in place.  However, if you can displace most of your file servers, then the potential for significant cost savings becomes obvious.

I tend to look at single user clients as very interesting on ramps to the cloud.  A client, using some modest amount of workstation storage as a cache, can deliver most of the benefits of a file server.  Companies like Gladinet, SMEStorage, GoodReader, Mezeo and others have very interesting cloud clients.  You will still need a few file servers if you need to provide a place for very large files.  Interestingly enough, those very large files are often rich media (like training videos), and streaming them to a reader on the client from the cloud is often good enough.  Another cloud client capability we expect to see will allow the end-user to store files and move them across multiple storage providers - from private to public and vice-versa, for example.  This functionality could also be in a server-based gateway.

Another cloud client capability might include giving encryption capability to the end users, and letting them decide if they want to encrypt the file themselves.   Or, use a cloud that provides user selectable encryption.  Give your end users or customers the power of choice, the freedom of access anytime and anywhere, the ability to get the amount of storage they need when they need it (what Gartner calls "reservationless", and kudos to them, great term).  Don't tie users to a "home base" gateway that does not store their object in its original format, or at least give them a choice.  All that being said, we are seeing that some mix of clients for file server displacement, and file server replacement gateways may ultimately be the appropriate solution.

Backup and archive is a different story, and here a gateway can make a lot of sense.  First, there is quite a bit of local housekeeping associated with these solutions, and the solution can decide if utilizing the cloud for some or all of the files makes sense. Speed of restore is a major consideration for a backup, and may drive local versus cloud based storage solutions.  Further, the need for a disaster recovery site, or to archive, can often be a cloud use case.  Companies like Zmanda and CommVault are very active in cloud based backup solutions.  What if you have applications that do not speak REST APIs, like a legacy backup solution?  There are gateways that can attach these legacy applications to the cloud, for example, TwinStrata.

Special purpose gateways can also solve an immediate problem.  Blue Thread offers a cloud storage interface for SharePoint.  The marketplace is rapidly developing a portfolio of cloud storage gateways and clients, as well as backup and archive solutions and all have their own unique perspective on cloud use.  Examples include StorSimple, Cirtas, Gladinet (who also makes clients), and EntropySoft.  Venture capital companies are deploying significant capital for these sorts of solutions.  Each of these solution providers sees a clear path to adding significant value to cloud storage solution delivery.

Cloud storage requires significant use case consideration to evaluate the functionality required, both in the cloud and in the gateway or client, and where the application or user can best exploit the functionality.  After all, cloud storage is also about empowering the end user with the storage they need, when they need it, at a favorable price, and providing advanced functionality, like publishing and sharing.

At Mezeo, we have both a deployable cloud infrastructure, and clients.  That causes us to look at where the best place to put the functionality is.  That creates a slightly different perspective, and we think it creates very useful products.  On the other hand, nothing gets us more excited than the thought of more solutions that drive cloud storage adoption and usefulness.  For this reason, we are rolling out a new marketing and certification program, Mezeo Ready.

With Mezeo Ready™, service provider public storage clouds can easily identify their offering as being "Ready" for use by Mezeo Ready clients or gateways, and backup and archive solutions.  Users of these products can pick one of many trusted service providers hosting Mezeo Ready cloud storage solutions.  This cloud storage on ramp and cloud storage provider "ecosystem" ultimately delivers valuable solutions to customers and is a big part of Mezeo's vision for the cloud storage market.

So, more to come on Mezeo Ready, we are nearing the official announcement of the program, and will extend it to storage providers and file system providers who work with Mezeo to deliver storage clouds, both private and public.  Other solutions, like billing and provisioning systems will also be in the Mezeo Ready™ program.  The changes the cloud is delivering are new and useful, and deliver real value to the institutions and businesses that are embracing them.  The ecosystem is critical to the value delivery chain, and key to providing unique, desirable solutions.

Here's an interesting read on some of the issues that traditional file systems face which can now be overcome with an object-based system. 

According to the author, Beth Pariseau:

Unstructured data is expected to far outpace the growth of structured data over the next three years. According to the "IDC Enterprise Disk Storage Consumption Model" report released last fall, while transactional data is projected to grow at a compound annual growth rate (CAGR) of 21.8%, it's far outpaced by a 61.7% CAGR predicted for unstructured data.

This is a direct result of the digital content explosion.

Robin Harris, senior analyst at StorageMojo observes:

"There are going to be extreme amounts of data as things like digital video and mobile networks grow; in five years, pretty much every phone will be 'smart,'...All of us storage geeks agree on that, and different people are beginning to visualize what that kind of growth needs in terms of storage infrastructure."
The article makes the case to "Think APIs, not files."  In essence, the point is as follows (as explained by Harris):

"File systems make less sense over time as the amount of data grows. Architecturally, it makes more sense for each file to have a unique 128-bit ID and use an Internet-like system for locating that file; a URL points to an address and there are files at that address, and object-based storage interfaces are essentially operating on the same principle."
The result, writes Pariseau, is that "with an object ID replacing a file name, more extensive data can accompany an object than the simple 'created,' 'modified' or 'saved on' fields available in traditional file systems. Thus, detailed policies can be applied to objects for more efficient and automated management. Without NFS or CIFS to serve up files to applications, object-based storage systems need to replace that layer of abstraction between raw blocks of data on disk and files that applications can recognize. Today's object-based systems use standard APIs such as Representational State Transfer (REST) and Simple Object Access Protocol (SOAP), or proprietary APIs to tell applications how to store and retrieve object IDs."
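The idea in these quotes can be sketched in a few lines: objects are addressed by a 128-bit ID instead of a file name, and carry arbitrary metadata that a policy can act on.  This is only an illustration under my own assumptions; the class names, metadata keys and the "retention" policy are hypothetical and do not describe Mezeo's or anyone else's implementation.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredObject:
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class ObjectCatalog:
    """Maps object IDs to objects; there is no directory tree or file name."""

    def __init__(self) -> None:
        self._by_id: Dict[uuid.UUID, StoredObject] = {}

    def store(self, data: bytes, **metadata: str) -> uuid.UUID:
        object_id = uuid.uuid4()  # a UUID is a 128-bit identifier
        self._by_id[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def fetch(self, object_id: uuid.UUID) -> StoredObject:
        return self._by_id[object_id]

    def matching(self, key: str, value: str) -> List[uuid.UUID]:
        """IDs of objects whose metadata has key == value (a crude policy query)."""
        return [oid for oid, obj in self._by_id.items()
                if obj.metadata.get(key) == value]


catalog = ObjectCatalog()
video_id = catalog.store(b"...", content_type="video/mp4", retention="7y")
memo_id = catalog.store(b"...", content_type="text/plain", retention="1y")
long_term = catalog.matching("retention", "7y")
```

A query like the last line is the kind of hook that "detailed policies" can hang on, for example moving everything with a long retention period to a cheaper storage tier.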

One of our key decisions when we designed Mezeo was the adoption of object-based architecture for cloud storage.  Mezeo can use traditional file systems as object based systems to deliver cloud storage, and can also expose cloud storage as a traditional file system (even though it has objects underneath the covers, or as an object system).  This reflects our view that there will be a prolonged period of co-existence followed by a migration to object based systems.

If you'd like to learn more about how Mezeo offers an agnostic storage services platform for storage service providers (SSP), take a look at this paper (registration required) by the same Robin Harris: Building a scalable shared file infrastructure. The paper gives service providers an introduction to:

  • Cloud storage applications and customer drivers
  • Mezeo's storage architecture and options
  • Basic shared file storage reference designs
In the paper, Harris says that there are multiple ways to build highly scalable storage for cloud storage applications. He tells us how SSPs can differentiate their offerings:

The Mezeo platform allows the special features of the storage to be delivered to customers, while giving SSPs a powerful platform on which to build a business. Understanding what storage choices will better meet target market needs is a critical success factor. SSPs can differentiate their cloud services by careful selection of back end storage systems. The Mezeo platform gives SSPs great flexibility. Understanding how to use that flexibility will be key to growing a successful cloud storage service business.
Harris also presents five reference configurations (see diagrams below) in the paper, which vary in performance, availability, scalability, self-management and, of course, cost.

CONFIGURATION # 1: NEXENTA

config1_nexenta.gif



CONFIGURATION # 2: PERMABIT


config2_permabit.gif

CONFIGURATION # 3: PARASCALE

config3_parascale.gif


CONFIGURATION # 4: Red Hat Enterprise Linux

config4_Red-Hat-Enterprise-.gif


CONFIGURATION # 5: NetApp

config5_NetApp.gif


DOWNLOAD:

Robin Harris' Building a scalable shared file infrastructure >>
A lot has been written about the reluctance of many to use the cloud for their mission critical applications, and in particular, the enterprise.  While this may be a popular topic from the perspective of many, the Cloud is most certainly seeing a significant increase in adoption as more and more companies build their SaaS offerings on platforms from Amazon, Google, Force.com and Microsoft.

Platform as a service (PaaS) is defined in Wikipedia as "the delivery of a computing platform and solution stack as a service. It facilitates deployment of applications without the cost and complexity of buying and managing the underlying hardware and software layers, providing all of the facilities required to support the complete life cycle of building and delivering web applications and services entirely available from the Internet--with no software downloads or installation for developers, IT managers or end-users. It's also known as cloudware."

In general, PaaS offerings include workflow facilities for application design, application development, testing, deployment and hosting as well as application services such as team collaboration, web service integration and marshalling, database integration, security, scalability, storage, persistence, state management, application versioning, application instrumentation and developer community facilitation. These services are provisioned as an integrated solution over the web.

We just saw another Cloud validation as three established ISVs announced offerings on platforms from PaaS providers.
Both BMC Software and CA announced their intent to offer apps built on Force.com next year. Quest Software also announced the launch of its first set of Software as a Service (SaaS) Windows management solutions on Microsoft Azure.

Note also the following examples of SaaS services built on AWS, Google AppEngine and Force.com.  This "explosion of entrepreneurship" furthers the case that platform-as-a-service is rapidly gaining acceptance in the market.

cloudups.gif

What we are witnessing is a boom in platform-based businesses, made possible by the cloud model: pay-per-use, instant scalability, and the elimination of up-front capex costs.
As the industry announcements on Cloud Storage APIs keep coming, the confusion surrounding what they mean keeps growing.

We have the Amazon S3 APIs, Eucalyptus APIs, Rackspace Cloud Files APIs, Mezeo APIs, Nirvanix APIs, Simple Cloud API, along with the standards proposed by the Storage Networking Industry Association (SNIA) Cloud Storage Technical Work Group, and more.

So what should you do or think about all this? What impact do these Cloud Storage APIs have on your decision-making? Just how important are they, and what's next?

Here's some information to aid your understanding of this emerging and important technology.  Let's begin by answering two basic questions: 

What is a Cloud Storage Application Programming Interface (API)?
    
A Cloud Storage Application Programming Interface (API) is a method for access to and utilization of a cloud storage system.  The most common of these are REST (REpresentational State Transfer) APIs, although there are others that are based on SOAP (Simple Object Access Protocol).  All of these are associated with establishing requests for service via the Internet.

What is REST? 
REST is a concept introduced in the doctoral dissertation of Roy Fielding, and is widely recognized as an approach to "quality" scalable API design.  The actual API design and capabilities are very dependent on the actual capabilities of the underlying Cloud Storage System.

One of the most important REST capabilities is that it is a "stateless" architecture.  This means that everything needed to complete the request to the storage cloud is contained in the request, so that a session between the requestor and the storage cloud is not required.  Why is this important?  The Internet is highly latent: it has an unpredictable response time and it is generally not particularly fast when compared to a local area network (LAN).  Once you get a request, there is no guarantee that you can ask a "qualifying question" of the requestor in a reasonable time period.  So, REST is an approach that has very high affinity to the way the Internet works.  Traditional file storage access methods that use NFS (Network File System) or CIFS (Common Internet File System) do not work over the Internet, because of latency.
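Here is a minimal sketch of what "everything needed is contained in the request" looks like in practice.  Each request carries the method, the path, the body and a signature, so the handler never has to remember a login or ask a follow-up question.  The request layout, the signing scheme and the account table are all assumptions I made for the example; real storage clouds each define their own.

```python
import hashlib
import hmac
from typing import Dict, Tuple

SECRETS = {"alice": b"alice-secret"}  # hypothetical account table
OBJECTS: Dict[str, bytes] = {}        # stored data, not session state


def sign(secret: bytes, method: str, path: str, body: bytes) -> str:
    """HMAC-SHA256 over the parts of the request, as a hex string."""
    message = method.encode() + b"\n" + path.encode() + b"\n" + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def make_request(user: str, secret: bytes, method: str, path: str,
                 body: bytes = b"") -> Dict:
    """Build a self-contained request: who, what, where, and proof of identity."""
    return {"user": user, "method": method, "path": path, "body": body,
            "signature": sign(secret, method, path, body)}


def handle(request: Dict) -> Tuple[int, bytes]:
    """Serve one request using only what the request itself carries."""
    secret = SECRETS.get(request["user"])
    if secret is None:
        return 401, b""
    expected = sign(secret, request["method"], request["path"], request["body"])
    if not hmac.compare_digest(expected, request["signature"]):
        return 401, b""
    if request["method"] == "PUT":
        OBJECTS[request["path"]] = request["body"]
        return 201, b""
    if request["method"] == "GET":
        if request["path"] not in OBJECTS:
            return 404, b""
        return 200, OBJECTS[request["path"]]
    return 405, b""


put_status, _ = handle(make_request("alice", b"alice-secret", "PUT", "/docs/a.txt", b"hello"))
get_status, get_body = handle(make_request("alice", b"alice-secret", "GET", "/docs/a.txt"))
bad_status, _ = handle(make_request("alice", b"wrong-secret", "GET", "/docs/a.txt"))
```

Because no session exists, any request can be retried or routed to a different server without the requestor and the storage cloud having to re-establish anything first.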

One other thing we should clear up:  Cloud Storage is for files, which some refer to as objects, and others call unstructured data.  Think about the "files" stored on your PC, like pictures, spreadsheets and documents.  These have an extraordinary variability, thus "unstructured".  The other kind of data is "block" or "structured" data.  Think database data, data that feeds transactional systems that require a certain "guaranteed" or low-latency performance.  Cloud Storage is not for this use case.  IDC estimates that approximately 70% of the machine stored data in the world is unstructured, and this is also the fastest growing data type.

So, Cloud Storage is storage for files that is easily accessed via the Internet.  This does not mean you cannot access Cloud Storage on a private network or LAN, which may also provide access to a storage cloud by other approaches, like NFS or CIFS.  It does mean that the primary and preferred access is by a REST API.  (There are other terms you will see, such as RESTful, RESTlike or RESTstyle, which are geekspeak for how closely the API conforms to the REST approach.)

Today, there are multiple definitions for Cloud Storage, and the one I prefer is "File Storage accessed through Web Services APIs over a network".  This captures the key attributes that distinguish cloud storage from other types of file storage.  Other key qualities of a storage cloud are:

  • multi-tenant support (use by more than one unrelated user)
  • geo location and geo replication
  • seamless and real time provisioning of accounts
  • availability of "practically" unlimited amounts of storage "on-demand"
  • "pay for use", which means that your payment is for actual storage used, over some time frame, usually a month. 
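As a worked example of "pay for use", here is a small Python sketch that bills on the average amount stored over a month.  The price per GB per month and the daily usage figures are invented for illustration, and averaging daily snapshots is only one plausible way a provider might meter usage.

```python
def monthly_charge(daily_gb_stored, price_per_gb_month):
    """Charge for the average amount stored over the billing period."""
    average_gb = sum(daily_gb_stored) / len(daily_gb_stored)
    return round(average_gb * price_per_gb_month, 2)


# Hypothetical month: 100 GB stored for 15 days, then 200 GB for 15 days.
usage = [100] * 15 + [200] * 15
charge = monthly_charge(usage, 0.10)  # assumed price of $0.10 per GB per month
print(charge)
```

The customer is billed for an average of 150 GB, not for the 200 GB peak and not for capacity that was provisioned but never used.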

There are many who are still arguing about what I have defined above, but what I've said is generally accepted by the industry.  If it is a vendor doing the arguing, I would suggest you check under their hood.  Usually you will find that they do not offer whichever of the above features they are trying to argue out of the definition.

Also, traditional storage vendors continue to proclaim the importance of local network access (like NFS, CIFS or iSCSI) to Cloud Storage, for applications that today can only use the older protocols.  This requires that the application making the request be on the same local network (think same data center) as the storage cloud.  Their reason for this view is that they are only just beginning to see application demand for storage cloud access via REST APIs, versus their traditional business model, which serves an enterprise user with their own data center.

This is why Cloud Storage has generally emerged as a service offering in the IT Service Provider (also known as the Web hosting industry) space first.  In this space, there is no doubting the importance and future of REST API access to storage clouds; it is only viewed as an adoption speed issue.  Note that within the data center, access to storage using an HTTP based protocol is not necessarily any slower than one of the more traditional protocols.  API access has been labeled as a slower form of access than NFS and CIFS.  This view is largely due to the fact that it "may" be accessed over the Internet.  In most cases, it is the network that adds the latency, not the means of access.  Make no mistake, traditional storage vendors see this coming, and they will make offerings available in the near future.

REST APIs are language neutral and therefore can be leveraged, very easily, by developers using any development language they choose.  Resources within the system may be acted on through a URL.  So, an API is not a "programming language"; it is the way a programming language is used to access a storage cloud.  This is part of the basic understanding of APIs that is required to discuss the dreaded "vendor lock in" and upcoming "cloud lock in" assertions, and to understand the issues that surround them.

REST APIs are also about changing the state of a resource through representations of that resource. They are not about calling web service methods in a functional sense. The key differences between different Cloud Storage APIs are the URLs defining the resources and the format of the representations.
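Here is a small Python sketch of that difference.  Two imaginary vendors accept the same metadata update, but one addresses the file by account, container and name and expects JSON, while the other addresses it by ID and expects XML.  Both URL layouts and both formats are invented for illustration and do not describe any real vendor's API.

```python
import json
import xml.etree.ElementTree as ET

metadata = {"name": "report.doc", "tags": "finance"}


def vendor_a_update(account, container, obj, meta):
    """Hypothetical vendor A: name-based URL, JSON representation."""
    url = f"/v1/{account}/{container}/{obj}"
    body = json.dumps(meta, sort_keys=True)
    return "PUT", url, "application/json", body


def vendor_b_update(file_id, meta):
    """Hypothetical vendor B: ID-based URL, XML representation."""
    url = f"/files/{file_id}"
    root = ET.Element("file")
    for key in sorted(meta):
        ET.SubElement(root, key).text = meta[key]
    body = ET.tostring(root, encoding="unicode")
    return "PUT", url, "application/xml", body


request_a = vendor_a_update("acme", "docs", "report.doc", metadata)
request_b = vendor_b_update("12345", metadata)
for method, url, content_type, body in (request_a, request_b):
    print(method, url, content_type, body)
```

In both cases the verb is the same and the new state of the resource travels in the body.  Only the URL and the representation differ, which is why moving between clouds is mostly a matter of rewriting this thin layer.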
 
The Cloud Storage space is very young and everyone has their opinions on how things should be represented and accessed. Efforts are underway by organizations like SNIA, with their Cloud Data Management Interface (CDMI), to standardize both the resource structure and the representations. However, standards are not developed overnight and customers are demanding programmatic access to Cloud Storage now.

Current Cloud Storage vendors have produced a basic set of APIs that are accomplishing fairly similar things, and other APIs that expose the underlying unique functionality of the Cloud Storage platform supplying the storage cloud.  You should expect that, over time, most storage clouds will provide the basic functions in somewhat similar ways, and further that additional advanced functions will be adopted and expected to be in every storage cloud offering. 

Finally, you should look for a taxonomy of APIs that includes basic file functions, advanced functions, Provisioning APIs, Billing APIs, and Management APIs.  Storage clouds that become successful will offer all these capabilities, to increase the efficiency of their use.

mezeoapi.gif

 
Several efforts have been made to simplify the transition between vendors by providing an abstraction layer on top of the vendors' APIs.  In this approach, a program library is created for use in the application that needs cloud storage access, and this library translates (for the given programming language) a single API into the API that is specific to a Cloud Storage offering.  So, the developers of an application that uses this library write their storage calls once, and achieve portability between the storage clouds that the library supports.

This approach has been largely programming language specific and may take advantage of the language it was designed for. Good examples of this are jClouds, an open source cloud storage abstraction library written in Java, and Simple Cloud API, a collaboration of vendors including Microsoft, Rackspace, Nirvanix, IBM and Zend which provides a simplified Cloud Storage interface for PHP developers. While extremely useful for developers, these abstractions tend to expose the lowest common denominator relating to Cloud Storage functionality and may omit critical features, for example only providing namespace object access as opposed to ID access.
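The abstraction layer idea can be sketched in a few lines of Python.  Everything here is hypothetical: the two "vendor" clients are in-memory stand-ins with made-up method names, not the API of jClouds, Simple Cloud API or any real storage cloud.

```python
from abc import ABC, abstractmethod


class VendorAClient:
    """Hypothetical vendor SDK that addresses objects by container and name."""

    def __init__(self):
        self._objects = {}

    def upload(self, container, name, data):
        self._objects[(container, name)] = data

    def download(self, container, name):
        return self._objects[(container, name)]


class VendorBClient:
    """Hypothetical vendor SDK that addresses objects by an opaque ID."""

    def __init__(self):
        self._objects = {}

    def create(self, data):
        object_id = f"obj-{len(self._objects) + 1}"
        self._objects[object_id] = data
        return object_id

    def read(self, object_id):
        return self._objects[object_id]


class CloudStorage(ABC):
    """The single API the application is written against."""

    @abstractmethod
    def put(self, path, data): ...

    @abstractmethod
    def get(self, path): ...


class VendorAStorage(CloudStorage):
    def __init__(self, client):
        self._client = client

    def put(self, path, data):
        container, name = path.split("/", 1)
        self._client.upload(container, name, data)

    def get(self, path):
        container, name = path.split("/", 1)
        return self._client.download(container, name)


class VendorBStorage(CloudStorage):
    def __init__(self, client):
        self._client = client
        self._ids = {}  # path -> vendor ID; the ID itself is hidden from the app

    def put(self, path, data):
        self._ids[path] = self._client.create(data)

    def get(self, path):
        return self._client.read(self._ids[path])


def archive_report(storage, data):
    """Application code: written once, unaware of which vendor is behind it."""
    storage.put("reports/q1.txt", data)
    return storage.get("reports/q1.txt")


results = [
    archive_report(VendorAStorage(VendorAClient()), b"quarterly numbers"),
    archive_report(VendorBStorage(VendorBClient()), b"quarterly numbers"),
]
print(results)
```

Note how the wrapper for the second vendor hides the object ID behind a path.  That is the lowest common denominator effect in miniature: the application gains portability but loses direct ID access.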

So, let's discuss lock-in, the term used to express concern that once a vendor has gotten you to exploit their architecture and technology, they will recognize that you are committed to them and cannot easily move away.  As a result, they will then raise their prices and take advantage of your locked-in status, keeping their price just below the amount that would encourage conversion away from their technology and towards a more "open" set of capabilities.  Let's look at all the "dreaded" examples that have surfaced around cloud storage as reasons to slow its adoption:

1.    API lock in, which means your interaction with a storage cloud uses the APIs of that storage cloud, and suggests that you cannot easily move to another provider's cloud with its own, different APIs.

2.    Vendor lock in, which means that, because of your application development activity with specific APIs, you are condemned to use only a cloud from a specific supplier.

3.    Device lock in, meaning that you developed a cloud storage based program utilizing the APIs of that specific cloud, for a specific device (generally a PDA) that has specific functionality.  This is double lock in, both the device programming methodology and the API selection.

4.    Browser lock in, meaning that programming to specific APIs can also be rendered unique based on the Web browser that is selected.

5.    Programming language lock in, which means that you have written your API calls in a language like Python, or Java, or .NET, or whatever.

6.    API wrapper lock in, which means that you incorporated libraries into your application that allow it to make generic API calls, which are then translated by these libraries to the correct API for the desired storage cloud (this is what Simple Cloud API is).

So, as you can see here, utilizing cloud storage could ultimately have you locked in on at least six levels! 

With this much opportunity for vendor abuse, why are developers rushing to write Web based applications that utilize cloud storage services via API access?  Are they simply uncontrolled, unthinking rebels who will shortly learn the error of their ways?  Have they made a fatal error?  Or do they know something you don't?

First, learn about Cloud Storage APIs.  What they do is make storage programmable, and they abstract storage from the application.  They offer advanced functionality (the programmable word) that makes it faster and easier to write the applications that are scalable versus the traditional storage access approaches.  When you add these two capabilities to the storage cloud offering of low cost, availability in multiple locations, seamless provisioning, ease of adding additional storage, and the pay for use model, the case for the cloud has become compelling.

Where are we seeing early adoption?  At service providers, because they host Web based applications and SaaS (usually Web based) applications, and this is where the developers who recognize the opportunity are focused.

What is coming: the introduction of this technology into the enterprise, complete with the adoption of the RESTful API technology.  This will ultimately lead to a level of cooperation between service providers and the enterprise that has long been predicted.  Enterprises will move to IT based on an OPEX model, and expect their applications to be provisioned and interacting with service provider clouds, via APIs.  IT Service Providers are racing to build the clouds to provide for this emerging business opportunity.

So, what about the lock in mentioned above?  Sit down with your developers; they will show you why they don't feel "locked in".  They will show you that you can quickly recraft your current API calls, in the programming language of your choice, to utilize the new APIs of the desired cloud.  For this reason, Simple Cloud API will likely be a short term measure, preceding base case APIs that are extremely similar across clouds, while a market led process identifies "best practice" APIs for both base case and advanced functions, as well as all the other API led capabilities mentioned above.  In short, vendor lock in is not the problem for this technology that it has been for others.  Even so, the ingenuity and resourcefulness of all the suppliers, standards groups, and market adoption scenarios will continue to limit your ability to be entirely lock in free.

Your real challenge is not lock-in, but rather how to adopt this new set of capabilities, and solve problems and create opportunities with your IT solutions as rapidly as possible.  Standing on the sidelines waiting for this one to resolve will keep you out of a great opportunity, because we still have several meaningful years of rapid change associated with this technology adoption cycle.

http://www.box.net/shared/static/8b3yuirobg.jpg

The announcement that Salesforce is integrating directly with cloud storage provider Box.net is the tip of the iceberg when it comes to the future of the cloud.

TechCrunch explains what Box.net is thinking:

CEO Aaron Levie says that this is the first step in Box.net's plan to give businesses a secure way to share their files across multiple services on the web. He says that many of the cloud services geared toward the enterprise don't work well together -- oftentimes you'll have to reupload the same content to multiple sites to share or edit it. Box.net wants to help unify these services by serving as the central hub for your uploaded files, which you can then access from these other web-based services. Levie hints that we'll be seeing more integrations with other services in the near future.

What we are witnessing is the future of enterprise IT infrastructure. We have been talking about programmatic access through RESTful APIs for some time now.  This move by Salesforce is an evolutionary step in how enterprise IT will manage its IT infrastructure - it will be a cross-cloud platform, with applications and open access to the storage cloud of your choice.

Security is not an issue, and the future is about cross-cloud collaboration.

Phil Wainewright says that Box.net wants to be the "Switzerland of Data" - he's right and wrong.  Cloud Storage offerings, provided by the various service providers, are going to be the "Switzerland of data storage."  Vendor lock-in is going by the wayside.

ReadWrite is spot on when they say that "you can start to see how platforms will evolve into service networks - where enterprise users may subscribe and get access to applications that they pay for on a per use basis."

The biggest threat then, is to traditional software vendors, and applications like Sharepoint.  We will see heated debates on this very topic in the days and weeks ahead.
I'm continually surprised by how often I'm asked this set of questions:

  • Won't cloud computing kill the hosting industry?
  •  Don't Amazon Web Services, Google and Microsoft Azure pose a huge threat to hosters? 
The dreaded word "commoditization" often gets inserted, in an apparent attempt to convey impending doom.  And most people go on to ask if the adoption of SaaS delivery models for application software will cause customers to "bypass" co-location and hosting altogether as they subscribe to all their IT needs via SaaS providers (such as Google and Microsoft, for example).

These questions are not particularly bad.  In fact, it's plausible (but not likely) that the IT infrastructure world could evolve in this way.  What's surprising to me is the degree to which people are naturally inclined to buy into this view of the future, versus the contrarian and much more likely position that the onset of cloud computing will bolster the growth and good fortune of the hosting industry.

OK, I understand it won't be a bed of roses for hosters, particularly during the more turbulent phases of this transition.  And I know the road to cloud computing riches will be a rugged trail, likely littered with at least a few casualties.  But the idea that we will quickly shift into a winner-take-all scenario with only a few large providers of cloud computing infrastructure, and no room for anyone else to survive and thrive, overlooks a number of considerations that will play a prominent role in the next phase of growth in the hosting industry.

Will Amazon EC2 and S3, and Rackspace Cloud Sites and Cloud Files, take business away from traditional hosters?  Sure.  But this is not a zero-sum game.

Few people focus on the prospect that the overall hosting pie might grow faster than the rate of cannibalization.  I think it will. Hosting is simply the future of IT infrastructure outsourcing, and cloud computing is the future of hosting.

How big is the IT outsourcing industry?  Gartner and IDC measure the industry in the hundreds of billions of dollars.  How big is the hosting industry?  Tier 1 Research measures it in the single-digit billions, orders of magnitude smaller than traditional outsourcing.  This means that, even in 2009, the vast majority of businesses are managing their IT infrastructure as they have in the past: on premises, in aging data centers (or, even worse, the server closet), with non-scalable, non-automated support models, and without the benefit of the economies of scale that a hosting provider can offer.  So before we assume Amazon will snuff out the hosting industry, shouldn't we first assume that a materially greater percentage of the business market will elect to move IT infrastructure "into the cloud" in the first place?  If so, then we must assume the hosting pie will continue growing at the expense of traditional IT infrastructure outsourcing.  And there is a lot of room to grow.

messydatacenter.jpg

Regarding SaaS, clearly the SaaS model is here to stay.  But this doesn't mean hosting and co-location will be bypassed.  To the contrary, regardless of where application software runs (on the customer premise or in the service provider's data center), it has to run on IT infrastructure.  Managing IT infrastructure is complex and challenging, particularly in a multi-tenant service provider model.  Some SaaS companies will choose to take on this challenge themselves, and some of those SaaS companies might be successful with this strategy.  But it will be more common for SaaS providers to outsource the management of the IT infrastructure so that they can focus on their application software and customer service.  Companies like Google who bring both large-scale IT infrastructure as well as leading application software to the party will be the exception, not the rule.

As for cloud computing ... cloud processing solutions like Amazon EC2 and cloud storage solutions like Amazon S3 have kick-started the next generation of products that will be delivered by IT service providers, much in the same way Exodus and Digex gave birth to the co-location and managed hosting industries more than a decade ago.  The fact that Amazon was the first entrant to the cloud computing service provider market doesn't suggest everyone else should go home.  There will be abundant opportunities for service providers - especially hosters - to differentiate their offerings.  This industry is far from commoditized.

For example, one of the most compelling applications for cloud computing infrastructure is in the field of disaster recovery.  As Forrester's Stephanie Balaouras correctly states in Cloud DR Services are Real, a service provider that understands how to sell and deliver service to business customers (as hosters do today) can displace traditional disaster recovery solutions with better, cheaper and faster-to-provision DR services.

There are many more examples in addition to DR.  The point is, classic product management is needed: 

  • What do business customers want? 
  • How can we meet customer needs in the most scalable and cost-effective way? 
  • At what price?
  • Who makes the purchase decision? 
  • Does the product require a consultative sale, or can it be purchased by anyone with a Web browser and a credit card? 
Hosters are accustomed to doing product management for business customers.  Amazon and Google may develop these skills too, but so far Amazon Web Services is basically raw infrastructure.  That appeals to some market segments, particularly developers, but not all segments.

There is a large window of opportunity for all progressive hosting companies - and many other types of managed services providers - to enter the cloud computing market.  Rackspace has already done it and is demonstrating success.  SoftLayer has recently launched their cloud computing and cloud storage products. Other hosting providers that target large enterprise customers are deploying cloud computing and virtualization technologies in ways that meet the needs of their customers. 

This is only the beginning. 

Amazon and Google are great companies, but they will not prevent the wave of hosting companies and MSPs from playing a major role in the movement of IT infrastructure from the corporate closet to the Cloud. 
Too many think of cloud storage as just another, or the next, type of storage.  As usual, this view comes with the assumption that the "next" storage type is bigger, faster and cheaper, because each generation of storage is always bigger, faster and cheaper.  As such, proponents of this view generally believe that access via traditional approaches, like WebDAV, NFS, CIFS and others, is a critical capability.  Some may even argue that Web Services APIs are not the critical differentiation of Cloud Storage.  We disagree.

Cloud storage is a radical change.  It enables new application types.  The critical capability for cloud storage is Web services API access, revealing the full promise of SOA (Service Oriented Architecture).  Second, the services that are revealed by the API access go far beyond "put" and "get".  Anytime and anywhere access, tagging, sharing and collaboration, geo storage via a single namespace, and policy management of storage are some of the services that the new applications will expect to find in the storage clouds they choose.  Also, storing massive amounts of data in the cloud and having these services available to act on all the data is required.

Finally, traditional access serves a specific role: to get legacy applications connected to the cloud.  Why?  So that their data can easily enter the cloud and immediately take advantage of Cloud Storage services.  That's the primary requirement for supporting traditional access.  So, if you are thinking your Cloud Storage choice is driven by traditional access requirements, you are viewing Cloud Storage via the lens of traditional storage types, and you may ultimately be disappointed with your decision.  If your selection of Cloud Storage is based on exposing your stored data to SOA and new services capability, with storage that is abstracted from processing, then you will have made the appropriate strategic decision.

So, the innovator's dilemma is the thought that traditional access to a big back store is the critical issue associated with Cloud Storage selection, and that the evaluation point is traditional access, storage size and performance, at a new price point.  That is the traditional approach.  That is the next step, and traditional storage providers will push to make this the list of requirements for what you should buy.  It is simply the next turn of the crank in the storage world, the next evolutionary step in storage.  It is not Cloud Storage.

That is the way storage was.  Cloud Storage is about SOA, Web services APIs and advanced services revealed by these APIs, delivered via an abstracted storage solution, over a network, at low cost, for a large amount of storage.  As new applications arrive on the scene, powered by Cloud Storage, this will rapidly signal that something fundamental has happened.  A new storage type, driving new and creative applications, will allow for the creativity and skill of application developers to economically deliver the next generation of capabilities.  These new applications will require Cloud Storage, and the advanced services the storage cloud can deliver.  If all you want is bigger, faster, cheaper, you can solve your problem without a cloud, but you can solve this same problem with a cloud, and prepare yourself, and your data, for the future.

oraclesun.gif

With the Sun acquisition, Larry "what's a cloud?" Ellison has once again changed the game. Here are a few key points to think about:

1) Oracle becomes the end-to-end IT enabler - from apps to disks; that's the party line.

2) Oracle begins the journey to the Cloud, and begins to develop the end-to-end Enterprise Cloud experience.

3) Oracle embraces the open source movement by attacking  Microsoft with MySQL.

4) Oracle's gain is clearly SAP's loss. Exadata + Sun = the new business intelligence?

5) Oracle owns Java, period. Ellison described Java as "the single most important software asset we have ever acquired." BONUS: they get the JMX API thrown in with the deal, which allows them to monitor all manner of resources.

6) Oracle delivers PeopleSoft-as-a-Service or Siebel-as-a-Service with credibility. Maybe they won't buy Salesforce.

7) Oracle pushes OpenOffice as a cloud offering to further disrupt Microsoft.

8) Oracle makes Sun hardware profitable.

9) One stop shopping for all your IT, from Cloud to your own data center - is where we are headed. The period of détente is over - Cisco, HP, IBM, and Oracle are racing to go to end-to-end environments, which HP and IBM have proven as a viable business model. What happens to Dell?

While the rumors fly all over the cloudsphere, what's important in the days ahead is how Oracle chooses to embrace the cloud - will it be an open or closed embrace?

With IBM, we all knew it would have been an open cloud; with Oracle the story is not so clear at all.  The silence on Jonathan's blog is deafening.

And for those of us who said that Ellison was kidding about the cloud, let's remember who said "the network is the computer!"  Today's netbooks are cloud devices.

My prediction: Oracle becomes one of the "Big Four" for the Enterprise, and quickly changes its tune on the Cloud.

What's next? Anybody think Microsoft/Dell is an interesting combination? 

In his introductory post, Steve mentioned that when we were at VeriCenter we became aware of a new sort of challenge facing our industry - from outside entrants like Amazon.com who were beginning to grow their cloud storage and cloud computing businesses, leveraging internal web-based infrastructure. We knew then that cloud storage, as a significant part of an overall cloud computing strategy, was going to change our business and yes, our entire industry. Now we see that new entrants like Google and Microsoft, as well as veterans like IBM, Sun, EMC, and Cisco, are all delivering cloud-based offerings that are redefining the hosting and SaaS markets.

So here’s the question we’re asking: how will cloud computing disrupt your business model?

To answer this question, we start by defining a basic yet useful framework for understanding business models; this framework applies to businesses of all types and is adapted from Wikipedia:


bizmodel_small.gif

Let’s look at the nine components of the business model framework, and describe each one briefly.

Infrastructure

  • Core Capabilities: The capabilities and competencies necessary to execute a company’s business model. These include your key people, processes and technologies.
  • Partner Ecosystem: The business alliances which complement your capabilities.
  • Key Processes: The key activities (sometimes proprietary) which create the product or service you offer.

Offering

  • Value Proposition: The compelling benefits your customers receive from buying your products and services.

Customers

  • Customer Segments: The target audience for your products and services.
  • Distribution Channel: The means by which a company delivers products and services to customers. This includes the company’s marketing and distribution strategy, and may involve a chain of intermediaries, each passing the product down the chain to the next organization, before it finally reaches the consumer or end-user.
  • Customer Relationship: The links a company establishes between itself and its different customer segments.

Finances

  • Cost Structure: The nature of the expenditures required to run your business, particularly considering whether costs are fixed or variable in nature, and whether they are capital or operating expenses.
  • Revenue: The way a company receives payment from customers.
  • Profit: The degree to which your revenue exceeds your costs, of course!
Going forward, we’ll apply this generic business model framework to four specific industries we expect to experience business-model disruption from Cloud Computing:

  • IT hosting and disaster recovery companies
  • SaaS and Application Providers
  • Telecoms, and
  • Managed Service Providers and VARs
We’ll look at causes and types of disruption, keeping in mind that disruption can produce both positive and negative change. We’ll also tell you where the money is - i.e., the areas which offer the most promise in terms of business opportunity and profit.

Your feedback and comments are welcome as always.
