Recently in Cloud Applications Category

Is OpenStack "Off the Rack"?

On July 19, 2010, Rackspace led the announcement of OpenStack, with a goal of creating an open source cloud software solution for use on industry-standard hardware. The initial releases contemplate solutions for both cloud compute and object storage. While these are the first two releases, they are separate offerings. Remember, cloud storage is not just the storage target for cloud computing; it is one potential storage target for cloud computing, and is in and of itself a standalone cloud offering of programmable storage.

Now, I have purposely used a term from the clothing industry, "off the rack", to spend a moment looking at a framework for evaluating the opportunities this may present.  With dress shirts, you can buy off the rack, semi custom, or custom, each with a unique value proposition based on fit, choice and cost.   Interestingly enough, this may be a good lens through which to consider the possibilities of OpenStack, and in particular, OpenStack Object Storage.

Rackspace has made no secret of its motivations for leading this initiative, and its desire to focus on "fanatical" service as its key differentiator versus the fundamental technology on which the service is based. Fair enough, and so the question becomes, is the rapidly emerging and immature cloud marketplace already "mature" enough to seek homeostasis? (Homeostasis is the property of a system, either open or closed, that regulates its internal environment and tends to maintain a stable, constant condition.) Have enough models and innovations, from startups, academia, open source movements and large tech companies, been tested in the marketplace to the extent that we can already race to the common denominator? Perhaps now is a good time to start, as long as you are willing to acknowledge that the desired results are a good ways off.

Before we jump off into "Off the Rack" software, a quick look back at open source is helpful.  For more reading on the open source software industry a good introduction is The Cathedral and the Bazaar. Six things are particularly interesting: 

  1. An open source alternative can emerge as a follow on to a successful commercial technology and can become pervasive versus the commercial offerings it succeeded (LINUX versus UNIX is the reference case here).
  2. A second result of this approach can also end up as a big success, although in more of a niche than as a pervasive replacement for the earlier commercial offerings (MySQL versus Oracle, IBM and Microsoft in the relational database space).
  3. An open source effort can also emerge earlier in a technology cycle and come of age as a pervasive solution (Apache Web Server comes to mind here).
  4. Open source generally requires very careful cultivation of the community of developers, with active interest by academia (and partnering with NASA is part of the formula here). Commercially sponsored open source efforts are becoming more common, although they have not yet proven to be the typical "breeding ground" for most great open source successes. Eucalyptus, with its roots at University of California Santa Barbara, seems to be a more traditional route.
  5. Open source is not necessarily reflective of rapid commercial opportunities for success.  Eucalyptus is obviously beginning to maneuver towards a repeat of the commercialization model.  OpenStack is taking the approach most favored by other open source successes like Apache.  A couple of good reads here are this article from BusinessWeek and this. See also Derrick Harris' post over at GigaOm.
  6. There are also hundreds of thousands of open source projects that had mixed success or languished altogether. A quick look at SourceForge (an open source project hosting site) shows nearly a quarter million hosted projects. How many of these have languished or had little impact on the market?

So, the first issue is that there will exist for some time to come a real question as to the adoption potential of OpenStack. I believe that adoption is driven by applicability to need. In a moment we will address a serious issue which OpenStack Object Storage must overcome to be successful, and which, at worst, will confine it to a niche market. My views are very much directed at the Object Storage offering, versus the compute offering, which I believe exists in a different space and as a different type of solution. With this backdrop, let's have a look at the cloud storage marketplace today, and use the analogy of off the rack, semi custom and custom:

  • Off the Rack:  implement as is, one size fits all, each with unique approaches for performance, scalability, bit integrity, may or may not provide geo services.
  • Semi Custom:  Select from storage types (DAS, SAN, NAS, JBOD), shared or distributed file systems and object systems, mix and match storage for different SLA and cost/usage patterns on the same infrastructure, multiple APIs, meta data and catalog abstracted from storage layer, geo services.
  • Custom:  Generally a service only offering and not available as deployable infrastructure, specifics will vary widely based on service provider offering strategy.

Infrastructure   Type           Comments
---------------  -------------  ----------------------------
Eucalyptus       Off the Rack   Limited S3 APIs
OpenStack        Off the Rack   CloudFiles APIs
Scality          Off the Rack   S3 APIs
Mezeo            Semi Custom    Mezeo ReST APIs and S3 APIs
NetApp           Off the Rack   Bycast APIs, NetApp storage
EMC Atmos        Off the Rack   Atmos ReST APIs, EMC storage

Service                                       Type           Comments
--------------------------------------------  -------------  --------------------------
Amazon S3                                     Custom         S3 APIs
Microsoft Azure                               Custom         Windows centric
Rackspace                                     Off the Rack   Is the basis for OpenStack
Nirvanix                                      Custom         SOAP APIs, multi node
Google                                        Custom         Offers S3 APIs
AT&T Synaptic                                 Off the Rack   Based on EMC Atmos
OpSource, SoftLayer, Layered Tech and others  Custom         Based on Mezeo

As you can see from the summary above, there exist as many views of what constitutes either a cloud storage service or a desirable cloud storage deployable infrastructure as there are service providers and vendors.  Note that a semi custom infrastructure results in a "custom" service as implemented.  "Off the rack" results in very similar services by those who utilize the same infrastructure unless they make their own major additions.  Any offering can be differentiated by service, and the degree and quality of service is critical to customer satisfaction and plays a strong role in value creation.

The OpenStack announcement as it regards Object Store and its approach to cloud storage seems to view cloud storage infrastructure as highly akin to an operating system (or at least a "hypervisor") and more similar to a selection of LINUX or Windows than that of an application or middleware layer.  While I agree that cloud compute is very close to this model, cloud storage is a service oriented architecture, with programmability for new applications that can tolerate Internet latency because of Web Services (like ReST APIs). The industry constantly overlooks this key point as it is consumed with the low cost, pay for use and thin provisioning capabilities of this storage tier.  Solutions for thin provisioning and low cost have been available far longer than cloud storage. Further, pay for use is more of a business decision than a technology. 

In the earliest days of cloud storage, there existed initial confusion that cloud storage was defined by cost, scalability, pay for use, and thin provisioning only and not programmable access (usually via ReST APIs).  ParaScale paid a huge price for not understanding that cloud storage requires Web services (like ReST API) access.  Now, with OpenStack Object Store, we see a follow on case of this same perspective, but with basic APIs for Put, Get and List.   Yes, it provides for Internet access via ReST APIs, but the focus continues to be primarily cost based versus new application enablement based.  It could be argued that the open source approach will provide for the appropriate additions of "advanced services" to be added.  However, even the use of the platform by NASA is more focused on cost of storage than on advanced functionality because NASA stores much more data than almost any institution or enterprise in the world.

I think Savio Rodrigues states this view very well in his post:

"Select products based on business needs, not license alone: It's also interesting to note that very few enterprises are in NASA's position with regards to size of IT investment and skills in-house. While NASA engineers were ready and willing to contribute new features into the Eucalyptus open source community, few companies have the skills or governance to consider allowing their developers to contribute to open source projects.  Summary trend number 7 from the 2010 Eclipse survey results highlighted this issue.

To suggest that NASA's buying or IT decision making patterns represents much more than the top 1 percent of IT buyers would be a stretch.

The overwhelming majority of enterprises would rather pay a vendor to deliver, maintain, support and enhance their private cloud software infrastructure than place that burden on internal IT staff. Whether the enterprise is paying for a closed source commercial product, a commercial product based on an open core product, or a subscription to an open source product, the product selection decision will be made based on business requirements much broader than 'is the product open source or not?'"

Keep in mind that cloud storage is a standalone service associated with application delivery over the Internet and also associated with low cost, pay for use, scalable storage resources. Social media applications and many Web based applications exploit these capabilities, for example by publishing a file to a URL and by extensive tagging of files.

This view of cloud storage as nothing more than cost and volume-based ignores its extraordinary importance as a service-oriented architecture for new application enablement.  I believe both views are equally important and need to be equally served.  Will OpenStack, with its pervasive cost focus, be able to drive its community to this additional view of needed contributions of advanced services for cloud storage?  Lydia Leong of Gartner Group provides an interesting view of the open source community issues associated with this in her post:

"At the same time, open sourcing is not necessarily a way to software success. Rackspace has a whole host of new challenges that it will have to meet. First, it must ensure that the roadmap of the new project aligns sufficiently with its own needs, since it has decided that it will use the project's public codebase for its own service. Second, it now has to manage and just as importantly, lead, an open-source community, getting useful commits from outside contributors and managing the commit process. (Rackspace and NASA have formed a board for governance of the project, on which they have multiple seats but are in the minority.) Third, as with all such things, there are potential code-quality issues, the impact of which become significantly magnified when running operations at massive scale."

One last comment on this business of vendor lock in and cloud storage APIs (another focus of the OpenStack announcement). I would submit that while a specific set of APIs has the potential to create vendor lock in, this is a much smaller problem than what is experienced in other technologies. If you are really worried about it, you probably have never actually written a ReST API call. Such calls can be written in many languages, and we have seen cases where applications that run on S3 run unchanged on Mezeo. Others need very minor modifications, and still others are excited to take advantage of some of the unique Mezeo services. It just is not a problem, and this is much more related to FUD (fear, uncertainty and doubt) and marketing zealotry than it is associated with technological reality. The APIs of choice will shake out, and it is far too early to say if it will be S3, OpenStack, CDMI or a combination of all of these, and others, as yet unforeseen. (At Mezeo, we have never believed there will be one winner, and instead focused on architecture to enable easy and effective delivery of whichever APIs stand the test of time.)
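
To make the point about ReST calls concrete, here is a minimal sketch of how little code sits between an application and a storage provider. Everything in it is an assumption for illustration: the class name, the `container/object` URL layout, the header names and the example.com endpoints are hypothetical, and they are not the actual S3, OpenStack or Mezeo wire formats. The sketch only builds the requests; it does not send them.

```python
import urllib.request
from typing import Optional


class RestObjectClient:
    """Builds Put, Get and List requests for a generic ReST object store.

    Hypothetical layout: <base_url>/<container>/<object name>.
    Real providers differ in URL scheme, headers and request signing.
    """

    def __init__(self, base_url: str, auth_header: str, token: str):
        self.base_url = base_url.rstrip("/")
        # The provider-specific part: which header carries the credential.
        self.headers = {auth_header: token}

    def _request(self, method: str, path: str,
                 data: Optional[bytes] = None) -> urllib.request.Request:
        url = f"{self.base_url}/{path}"
        return urllib.request.Request(
            url, data=data, headers=self.headers, method=method
        )

    def put_object(self, container: str, name: str, data: bytes):
        return self._request("PUT", f"{container}/{name}", data)

    def get_object(self, container: str, name: str):
        return self._request("GET", f"{container}/{name}")

    def list_objects(self, container: str):
        return self._request("GET", container)


if __name__ == "__main__":
    # Two made-up providers; the application code in the loop is identical.
    providers = [
        RestObjectClient("https://storage-a.example.com/v1", "X-Auth-Token", "token-a"),
        RestObjectClient("https://storage-b.example.com/objects", "Authorization", "token-b"),
    ]
    for client in providers:
        request = client.put_object("photos", "cat.jpg", b"image bytes")
        print(request.get_method(), request.full_url)
```

In this sketch, moving between providers changes only the constructor arguments. Real services add request signing and their own advanced calls, which is where the "very minor modifications" mentioned above would come in. Passing a built request to `urllib.request.urlopen` would actually send it.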

The interesting view that seems to be missing here is that marketplace competition by service providers already serves to drive down the price of cloud storage, so a commoditized stack embraced by most is unlikely to yield extraordinary incremental savings. At the same time, while the competitive market conspires to drive cloud storage costs ever lower, the need to differentiate and to deliver solutions, as well as programmable storage that enables multiple new and exciting types of applications, will rapidly replace the pure cost and scale focus of current cloud storage offerings. Sometimes, the "new" application is simply an existing one enabled in the cloud, to produce the same result at a lower cost! This requires significant cloud storage functionality in order to make this easy and productive. Amazon continues to prove this with their many additions and capabilities which differentiate their service. Mezeo sees much the same view on the part of our customers. The focus is on what cloud storage can do, what problems it will solve, what business opportunities it creates, and what new applications it can enable, and all of these views assume it will be competitively priced.

Cloud storage represents significant opportunities for institutions, the enterprise (see my recent post on the business case for enterprise cloud storage) and for the IT service provider.  Cloud storage is substantially different from cloud compute, and requires that you understand this difference in order to effectively evaluate the impact of this announcement, as well as your next steps.

Cloud Storage Strategy interviewed Gladinet co-founder Jerry Huang on cloud desktops, cloud gateways, and his company's business model.

[NOTE: Gladinet is a customer of Mezeo Software.]

How does Gladinet position itself as the "desktop in the cloud?" What does that mean?
Actually, we position ourselves as "a cloud on the desktop" instead of "a desktop in the cloud". The "desktop in the cloud" is more of an EC2 use case; you have a virtual machine in the cloud and use the Remote Desktop Protocol to access it.

"Cloud on the Desktop" is different. We view the PC as important infrastructure in this picture, because PC performance and functionality continue to improve, while broadband gets faster and cloud services leverage economies of scale, driving the price down or the SLA up. We see local storage growing side by side with cloud storage. We view the desktop as a feature rich portal where cloud storage and services live side by side with local storage and applications. The desktop provides an important platform for these services to interact with each other.
 
How do you define the term Cloud Gateway? What is Gladinet's contribution to this space?
A cloud gateway is a piece of software or an appliance that facilitates connectivity between the end user's PC and cloud services.
 
Gladinet's CloudAFS (Cloud Attached File Server) has cloud gateway capability. It can help native CIFS/NFS clients (on an end user's PC) to connect through AFS and reach out to the cloud services. It can also help individual Cloud Desktops to reach out. Another important part of AFS is identity management. When you have a group of users with Windows identities, the ID management is part of the functionality of a gateway.
 
In our view, the Cloud Gateway is different from the Cloud Desktop Client that sits directly on the user's PC. While the desktop client serves one single user and one single PC, the Gateway serves a group of users and a group of PCs.

For the IT folks, how do you attach the Cloud to your existing IT infrastructure instead of migrating existing IT Infrastructure to the Cloud? How does this mitigate the risk and lower costs?
Different stages may have different usage patterns. We view the current stage (2009-2010) as an early stage of cloud storage adoption. If you tell a CIO now to throw away existing IT infrastructure and migrate to the cloud, it may not sell. If you tell a CIO to keep the existing IT infrastructure and expand it with the advantages that the cloud has, it may be easier to get adoption. So we aligned our product and marketing messaging around attaching and expanding IT infrastructure in a non-disruptive way. The picture we were painting is that you install CloudAFS and you then expand your existing file server with Cloud Storage. The existing file server still runs, still providing file shares to existing users. Yet, the file server is backed up by the tier 2 cloud storage and the cloud storage may replace tape backup.

However, if we were in 2013 or 2014 and looking back to this stage, we could view this expansion of local IT infrastructure with the cloud as the starting stage of migration. When people start to experience the mixed environment of tier 1 (local) and tier 2 (cloud), they can see and experience how to best take advantage of both and can drive up cloud storage usage.
 
Mitigating the risk comes from a non-disruptive addition to the file server capacity. Lower cost can come from different places, like replacing tape backup.
 
How does Gladinet's business model give it a leg up over the competition? 
An analogy could be made with the start of the PC makers. At the beginning, there were many PC makers. IBM/Compaq/HP/Dell were the big ones, and there were also Packard Bell and other small ones. A successful business model then could be to create a component that all the PC makers can use instead of focusing on only a few.

Today, there are many cloud storage vendors, mostly in the US. Clones from Germany, Japan and other countries are coming as well. We believe creating a component that every cloud storage vendor can use to help cloud storage sales is more useful than focusing on just a couple of the big ones.

Cloud Storage Strategy recently interviewed Layered Tech CEO Jack Finlayson on the economic benefits of cloud computing, the downturn, and virtual private data centers for the enterprise.

NOTE: Layered Tech is a customer of Mezeo Software, the underwriter of this blog. 

Layered Tech has obviously focused on building a trusted infrastructure for customers.  How do you sustain that trust level?


As customer requirements and expectations continue to change, we've evolved in order to handle more detailed and complex requirements. We continually evaluate every aspect of our IT infrastructure and network to ensure we have the appropriate resiliencies and protections in place. We also take pride in our culture of continuous improvement in all aspects of customer service; it's the best way we can support our customers.

This is why we're here - to manage our customers' infrastructure so they can concentrate on their business. From the beginning, it's all about providing superior levels of customer support. When we begin a new customer relationship, we learn the customer's specific business needs and provide counsel on the infrastructure solution that will best support those needs and meet overall business goals.

This is how we've built our reputation as a trusted provider with our customers around the globe, and our customers know they can count on Layered Tech for the highest quality infrastructure solutions and service.  With seven top tier data centers on three continents, we deliver secure, scalable and ultra-reliable solutions for IT infrastructure that support even the most complex enterprise requirements. 

We also maintain relationships with leading technology partners and keep up with our extensive certifications to ensure that we have the resources and expertise to deliver the best in managed dedicated hosting, cloud computing services and cloud-based storage.

Has the economic downturn helped accelerate the migration to cloud based data centers? 

Absolutely. The economic downturn has forced almost everyone to do more with less, which is why more companies are turning to cloud computing. 

Increased scalability and flexibility and a pay-per-use model create a more cost-effective and agile infrastructure solution. Customers leveraging Layered Tech's cloud computing and virtualization solutions reduce capital and operating expenses while enabling IT staff to focus on higher priority business needs rather than their infrastructure.

Customers can also choose from a range of support options from Layered Tech's tiered managed services, ranging from the highest root-level access down to the lowest self-managed option with varying levels in between.  It's all about helping them to be flexible and do more with less.

Can you explain what the Virtual Private Data Centers (VPDC) platform service is?  Is this a flavor of the cloud computing model that people have been talking about?

Sure. We pioneered Virtual Private Data Centers - or VPDCs - which offer enterprise-class security, choice and flexibility.  It's a hybrid approach that gives customers dedicated, unshared resources in their off-premise cloud infrastructure rather than placing their data into purely public clouds.  VPDC platforms and private clouds are becoming popular cloud computing approaches for enterprises because they provide more control and security than public cloud offerings.

Whether it's an "internal private cloud" created and maintained by the enterprise's IT staff and housed within its onsite data center, or it's an "external private cloud," where the enterprise engages with a third-party hosting provider like Layered Tech to develop and operate a private cloud within one or more of the hosting company's data centers, enterprises want to have their own cloud infrastructure.  In other words, we believe that enterprises will not want clouds with shared resources, like those that exist in purely public cloud environments.

So, with the VPDC, customers gain the on-demand scalability of the cloud with all the reliability and security of dedicated servers.  The integrated virtualization platform also offers levels of managed services, security and flexibility via a proprietary API that were previously unavailable in an integrated offering. 


We created a maturity model for cloud storage just a few weeks ago. Can you tell us if it matches with your experience in the industries you serve?

Yes, it does. We believe that 2010 will be the first meaningful stage of cloud computing's rocket-ride of growth and enterprise usage, and it's all fueled by the need for further operational and financial improvement.  We've found that enterprises are seeking cloud computing benefits such as lower costs, higher productivity, greater speed to market, and near-instantaneous scalability of computing resources. 

It's important to note that CIOs now have an easier time showing their CEOs and CFOs the value of migrating to cloud computing and virtualized environments, especially considering the competitive advantages they create. The investment required to migrate to the cloud alone generates immediate short-term value, while also delivering long-term upside.  

Like your cloud storage maturity model, we think that custom migration plans and hybrid approaches to cloud computing also will be an emerging trend in 2010.  Enterprises will evaluate business drivers and align technology solutions to their corporate needs more closely than ever before.  The result will be the growing adoption of a hybrid approach, where a portion of the IT infrastructure stays in the physical, dedicated server world, while the remainder migrates into the cloud.

Finally, can you tell our readers about LT Depot, your cloud storage solution?

As you know, we just launched our new cloud storage solution called LT Depot, which is powered by the Mezeo™ Cloud Storage Platform.

LT Depot allows you to create and select scalable, reliable, and secure storage for your application and service needs. If you need storage for images, videos or critical documents without significant capital expense, LT Depot is your answer.

Not only is LT Depot designed as a robust and reliable storage avenue, it provides customers the extended advantages of sharing and collaboration. Your users can access, create, manage, and edit documents and files no matter where they reside -- even from their mobile devices. Whether your team exists in one office, or multi-site locations around the world, stay connected and work together seamlessly and efficiently.

Cloud storage is already showing signs of Phase Two (see our post on the cloud storage maturity model), as a new set of solutions arrive in the marketplace.  These solutions are referred to as cloud gateways, on ramps, cloud clients, edge devices and other exotic names. 

For ease of discussion, let's use "cloud client" to describe a solution that is on a single user device (workstation, PDA, Tablet) and "cloud gateway" or just "gateway" for a solution that is delivered on a server or router for many users. Whether they are a client or a gateway, some store a "blob" of data, and some store "chunks" of data that are parts of the original object. Others store the actual object. What's the difference and is it important? Should you consider it in your cloud gateway use plans?

What is a blob?  A blob can start as either a single object or a collection of objects, for example, all of the files on a single server, or a VM image.  Then, you do something to it in the client/gateway device that requires it to be brought back through the original client/gateway to be returned to a useful state.  Examples include de-duplication and compression followed by encryption prior to transmission of the object to the cloud (I call this D/C/E).  The result is a "blob" of data, an object that is minimized in size, and must be retrieved by the application that created it in order to be useful again. 
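
A rough sketch of the D and C steps may help show why a blob is only useful to the software that created it. This is an illustration under stated assumptions, not how any particular gateway works: the function names and the JSON-plus-hex container format are made up, de-duplication is done at whole-file level, and the encryption step is left out because it would need a vetted cryptography library applied to the compressed bytes before upload.

```python
import hashlib
import json
import zlib


def pack_blob(files: dict) -> bytes:
    """De-duplicate identical file contents, then compress into one blob.

    `files` maps file name -> bytes. The returned bytes are opaque:
    no individual file can be read without running unpack_blob.
    """
    unique = {}  # content digest -> content, stored once
    index = {}   # file name -> content digest
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        unique.setdefault(digest, data)
        index[name] = digest
    payload = json.dumps({
        "index": index,
        "data": {digest: content.hex() for digest, content in unique.items()},
    }).encode("utf-8")
    return zlib.compress(payload)


def unpack_blob(blob: bytes) -> dict:
    """Reverse pack_blob: decompress and rebuild every original file."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    return {
        name: bytes.fromhex(payload["data"][digest])
        for name, digest in payload["index"].items()
    }


if __name__ == "__main__":
    files = {
        "report.txt": b"quarterly numbers " * 200,
        "report-copy.txt": b"quarterly numbers " * 200,  # duplicate content
        "notes.txt": b"short note",
    }
    blob = pack_blob(files)
    print(sum(len(d) for d in files.values()), "bytes in,", len(blob), "bytes stored")
    assert unpack_blob(blob) == files
```

What lands in the cloud is the return value of `pack_blob`. Anything that wants `notes.txt` back has to fetch the whole blob and run the matching `unpack_blob`, which is the dependency on the original client or gateway described above.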

A chunk is part of an object, and the original object must be re-assembled by the gateway that parsed it in the first place. Some gateways store blobs.  Some store the object in chunks.  Finally, some store the actual object with its original file type, intact.  These may be workstation clients, or interface solutions that allow for a CIFS or iSCSI (today, TwinStrata is an example of the iSCSI capability) attached device to store in the cloud.  There are trade-offs and advantages associated with each approach, and your cloud storage use case and objective must be carefully analyzed in order to determine the applicability of the gateway to your business requirement.

Now, let's consider D/C/E. This provides savings in addition to the savings associated with cloud storage. D and C give you a small object size, so your bandwidth cost is lower, and your overall storage cost is lower. When there is a change to the stored objects, chunks allow you to send only the changed part of the object, reducing bandwidth and potentially improving performance. Encrypting, or chunking, or both, may improve security and relieve you of the costs and management associated with other security approaches.
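
The "send only the changed part" idea can be sketched in a few lines. This is a simplified illustration, assuming fixed-size chunks compared by hash; the function names and the 4 MiB default are hypothetical, and real gateways may use other chunking schemes. It also ignores the case where the new version is shorter than the old one.

```python
import hashlib

DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024  # assumed 4 MiB chunks, purely illustrative


def chunk_hashes(data: bytes, size: int = DEFAULT_CHUNK_SIZE) -> list:
    """Hash each fixed-size chunk of an object."""
    return [
        hashlib.sha256(data[start:start + size]).hexdigest()
        for start in range(0, len(data), size)
    ]


def changed_chunks(old: bytes, new: bytes, size: int = DEFAULT_CHUNK_SIZE) -> dict:
    """Return {chunk index: chunk bytes} for chunks that must be uploaded.

    A chunk is uploaded if it differs from the chunk at the same index
    in the old version, or if the old version had no chunk there.
    """
    old_hashes = chunk_hashes(old, size)
    to_upload = {}
    for index, start in enumerate(range(0, len(new), size)):
        chunk = new[start:start + size]
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(chunk).hexdigest() != old_hashes[index]:
            to_upload[index] = chunk
    return to_upload


if __name__ == "__main__":
    old = b"aaaabbbbcccc"
    new = b"aaaaXbbbcccc"
    # With 4-byte chunks only the middle chunk differs.
    print(changed_chunks(old, new, size=4))
```

One design caveat: with fixed-size chunks, inserting a single byte near the start shifts every later chunk boundary, so everything after the insertion looks changed. In-place edits and appends are the cases where this simple scheme saves bandwidth.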

So, blobs and chunks sound pretty good, providing better security and lower costs.  What's the catch?  First, storage clouds are great places to provide anytime and anywhere access to your data, from multiple devices.  If you have to go back to a gateway to get the original version of the object, that flexibility may be very limited or non-existent.  Clouds are also a great place for sharing and collaboration, which is not in play if the object in the cloud is not in a useful form.  Finally, vendors are not giving gateway solutions away - we must ask what they cost, and are they worth it?

As usual, the answer is, it depends.  What services can I get from the cloud? And what services can I get from the gateway?

An example that is getting a lot of attention is file server replacement, or even better, file server displacement. I get less excited about replacing a file server with another server that is a policy driven cache, because I still have this layer of technology in place. However, if you can displace most of your file servers, then the potential for significant cost savings becomes obvious.

I tend to look at single user clients as very interesting on ramps to the cloud.  A client, using some modest amount of workstation storage as a cache, can deliver most of the benefits of a file server.  Companies like Gladinet, SMEStorage, GoodReader, Mezeo and others have very interesting cloud clients.  You will still need a few file servers if you need to provide a place for very large files.  Interestingly enough, those very large files are often rich media (like training videos), and streaming them to a reader on the client from the cloud is often good enough.  Another cloud client capability we expect to see will allow the end-user to store files and move them across multiple storage providers - from private to public and vice-versa, for example.  This functionality could also be in a server-based gateway.
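
As a sketch of the client-side cache idea, the class below keeps recently used files in a fixed amount of local storage and goes to the cloud only on a miss. It is an illustration under assumptions, not any vendor's design: the class name, the least-recently-used eviction policy, and the `fetch` callable standing in for a cloud download are all hypothetical. Files larger than the whole cache are never cached, which mirrors the streaming of very large files described above.

```python
from collections import OrderedDict


class CachingCloudClient:
    """Serves file reads from a bounded local cache, fetching on a miss."""

    def __init__(self, fetch, capacity_bytes: int):
        self._fetch = fetch            # callable: name -> bytes (cloud download)
        self._capacity = capacity_bytes
        self._cache = OrderedDict()    # name -> bytes, oldest entry first
        self._used = 0
        self.remote_reads = 0          # how often we had to go to the cloud

    def read(self, name: str) -> bytes:
        if name in self._cache:
            self._cache.move_to_end(name)  # mark as most recently used
            return self._cache[name]
        data = self._fetch(name)
        self.remote_reads += 1
        self._store(name, data)
        return data

    def _store(self, name: str, data: bytes) -> None:
        if len(data) > self._capacity:
            return  # too large to cache; always read from the cloud
        while self._used + len(data) > self._capacity:
            _, evicted = self._cache.popitem(last=False)  # drop least recently used
            self._used -= len(evicted)
        self._cache[name] = data
        self._used += len(data)


if __name__ == "__main__":
    cloud = {"spec.doc": b"1234", "plan.xls": b"5678", "training.mp4": b"x" * 100}
    client = CachingCloudClient(cloud.__getitem__, capacity_bytes=8)
    for name in ["spec.doc", "spec.doc", "plan.xls", "training.mp4", "spec.doc"]:
        client.read(name)
    print("remote reads:", client.remote_reads)
```

A server-based gateway could apply the same policy on behalf of many users; the difference is where the cache lives, not how it works.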

Another cloud client capability might include giving encryption capability to the end user, letting them decide if they want to encrypt the file themselves. Or, use a cloud that provides user selectable encryption. Give your end users or customers the power of choice, the freedom of access anytime and anywhere, the ability to get the amount of storage they need when they need it (what Gartner calls "reservationless", and kudos to them, great term). Don't tie users to a "home base" gateway that does not store their object in its original format, or at least give them a choice. All that being said, we are seeing that some mix of clients for file server displacement, and file server replacement gateways may ultimately be the appropriate solution.

Backup and archive is a different story, and here a gateway can make a lot of sense.  First, there is quite a bit of local housekeeping associated with these solutions, and the solution can decide if utilizing the cloud for some or all of the files makes sense. Speed of restore is a major consideration for a backup, and may drive local versus cloud based storage solutions.  Further, the need for a disaster recovery site, or to archive, can often be a cloud use case.  Companies like Zmanda and CommVault are very active in cloud based backup solutions.  What if you have applications that do not speak REST APIs, like a legacy backup solution?  There are gateways that can attach these legacy applications to the cloud, for example, TwinStrata.

Special purpose gateways can also solve an immediate problem.  Blue Thread offers a cloud storage interface for SharePoint.  The marketplace is rapidly developing a portfolio of cloud storage gateways and clients, as well as backup and archive solutions and all have their own unique perspective on cloud use.  Examples include StorSimple, Cirtas, Gladinet (who also makes clients), and EntropySoft.  Venture capital companies are deploying significant capital for these sorts of solutions.  Each of these solution providers sees a clear path to adding significant value to cloud storage solution delivery.

Cloud storage requires significant use case consideration to evaluate the functionality required, both in the cloud and in the gateway or client, and where the application or user can best exploit the functionality.  After all, cloud storage is also about empowering the end user with the storage they need, when they need it, at a favorable price, and providing advanced functionality, like publishing and sharing.

At Mezeo, we have both a deployable cloud infrastructure, and clients. That causes us to look at where the best place to put the functionality is. That creates a slightly different perspective, and we think it creates very useful products. On the other hand, nothing gets us more excited than the thought of more solutions that drive cloud storage adoption and usefulness. For this reason, we are rolling out a new marketing and certification program, Mezeo Ready™.

With Mezeo Ready™, service provider public storage clouds can easily identify their offering as being "Ready" for use by Mezeo Ready clients or gateways, and backup and archive solutions.  Users of these products can pick one of many trusted service providers hosting Mezeo Ready cloud storage solutions.  This cloud storage on ramp and cloud storage provider "ecosystem" ultimately delivers valuable solutions to customers and is a big part of Mezeo's vision for the cloud storage market.

More is to come on Mezeo Ready: we are nearing the official announcement of the program, and will extend it to storage providers and file system providers who work with Mezeo to deliver storage clouds, both private and public.  Other solutions, like billing and provisioning systems, will also be in the Mezeo Ready™ program.  The changes the cloud is delivering are new and useful, and deliver real value to the institutions and businesses that are embracing them.  The ecosystem is critical to the value delivery chain, and key to providing unique, desirable solutions.

Cloud Storage Redefined

The definition of cloud storage has been on my mind lately, and I think some attention to this topic is still called for.  From an article in CXO Today: "Cloud storage is not a disk array that you own, lease, or manage; neither is it a virtual logical unit number (LUN) from a larger disk array. It is in fact offered via an application programming interface (API) through which you can send and receive data without having to actively manage the storage."

I see many "Cloud Storage" services and vendors of cloud storage infrastructure products that do not do or provide for what is described in the preceding observation.  For example, some cloud storage services are really offerings of storage that are associated with cloud computing.  One major requirement is storing and retrieving cloud computing images.  Since these are "bootable" the typical storage approach is an iSCSI connected storage resource.  A cloud computing image may require files for its application, and these are often stored on shared storage systems, and may be accessed in a variety of ways, but not necessarily via Web services APIs. 

Often, IT service providers call this shared storage; however, when it is accessed by cloud computing images, it is often referred to as cloud storage.  Finally, block data, like a database, is often required for the application running on a cloud computing image, and is accessed via iSCSI, and may be referred to as cloud storage.

So, where do these observations lead us?

There are many benefits of a storage cloud, and for the user these include ease of access (in a variety of ways) to various amounts of storage on an as needed basis, with instant or nearly instant provisioning, with little if any traditional storage management requirements for the user.  IT service providers and enterprise IT organizations are fundamentally organized around the premise of service delivery.  So, for both of these entities, a service offering like cloud storage is an important business asset, and the primary differences in the deployed infrastructure are associated with multi-tenancy (which, among other things, drives different security requirements) and billing.

For many months, I have relied on the following definition of cloud storage: a persistent storage solution for objects (also called files or unstructured data) accessed via Web services APIs via a network (LAN or WAN). 

Today, I would like to move forward and offer a new definition, more encompassing, and reflecting not just a purist view but attempting to capture what is truly important for an IT service provider, in house or as a focus of a business (hosters, telcos, and cloud providers): 

"Cloud Storage provides whatever amount of storage you require, on an immediate basis.  It is persistent.  It can be accessed in a variety of ways, both in the data center where the cloud is housed, as well as via the Internet.  If you obtain this from an external provider, it is purchased on a pay as you go basis.  You do not manage it, you use it, and the service provider manages it." 

Here is how we depict this at Mezeo:

mezeocss.gif 

I strongly believe that obtaining, using, and decommissioning persistent storage in a simple, easy way, available in any quantity on a pay-for-use basis, and accessible in a variety of ways, via the Internet or at the data center where your application runs, is the heart of the matter.  If you get that service in house, or from an IT service provider, it should include the aforementioned characteristics.  This is a very inclusive definition, and it provides for traditional access methods, as well as programmable access (Web services APIs). 

Finally, here are three more points that are very important:

1) By Web Services API access, we mean API access to stored content!  This is different from APIs for storage management and is specific to a way of working with stored content.

2)  New applications, and retrofits, will ultimately expect "programmable" storage.  This is a classic "Innovator's Dilemma" scenario; I see it every day, and it is coming.

3)  HTTP access (Web services API or "programmable") is not slower than other access, but it does tolerate the latency of the Internet.  As a result, you will ultimately see that HTTP access of storage in a data center will be a preferred approach, because of its "programmability" and the desired performance.  This will not happen overnight, but it will happen.
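To make the first point concrete, here is a minimal sketch of what "API access to stored content" can look like over HTTP. It only builds the requests and does not send them. The endpoint `storage.example.com`, the `/v1/objects` path, and the helper names are assumptions made up for illustration; they are not Mezeo's API or any provider's.

```python
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://storage.example.com/v1/objects"


def build_put(name, content, content_type):
    """Build (but do not send) a request that stores content under a name."""
    return Request(
        f"{BASE_URL}/{name}",
        data=content,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def build_get(name):
    """Build (but do not send) a request that retrieves stored content."""
    return Request(f"{BASE_URL}/{name}", method="GET")


put_req = build_put("reports/q3.pdf", b"%PDF-...", "application/pdf")
get_req = build_get("reports/q3.pdf")

print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Note that both requests address the content itself by name. Nothing here provisions volumes or manages arrays, which is the distinction drawn in point 1.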

A hat-tip to Stephen Foskett is in order as well.  Take a look at this entertaining article in which he struggles to find an appropriate name for "cloud storage."

According to a recent Gartner press release, 20% of businesses will own no IT assets by 2012:

Several interrelated trends are driving the movement toward decreased IT hardware assets, such as virtualization, cloud-enabled services, and employees running personal desktops and notebook systems on corporate networks.

The need for computing hardware, either in a data center or on an employee’s desk, will not go away. However, if the ownership of hardware shifts to third parties, then there will be major shifts throughout every facet of the IT hardware industry. For example, enterprise IT budgets will either be shrunk or reallocated to more-strategic projects; enterprise IT staff will either be reduced or reskilled to meet new requirements, and/or hardware distribution will have to change radically to meet the requirements of the new IT hardware buying points.
This is a bold statement. If we believe Gartner, it means that we are at the beginning of an explosion in cloud-based services managed by trusted providers on behalf of the enterprise. Of course not all businesses will choose this path, but a substantial number of industries can and will. As I blogged about earlier, the message from the CFO office is clear. We will see adoption rates rise dramatically as the benefits of cloud services become more obvious to business leaders.

A second point of interest is the prediction that by 2012, India-centric IT services companies will represent 20 percent of the leading cloud aggregators in the market (through cloud service offerings).

Here’s the take-away:

Gartner is seeing India-centric IT services companies leveraging established market positions and levels of trust to explore nonlinear revenue growth models (which are not directly correlated to labor-based growth) and working on interesting research and development (R&D) efforts, especially in the area of cloud computing. The collective work from India-centric vendors represents an important segment of the market’s cloud aggregators, which will offer cloud-enabled outsourcing options (also known as cloud services).
We are witnessing examples of what GE innovation consultant Vijay Govindarajan calls reverse innovation in IT. Natarajan Chandrasekaran, the CEO of Tata Consultancy Services notes:

I’ve seen the new cloud-based computing models for applications and processes gaining currency in emerging markets. Rural cooperative banks and small and medium businesses in India are actually far ahead of their western counterparts in adopting these models. In fact, companies from emerging markets, buoyed by strong domestic revenues and revival in growth, have been making adjustments to their global strategies and fine-tuning their investments in order to be part of the recovery process in the west and build on their global expansion plans.
As the enterprise embraces the cloud, they’ll need a maturity model to help them on their journey. My next post will explore what the maturity model for cloud storage looks like. 

The Parallels Summit has been very successful for Mezeo, with excellent booth traffic, a number of leads and we still have this afternoon to go. Our business development and partner discussions have also been productive.

Why blog about this? Because this is representative of two secular trends in the hosting industry. First, the industry is maturing, the business issues are more compelling and the opportunities and the vendors are more serious and engaged. Second, the interest in the cloud and cloud storage is at an all time high. It’s really that simple and that visible.

If you're accessing your data anytime, anywhere in the cloud, location shouldn't matter, right?

As it turns out, it does. There are several reasons why it matters where your cloud storage is located:

Legal & Regulatory Policy: How do companies ensure they are archiving and protecting business data to comply with electronic data laws? According to BCS, for example, no matter what data storage and security strategy an organization uses, IT decision makers should consider these six key questions:

  1. Will content be stored and remain unaltered over the required retention time frame?
  2. How will this technology stay updated to ensure long-term availability of records?
  3. Does this technology enable the organization to retrieve data quickly enough to respond to a legal request within the stipulated deadline?
  4. Can this technology grow with the business and meet regulatory requirements?
  5. Can this technology be used with other content generating applications?
  6. How will this data storage architecture address litigation and discovery challenges?
Add to this the effect of country and international compliance regimes and you understand why companies need to determine which data storage regulations affect them and require compliance.  Since the cloud is so new, I can safely wager that the data storage laws of most countries will not yet have a statute for the cloud. Thus, physical data storage laws will still apply.  So your cloud storage may have to be located in-country. This is possible through geo-location and geo-replication.

Performance: To reduce network latency, cloud storage and the applications that access it should be as close together as possible, even in the cloud, and they need to be close to the end-user.  Thus New York-based users who use NY-based applications should have their storage in a cloud in the NY area as well. 

Backup & Replication: Cloud-based backup and recovery makes sense as well. Having multiple instances of your data replicated by geography is a key function for distributed datacenter replication, and shows potential for rapid growth. 

So, at Mezeo, we see three ways to think about cloud storage and geographic options and how to improve the distribution of data across geographically distributed data networks:

Geo-Location: Locating stored objects close to where they will be used, for faster access via the closest cloud storage instance using data center peering (this also allows you to define where you store your data/objects).

Geo-Replication: Replication through policies, with uninterrupted access to content.

Single Namespace: Providing a single means of access to stored objects regardless of where the objects are located.
 
Geographic placement supports creation of an object in a specific cloud storage instance.  At Mezeo, our replication policy allows for the specification of the locations of the replicas.  For example, the policy indicates "create the object in New York, LA, and Houston."  If an object is created in New York, it will be replicated to LA and Houston.  If it is created in Houston, it will be replicated to New York and LA.
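The New York, LA, and Houston example can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a policy is simply a list of locations; the `ReplicationPolicy` class and its `replica_targets` method are hypothetical names, not Mezeo's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical policy: every listed location must hold a copy."""
    locations: tuple

    def replica_targets(self, created_in):
        """Return the locations an object is copied to, given where it was created."""
        if created_in not in self.locations:
            raise ValueError(f"{created_in!r} is not covered by this policy")
        return [site for site in self.locations if site != created_in]


policy = ReplicationPolicy(locations=("New York", "LA", "Houston"))

print(policy.replica_targets("New York"))  # ['LA', 'Houston']
print(policy.replica_targets("Houston"))   # ['New York', 'LA']
```

The point of the sketch is that the policy names the full set of locations once, and the replication targets follow from wherever the object happens to be created.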

Some storage vendors support replication as a component of their disaster recovery recommendations.  If your selected storage vendor offers this option, then the storage solution could ensure there are at least two copies of every object in every instance of Mezeo's cloud storage.  Recovery in the case of disaster with this approach would be handled by the storage vendor's solution. 

By considering a combination of replication provided by storage vendors and replication provided by Mezeo, a service provider could offer a highly differentiated service.  Your customers would be assured of recovery in the case of any possible failure, from a single disk failure to a catastrophic data center loss.  Mezeo works with our service providers to determine the benefits of various replication options and the impact as you design your SLA level(s).

Policies are assigned in the onboarding/provisioning process and may be updated if requirements change.  There are also special situations for policy updates: for example, if a particular data center has a catastrophic outage, the policies associated with replication to the Mezeo instance in that data center can be modified.
trebryan.jpg

CloudStorageStrategy.com welcomes OpSource CEO Treb Ryan for an in-depth interview on cloud computing, from the perspective of the service provider.

NOTE: OpSource is a customer of Mezeo Software, the underwriter of this blog.


What are the opportunities you see in the cloud computing space, both for OpSource and your customers, and what impact has the downturn had on this?

It's interesting, but when people talk about cloud computing, they immediately go to the downturn and pricing - and cost being the big driver.  There's no question that cloud computing is cost effective, and it's accelerating adoption many times over, but what we're really seeing is something much more fundamental - a generation of users who are entering the workforce who've been using cloud computing all along; they've grown up on the Internet, and their interface to technology has always been through the Internet. 

As a result, this "Cloud Generation" has clear expectations of how technology should work:

1) it should be immediately available,
2) you do a search and get going,
3) it should be very flexible,
4) you should have ubiquitous access - anytime, anywhere,
5) sharing and collaboration - the expectation to collaborate and share anything they are working on.

This is not a generation which distinguishes between work data and home data - like my generation did. They've grown up with the concept of APIs and communities that grow around them; for instance, we see programmers who have grown up with Google and Facebook APIs, and now they expect that kind of thing in their work applications as well. So they're coming into the workforce and driving change in the workplace. They see technologies like client-server applications or hard-coded storage arrays pretty much the same way my generation saw green screens, mainframes, and mini-computers - as dated, inflexible, technology - hard to use, without nearly the power of cloud-based systems. So they have the day-to-day experience of the "consumer cloud" which they're now driving into business applications as well. 

To the Cloud Generation of programmers this means anything they can interact with on the Cloud they can program to through APIs. The idea of infrastructure being an item that can be addressed as part of the application, instead of something the application lays on top of, is a radical concept.  It has allowed not only for innovative applications, but also for true elastic computing making the Cloud environment even more flexible.

ops.gif

Great Cloud offerings have great communities around them. This is the aspect of Cloud computing that is so often missed - and even scoffed at - by the IT folks who think it's all about virtualization. One of the biggest gripes about Cloud computing is that support is done by the Community and not the vendor. While most will agree that far more proactive vendor support is necessary for Cloud computing, Community support is just as critical. For questions of configuration and usage tricks, the Community is a far better source of information than some call center employee with limited access. Often the Community devises more innovative solutions than the vendor ever could. And in addition to support, the Community can create third-party add-ins that make the Cloud even more useful.

The downturn has accelerated adoption from the top down as well.

We're seeing executives who have become enamored with this idea of the cloud - because of the ability to turn capital expenditures into operational expenses - and are pushing cloud computing into their organizations.  The CEO of one of our customers went so far as to tell his technical people - "now can you finally start using the cloud so I can get the board off my back?"

So, for different reasons, we have both top-down and grass-roots support for cloud-based applications, which makes this very interesting to say the least.

Which customer segments do you see leading the way in adoption?

Obviously, our traditional focus has been on ISVs and start-ups coming into Software-as-a-Service, business applications in the cloud, and we're seeing continued adoption of cloud infrastructure by those segments, but what has been interesting is that now that we offer the ability for any company to buy and use cloud infrastructure for any type of application, we're seeing a much broader spread of usage and adoption. Beyond the enterprise we also see widespread adoption by systems integrators, consultants, and VARs - up to 40% of our customer base - all without us targeting that segment at all.

How does OpSource differentiate its cloud offerings from other service providers?

We offer the best of the public cloud, combined with enterprise security and compliance, performance guarantees, and enterprise controls.

For instance, we offer:

  • easy online sign-up & purchase with infrastructure provisioning in minutes
  • pay by the hour and only for what you use, with no commitment (or purchase a monthly plan for a discount)
  • a rich online community to share and collaborate with peers; get third party add-ins, images and configurations
  • a web interface plus complete set of APIs
On the straight cloud, we provide many more robust, enterprise-grade tools than you see from more consumer-based providers like Amazon, for example.

We focus on three different areas:

1) Security and Compliance: we provide a much more secure environment, because OpSource provides every customer with a Virtual Private Cloud within the public Cloud, allowing them to determine their own degree of public Internet connectivity. We also provide:

  • Unique customizable security for firewalls
  • VPN administration of all servers
  • Unique username/password for each administrator
  • Audit logs of all environmental changes
  • SAS 70 audited
  • 100% uptime SLA
2) Performance: we offer a multi-tier architecture with guaranteed latency between systems, sub-millisecond access time, and industry standard technology, like VMware, instead of open-source, because that's where enterprise is comfortable.  Our 24/7 support also makes a difference.

3) Control: today's cloud environments are single user environments, with one user name and password, which is fine for individuals, but not so useful for the enterprise. We offer the ability to provision multiple users, do things like cross departmental billing, execute policy based control - which user can do what - and finally link all that back through an API to your existing management systems. So you can control how your users use the cloud the same way you do your corporate datacenter.
So do you see any links into these large companies where they need to use ITIL for systems management?

Absolutely. OpSource has always focused on compliance as a major issue for our SaaS customers, everything from SAS 70, PCI to European Safe Harbor, and even industry-specific ones like HIPAA, or government-specific certification, but in the cloud, we think about sophisticated management techniques like federated authority and single sign-ons, and things like ITIL - while it's still in its infancy, it's shocking that most providers don't even have the ability to give their customers the critical capability to have more than one person manage the cloud for them - because they have single-user accounts. So while you can institute more sophisticated IT governance regimes like ITIL with the OpSource cloud, we give IT the capability to manage who does what, and track who did what, even if they aren't ready for something like ITIL.

So IT gets to do their own provisioning?   
  
Yes. So you want to know who provisioned what, how much it costs, and we give them that visibility instantly across their entire user community.  That way there are no surprises or charges they aren't aware of. It sort of reminds me of the controls I had to put in to alert me to my daughter's texting costs - so I'm aware of the charges before they get out of hand! I just blogged about this issue.

That's why you say that OpSource is what Amazon wants to be when it grows up... 

Absolutely.

And that's how you respond to cloud critics - the ones that say that the Cloud is not yet ready for the enterprise.

There are large parts of the cloud that are not yet ready for the enterprise. The cloud is still young, and it would be like asking that first 286 PC to run all of your corporate financials. However, a lot of these issues around enterprise adoption like security and compliance have been addressed, and are being taken care of, so as the cloud becomes more robust, we'll see increased adoption. We're seeing enterprise-level capabilities come to market that did not even exist six months ago.

We have just signed a partnership agreement under which OpSource will resell Gomez's Web performance management solution to our enterprise customers as well as use it to validate and monitor our own cloud performance service level agreements (SLAs). Through this partnership, we'll bring powerful performance monitoring to cloud computing, making it easier and more compelling than ever for enterprises to justify bringing their applications to the cloud.

Do you see infrastructure elements like storage growing now?

For true, full use of the cloud, we have to have the ability to access storage, go through the APIs to get to it, and give our customers a range of storage solutions, including cloud storage based on the specific application or need. We're giving our customers the widest range of choices.

What about agile programming? I heard you use agile methods to improve the customer experience.

Agile programming methods have helped us with not only development, but compliance and security as well. We talk to our customers to see how they are using our cloud offerings through our community, and we learn what's important to them.

We also test our offerings by having two programmers work on the same keyboard - literally  - one with the user story - so they can make sure that the customer is getting the exact functionality they need.

It's agile customer service.

Can you tell us a bit about your enthusiasm for composite applications (corporate mashups) and how they help your platform?

Of all the phenomena in the cloud, we see the need for anytime-anywhere access and the idea that anything I can interact with I should also be able to program to.  So when Facebook enthusiasts start working in the enterprise, they bring their enthusiasm for integration as well.

So we see things in the cloud like direct access to the infrastructure as part of the application, which allows for all sorts of flexibility and robust usage.

We see real-time reporting applications of every kind you can imagine.  I myself am addicted to checking on everything that's coming out of our billing and customer systems tied into our Salesforce tabs.  So I'm always checking on the business in real-time via my iPhone.

I say this a lot, but integrating SaaS is a huge issue for today's enterprise. OpSource Connect can help SaaS companies -- of any size -- overcome integration hurdles and break out of the SaaS-only box. This speeds up adoption of SaaS in larger enterprise environments, opening the door for on-demand companies to cultivate business with large systems integrators. Plus, I'd say we're the only company providing Web operations from the ground up, addressing operational infrastructure, application management, and business operations. Today, integrations are expensive and one-to-one. For instance, while you can currently integrate your application with Google Maps as a composite application, OpSource Connect lets you integrate your app with many others, using just one platform. You can integrate your application with, for example, SAP, salesforce.com, Intuit QuickBooks, NetSuite, and a host of other SaaS and legacy applications. 

Everything is much more dynamic today, and programmers expect that. 
Here's an interesting read on some of the issues that traditional file systems face which can now be overcome with an object-based system. 

According to the author, Beth Pariseau:

Unstructured data is expected to far outpace the growth of structured data over the next three years. According to the "IDC Enterprise Disk Storage Consumption Model" report released last fall, while transactional data is projected to grow at a compound annual growth rate (CAGR) of 21.8%, it's far outpaced by a 61.7% CAGR predicted for unstructured data.

This is a direct result of the digital content explosion.

Robin Harris, senior analyst at StorageMojo observes:

"There are going to be extreme amounts of data as things like digital video and mobile networks grow; in five years, pretty much every phone will be 'smart,'...All of us storage geeks agree on that, and different people are beginning to visualize what that kind of growth needs in terms of storage infrastructure."
The article makes the case to "Think APIs, not files."  In essence, the point is as follows (as explained by Harris):

"File systems make less sense over time as the amount of data grows. Architecturally, it makes more sense for each file to have a unique 128-bit ID and use an Internet-like system for locating that file; a URL points to an address and there are files at that address, and object-based storage interfaces are essentially operating on the same principle."
The result, writes Pariseau, is that "with an object ID replacing a file name, more extensive data can accompany an object than the simple 'created,' 'modified' or 'saved on' fields available in traditional file systems. Thus, detailed policies can be applied to objects for more efficient and automated management. Without NFS or CIFS to serve up files to applications, object-based storage systems need to replace that layer of abstraction between raw blocks of data on disk and files that applications can recognize. Today's object-based systems use standard APIs such as Representational State Transfer (REST) and Simple Object Access Protocol (SOAP), or proprietary APIs to tell applications how to store and retrieve object IDs."
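As a rough illustration of "Think APIs, not files," here is a toy in-memory object store in Python. Objects are addressed by a 128-bit ID rather than a file name, and arbitrary metadata travels with each object so that policies can select on it. The `ObjectStore` class, its methods, and the metadata keys are all assumptions for the sake of the sketch, not any vendor's interface.

```python
import uuid


class ObjectStore:
    """Toy in-memory object store; a sketch, not any vendor's API."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        """Store data with arbitrary metadata and return its 128-bit ID."""
        object_id = uuid.uuid4()  # UUIDs are 128-bit identifiers
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Retrieve the stored bytes by ID; no path or file name is involved."""
        return self._objects[object_id][0]

    def metadata(self, object_id):
        """Return the metadata stored alongside the object."""
        return self._objects[object_id][1]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid
            for oid, (_, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = ObjectStore()
video_id = store.put(b"...video bytes...", content_type="video/mp4", retention_years=7)
note_id = store.put(b"meeting notes", content_type="text/plain", retention_years=1)

# A policy can select objects by metadata instead of by directory.
long_retention = store.find(retention_years=7)
```

Here `long_retention` contains only `video_id`, which is the kind of metadata-driven selection that makes automated, per-object policies practical.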

One of our key decisions when we designed Mezeo was the adoption of object-based architecture for cloud storage.  Mezeo can use traditional file systems as object-based systems to deliver cloud storage, and can also expose cloud storage as a traditional file system (even though it has objects underneath the covers), or as an object system.  This reflects our view that there will be a prolonged period of co-existence followed by a migration to object-based systems.

If you'd like to learn more about how Mezeo offers an agnostic storage services platform for storage service providers (SSP), take a look at this paper (registration required) by the same Robin Harris: Building a scalable shared file infrastructure. The paper gives service providers an introduction to:

  • Cloud storage applications and customer drivers
  • Mezeo's storage architecture and options
  • Basic shared file storage reference designs
In the paper, Harris says that there are multiple ways to build highly scalable storage for cloud storage applications. He tells us how SSPs can differentiate their offerings:

The Mezeo platform allows the special features of the storage to be delivered to customers, while giving SSPs a powerful platform on which to build a business. Understanding what storage choices will better meet target market needs is a critical success factor. SSPs can differentiate their cloud services by careful selection of back end storage systems. The Mezeo platform gives SSPs great flexibility. Understanding how to use that flexibility will be key to growing a successful cloud storage service business.
Harris also presents five reference configurations (see diagrams below) in the paper, which vary in performance, availability, scalability, self-management and, of course, cost.

CONFIGURATION # 1: NEXENTA

config1_nexenta.gif



CONFIGURATION # 2: PERMABIT


config2_permabit.gif

CONFIGURATION # 3: PARASCALE

config3_parascale.gif


CONFIGURATION # 4: Red Hat Enterprise Linux

config4_Red-Hat-Enterprise-.gif


CONFIGURATION # 5: NetApp

config5_NetApp.gif


DOWNLOAD:

Robin Harris' Building a scalable shared file infrastructure >>
