According to a recent Gartner press release, 20% of businesses will own no IT assets by 2012:
Several interrelated trends are driving the movement
toward decreased IT hardware assets, such as virtualization,
cloud-enabled services, and employees running personal desktops and
notebook systems on corporate networks.
The need for computing hardware, either in a data center or on an
employee’s desk, will not go away. However, if the ownership of
hardware shifts to third parties, then there will be major shifts
throughout every facet of the IT hardware industry. For example,
enterprise IT budgets will either be shrunk or reallocated to
more-strategic projects; enterprise IT staff will either be reduced or
reskilled to meet new requirements, and/or hardware distribution will
have to change radically to meet the requirements of the new IT
hardware buying points.
This is a bold statement. If we believe Gartner, it means that we are at the beginning of an explosion in cloud-based services managed by trusted providers on behalf of the enterprise. Of course not all businesses will choose this path, but a substantial number of industries can and will. As I blogged about earlier, the message from the CFO office is clear. We will see adoption rates rise dramatically as the benefits of cloud services become more obvious to business leaders.
A second point of interest is the prediction that by 2012, India-centric IT services companies will
represent 20 percent of the leading cloud aggregators in the market
(through cloud service offerings).
Here’s the take-away:
Gartner is seeing India-centric IT services companies leveraging
established market positions and levels of trust to explore nonlinear
revenue growth models (which are not directly correlated to labor-based
growth) and working on interesting research and development (R&D)
efforts, especially in the area of cloud computing. The collective work
from India-centric vendors represents an important segment of the
market’s cloud aggregators, which will offer cloud-enabled outsourcing
options (also known as cloud services).
We are witnessing examples of what GE innovation consultant Vijay Govindarajan calls reverse innovation in IT. Natarajan Chandrasekaran, the CEO of Tata Consultancy Services, notes:
I’ve seen the new cloud-based computing models for
applications and processes gaining currency in emerging markets. Rural
cooperative banks and small and medium businesses in India are actually
far ahead of their western counterparts in adopting these models. In
fact, companies from emerging markets, buoyed by strong domestic
revenues and revival in growth, have been making adjustments to their
global strategies and fine-tuning their investments in order to be part
of the recovery process in the west and build on their global expansion
plans.
As the enterprise embraces the cloud, they’ll need a maturity model to help them on their journey. My next post will explore what the maturity model for cloud storage looks like.
The Parallels Summit has been very successful for Mezeo, with excellent booth traffic and a number of leads, and we still have this afternoon to go. Our business development and partner discussions have also been productive.
Why blog about this? Because this is representative of two secular trends in the hosting industry. First, the industry is maturing: the business issues and the opportunities are more compelling, and the vendors are more serious and engaged. Second, the interest in the cloud and cloud storage is at an all-time high. It’s really that simple and that visible.
A recent report by Forrester's Andrew Reichman titled Business Users Are Not Ready For Cloud Storage: Current And Planned Adoption Of Storage-As-A-Service Is Minimal For Now paints a picture of cloud storage adoption that, at first blush, is not encouraging.
He states:
In Forrester's Enterprise And SMB Hardware Survey, North America And Europe, Q3 2009 survey, we asked businesses about their interest in "hosted storage capacity" offerings. Interest was minimal at best. Forty-three percent of all respondents said that they were simply not interested, and another 43% said that they were interested but had no plans to move forward.
While it could be argued that as a cloud storage supplier, I am necessarily bullish about the ultimate prospects, I believe the data is actually quite good and clearly represents what we are experiencing in the marketplace. Now, Mezeo is engaged with many service providers, as well as the early adopters in the enterprise space as they begin their evaluations.
When I look at enterprise cloud-storage adoption based on Everett Rogers' diffusion curve, I see a pretty clear view of the typical marketplace approach to adoption of disruptive technologies:
For new, emerging, and potentially disruptive technologies, we should look for what the next practices are, i.e. the practices of the innovators and early adopters. The survey reflects the typical technology adoption cycle and reinforces what we are experiencing in the marketplace.
11% of companies are taking the plunge - these are the early adopters and innovators. The early majority (43%) is interested, and watching. The late majority is not in the game, yet.
So we are on track. And to prove it, let's look at one of these enterprise-level innovators: General Electric.
According to IBM storage expert Tony Pearson, GE has implemented cloud-based backups and archive for GE Corp, NBC Universal
and GE Asset Management divisions running at only 32 cents per
GB/month, representing a 40-60 percent savings over their previous
methods. This includes backups of their external Web sites, archives of
their digital and production assets, RMAN backups including
development/staging databases. They plan to add out-of-region
compliance archive in 2010. They also plan to monetize their
intellectual property by offering "CloudStorage Manager" as a software offering for others.
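As a rough check on the figures quoted above, the savings range implies what the previous methods must have cost. This is only a sketch, and it assumes the 40-60 percent savings is measured against the earlier cost per GB per month; the function name is illustrative and not from the source.

```python
def implied_previous_cost(current_cost, savings_fraction):
    """Return the earlier cost implied by a current cost and a fractional saving."""
    # current = previous * (1 - savings)  =>  previous = current / (1 - savings)
    return current_cost / (1.0 - savings_fraction)


current = 0.32  # dollars per GB per month, as quoted above
low = implied_previous_cost(current, 0.40)
high = implied_previous_cost(current, 0.60)
print(f"Implied previous cost: ${low:.2f} to ${high:.2f} per GB per month")
```

Under that assumption, the earlier methods would have cost roughly 53 to 80 cents per GB per month.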
There are other comments in the Forrester report that range from the usual concerns of security and multi-tenancy to a discussion around lack of definition of use cases. While it is helpful to raise these typical concerns, they are not descriptive of our daily marketplace experience. Rather, they are more associated with what I call the two pillars of cloud storage understanding. The two pillars are as follows:
If you share the Pillar 1 view (and this is the case both in the enterprise and with many traditional storage suppliers), then the typical concerns may outweigh the advantages. However, consider Pillar 2, which addresses new application enablement and new capabilities that enable security, multi-tenancy and use case definition (Pillar 1 concerns). Pillar 2 represents a market maturity view that is shared by all of us: suppliers, service providers, and early adopters.
Remember, cloud storage came about in the IT Service Provider space, specifically as a source of storage for new applications being driven by hosted web applications. These applications are now extending into every facet of the information technology space, including IT service providers, the enterprise, SMB and consumer use cases.
You can no more dismiss cloud storage than you could SaaS or the web itself!
We define hybrid cloud storage as utilization of private cloud storage at an enterprise data center, or a private cloud hosted by an IT service provider with some combination of additional IT service provider-based public and/or private cloud storage.
In a recent post, Cloud Storage for the Enterprise - Part 1: The Private Cloud, we covered the definition and requirements of cloud storage as an enterprise solution, and as a technology deployed within enterprise-owned data centers (or at least within their co-location racks and cages). Fundamentally, a private cloud is also a non-multi-tenant cloud (i.e., used by only one entity or related parties within an enterprise or a public sector agency) that is behind the firewall(s). An additional solution that many enterprises are contemplating is the hybrid cloud, and we will look at the aspects of that solution in this post.
Before we begin our investigation of hybrid cloud, let's review some of the basics. The following diagram reviews the differences between public and private clouds:
Figure 1. Comparison of public and private cloud
Many enterprises are beginning their cloud evaluation with a "private cloud." I extend the definition of private cloud to be a "single tenant" cloud, as some enterprises may choose to use a single tenant cloud hosted at a service provider, versus hosting their cloud within their own data centers. In the following diagram, we show two private clouds, connected via policy-based replication in two data centers. This provides the assurance of backup and disaster recovery that many enterprises require. A third location could easily be added for even higher levels of backup and disaster recovery.
Figure 2. Private cloud inside an enterprise.
The growth of storage is driving increased costs, and the enterprise is on a continuous search to improve the way they can cost-effectively manage this growing data. The primary difference between hybrid cloud and private cloud is the extension of service provider-oriented low cost cloud storage to the enterprise. The service provider based cloud may be a private cloud (single tenant) or a public cloud (multi-tenant). There are several implementations of hybrid cloud, and several examples are included. The service provider cloud may enable enterprises to leverage the volume efficiencies of the service providers to realize additional savings.
A hybrid cloud provides a way of securely using service provider-based cloud storage in combination with enterprise clouds. Another implementation could be use of single tenant service provider-based private clouds at multiple locations.
Some examples of hybrid clouds are offered for your consideration, although not every potential approach is covered herein:
Figure 3. Hybrid cloud variation 1: private cloud inside an enterprise affiliated with a public cloud via a service provider.
Figure 4. Hybrid cloud variation 2: private cloud inside an enterprise with affiliated private cloud via a service provider.
Figure 5. Hybrid cloud variation 3: Private clouds at a service provider with multiple clouds.
Since the primary motivation for hybrid cloud is economics, let's begin the discussion with an understanding of the economics of cloud storage and then extend that discussion to the hybrid cloud environment.
The primary cost components of cloud storage include:
1. Data center occupancy - leased (co-location) or owned and depreciated.
2. Data center environmental - utilities, cooling, heating, etc.
3. Storage hardware (leased expense or capital requirements & associated depreciation).
4. File system and storage management (may be bundled in the storage hardware).
5. Cloud enablement or platform (discrete or bundled with the storage system).
6. Systems management and operational overhead.
7. Backup and disaster recovery.
While it can be argued that the economics at a large scale enterprise are very similar to those at a service provider, listed below are some of the most common reasons enterprises do turn to service providers for their technology solutions:
1. Capital conservation.
2. Distraction associated with infrastructure management.
3. Desire to outsource functions that are required but not associated with core competency (focus dilution).
4. Poor history of infrastructure management.
5. Specific issues, for example, out of data center space and not projecting long term needs to add additional data centers, or unable to expand existing data centers and no desire for an additional site.
6. Redundancy of networks available in data centers that may not be available in the enterprise without assuming additional costs.
Whatever the reason, service providers can solve these problems. In each of the three hybrid cloud scenarios, there are cost and security tradeoffs that each cloud use case will need to consider. For example, in hybrid cloud variation #1, the economics can be quite appealing, but there are significant security concerns. One approach to mitigating these concerns is to encrypt an object before it is replicated to a public cloud.
Understanding where key functionality is applied in your cloud stack is critical for successful implementation and highly dependent on the cloud and storage subsystem technology, cloud interoperability capabilities, and data use case. Critical technologies that provide benefits are: de-duplication, compression, encryption for data at rest and data in motion, geo location, geo replication, tagging and search capabilities, and cloud access methods. I will address underlying cloud technology requirements for the enterprise in my next post.
Cloud Use Case Definitions:
Data Archiving - Storing data for retention management requirements (such requirements may be internally generated, or associated with regulatory and compliance needs). Archive data must be highly secure, highly reliable over the archive period, and easily searchable. Archive data is generally encrypted, compressed and stored in a proprietary format. Access to the data is usually very infrequent and thus typical enterprises have leveraged slower access, cheaper tape media or redundant NAS to control costs. Typical data issues associated with archiving are maintaining the archive and eliminating what is known as bit rot of the data, which is where data becomes corrupt if stored in the same media for long periods of time and not accessed.
Data Backup - Storing data as a replacement copy in the event the original copy is somehow damaged or lost due to user error, system failure, or as a result of a disaster scenario. Backup data may or may not need to be highly secure or easily searchable, but must be available for quick restore when needed. This data is also generally encrypted, compressed and stored in a proprietary format. Access to the data is more frequent than with archive data and can be at any level of the organization. A single file, user, server, site, or the entire enterprise could potentially need to be restored to proper service, and backup data must support these highly variable access needs.
Data Access - Storing data in its original format for access by users or other applications. This type of data is frequently accessed and is the superset of the data that comprise backup and archive data. Access takes precedence over security, but the data needs to be easily and quickly searchable and retrievable by users and applications, and thus highly available. Typical issues with access data are the need for fast accessibility of frequently used data balanced against the overall cost associated with storing all the data. Enterprises often implement tier strategies to stage data in progressively lower cost media based on frequency of access.
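The tiering idea mentioned for access data can be sketched in a few lines. This is a minimal illustration only: the tier names and the 30-day and 180-day thresholds are assumptions chosen for the example, not values from any product or from this post.

```python
from datetime import date

# Hypothetical tiers, ordered from fastest/most expensive to slowest/cheapest.
# Each entry: (tier name, maximum days since last access).
TIERS = [
    ("primary", 30),
    ("nearline", 180),
]
ARCHIVE_TIER = "archive"


def choose_tier(last_accessed, today):
    """Pick a storage tier from how recently the object was accessed."""
    idle_days = (today - last_accessed).days
    for name, max_idle_days in TIERS:
        if idle_days <= max_idle_days:
            return name
    # Anything idle longer than every threshold goes to the cheapest tier.
    return ARCHIVE_TIER


today = date(2010, 3, 1)
samples = [
    ("report.doc", date(2010, 2, 20)),
    ("q3-results.xls", date(2009, 11, 15)),
    ("2007-contracts.zip", date(2008, 1, 10)),
]
for label, last in samples:
    print(label, "->", choose_tier(last, today))
```

A real policy would likely weigh more than idle time (object size, retention rules, restore-time requirements), but the staging logic has this general shape.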
Figure 6. Hybrid enterprise use case cloud technology requirements.
Hybrid cloud storage, which we have loosely defined as utilization of private cloud storage at an enterprise data center, or a private cloud hosted by an IT service provider with some combination of additional IT service provider-based public and/or private cloud storage, offers an approach that allows use case, economics and security to prevail when selecting the appropriate approach. Implementation will also be driven by the technological capabilities of the three building blocks of cloud storage: the cloud abstraction layer, file/object system choice and storage subsystem hardware.
So, our discussion of hybrid cloud storage has likely demonstrated at least one significant additional aspect, and that is complexity. Starting with use case definition and security requirements, combined with a clear understanding of the unique issues within each enterprise that affect cost, you can map a clear path to the cloud technology and selection of one or more cloud service providers. Finally, the trusted service provider continues to be another significant requirement for exploitation of hybrid cloud.
Security will continue to be a big issue for the cloud, and,
unfortunately, there will be at least one event this next year that is
disruptive to Cloud Storage adoption, be it data loss or unauthorized
data access. Security will be an even more important point of
evaluation for the use of specific Cloud Storage service offerings. The
“trusted service provider” becomes a requirement when selecting a cloud offering.
Cloud Storage will be characterized by a single word, “more”!
More adoption, more cloud storage offerings by more IT service
providers, more variation in cloud capabilities, and more worries and
concerns about the cloud.
The intersection of enhanced mobile devices with better wireless bandwidth will be combined with Cloud Storage to create exciting new work/life blended digital life applications. The user experience is of paramount importance.
Cloud Storage will see extraordinary adoption as a solution for backup,
archiving and for policy-based georeplication for disaster recovery.
If you're accessing your data anytime, anywhere in the cloud, location shouldn't matter, right?
As it turns out, it does. There are several reasons why it matters where your cloud storage is located:
Legal & Regulatory Policy: How do companies ensure they are archiving and protecting business data to comply with electronic data laws? According to BCS, for example, no matter what data storage and security strategy an organization
uses, IT decision makers should consider these six key questions:
Will content be stored and remain unaltered over the required retention time frame?
How will this technology stay updated to ensure long-term availability of records?
Does this technology enable the organization to retrieve data
quickly enough to respond to a legal request within the stipulated
deadline?
Can this technology grow with the business and meet regulatory requirements?
Can this technology be used with other content generating applications?
How will this data storage architecture address litigation and discovery challenges?
Add to this the effect of country and international compliance
regimes and you understand why companies need to determine which data
storage regulations affect them and require
compliance. Since the cloud is so new, I can safely wager that the
data storage laws of most countries will not yet have a statute for the
cloud. Thus, physical data storage laws will still apply. So your
cloud storage may have to be located in-country. This is possible
through geo-location and geo-replication.
Performance: To reduce network latency, cloud storage and the applications that access it should be as close together as possible, even in the cloud, and they need to be close to the end-user. Thus New York-based users who use NY-based applications should have their storage in a cloud in the NY area as well.
Backup & Replication: Cloud-based backup and recovery makes sense as well. Having multiple instances of your data replicated by geography is a key function for distributed datacenter replication, and shows potential for rapid growth.
So, at Mezeo, we see three ways to think about cloud storage and geographic options and how to improve the distribution of data across geographically distributed data networks:
Geo-Location: Locating stored objects close to where they will be used, for faster access via the closest cloud storage instance using data center peering (this also allows you to define where you store your data/objects).
Geo-Replication: Replication through policies, with uninterrupted access to content.
Single Namespace: Providing a single means of access to stored objects regardless of where the objects are located.
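To make the geo-location and single-namespace ideas concrete, here is a small sketch of a resolver that maps one logical path to whichever copy is closest to the user. It is not Mezeo's implementation; the class name, site names, and latency figures are all hypothetical and exist only for the illustration.

```python
# Hypothetical round-trip latencies (ms) from a user's region to each
# cloud storage instance. Real values would come from measurement.
LATENCY_MS = {
    "new-york": {"new-york": 5, "houston": 40, "los-angeles": 70},
    "los-angeles": {"new-york": 70, "houston": 35, "los-angeles": 5},
}


class SingleNamespace:
    """Maps one logical object path to the sites that hold a copy."""

    def __init__(self):
        self._locations = {}  # path -> set of site names

    def register(self, path, site):
        """Record that `site` holds a copy of `path`."""
        self._locations.setdefault(path, set()).add(site)

    def resolve(self, path, user_region):
        """Return the site holding `path` that is closest to the user."""
        sites = self._locations.get(path)
        if not sites:
            raise KeyError(f"no copy of {path!r} is registered")
        latencies = LATENCY_MS[user_region]
        # sorted() makes the choice deterministic when latencies tie.
        return min(sorted(sites), key=lambda site: latencies[site])


ns = SingleNamespace()
ns.register("/finance/q4.pdf", "houston")
ns.register("/finance/q4.pdf", "los-angeles")
print(ns.resolve("/finance/q4.pdf", "new-york"))
print(ns.resolve("/finance/q4.pdf", "los-angeles"))
```

The caller only ever uses the logical path; where the bytes actually come from is decided by the resolver.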
Geographic placement supports creation of an object in a specific cloud storage instance. At Mezeo, our replication policy allows for the specification of the locations of the replicas. For example, the policy indicates "create the object in New York, LA, and Houston." If an object is created in New York, it will be replicated to LA and Houston. If it is created in Houston, it will be replicated to New York and LA.
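The New York, LA, and Houston example can be expressed as a tiny function. This is only a sketch of the rule as described in the paragraph above, with a policy modeled as a plain list of site names; the function name and data shape are assumptions, not the product's interface.

```python
def replication_targets(policy_sites, origin):
    """Return the sites an object must be copied to, given where it was created."""
    if origin not in policy_sites:
        raise ValueError(f"{origin!r} is not part of the policy")
    # Every site in the policy except the one that already has the object.
    return [site for site in policy_sites if site != origin]


policy = ["New York", "LA", "Houston"]
print(replication_targets(policy, "New York"))
print(replication_targets(policy, "Houston"))
```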
Some storage vendors support replication as a component of their disaster recovery recommendations. If your selected storage vendor offers this option, then the storage solution could ensure there are at least two copies of every object in every instance of Mezeo's cloud storage. Recovery in the case of disaster with this approach would be handled by the storage vendor's solution.
By considering a combination of replication provided by storage vendors and replication provided by Mezeo, a service provider could offer a highly differentiated service. Your customers would be assured of recovery in the case of any possible failure, from a single disk failure to a catastrophic data center loss. Mezeo works with our service providers to determine the benefits of various replication options and the impact as you design your SLA level(s).
Policies are assigned in the onboarding/provisioning process and may be updated if requirements change. There are also special situations that call for policy updates: for example, if a particular data center has a catastrophic outage, the policies associated with replication to the Mezeo instance in that data center can be modified.
As we enter 2010, I am going to focus on a series of articles to define the cloud storage opportunity and the business issues for the enterprise. First, there are some "universal truths" that we need to better understand and define.
The growth in unstructured data will continue, unabated. We all know and understand that. The issue is how to manage this phenomenon, while operating with the assumption that the growth will likely accelerate. Since the growth is driving increased costs, the enterprise is on a continuous search to improve the way they can cost-effectively manage this growing data.
Data may exist on removable media, on PCs and PDAs, on various servers within the organization, at data centers, at remote facilities, and potentially at various outsourced service providers. The data may range from employee personal information (and even personal information from the employees' associates) that is not associated with the needs of the business to non-confidential and confidential business information, some of which may be highly critical. Disparate policies will need to be applied to the data ranging from no control to extreme control. Of course, there will be the existence of multiple versions of files adding to the total storage and further exacerbating the challenges of management.
There are many potential solutions to the problem as stated above, and most of them involve some sort of additional controls, policies and restrictions that control the proliferation of data and make it more orderly and secure. These solutions are then combined with additional focus on reducing storage costs by staying aligned with new storage technology (which continues to reduce costs of storage), and the cycle repeats, endlessly. In each cycle, trade-offs associated with costs, availability, security, access, restrictions occur, and rarely is there a "perfect" solution.
Is cloud storage a possible solution to the issues as surfaced above? Is it a discontinuity, a departure, from the "business as usual" cycles associated with ongoing, incremental and continuous storage improvements when new technologies are introduced as they can be accommodated?
Let's start with discussing cloud storage and its various capabilities. Note that we are talking about a storage cloud that is housed at the enterprise data center, not a storage service provider.
(1) First, centralize the storage problem:
Cloud Storage addresses the necessary size and scale of unstructured data growth in the enterprise. Generally, highly scalable file systems, including newer object based systems, provide the ability to manage incredibly large numbers of objects (objects of all sizes) in an efficient fashion. This is combined with low cost commodity storage devices and servers. Then a centralized storage pool is ready for use. It is generally easy to add additional storage to this pool, and both backup and disaster recovery schemes are in place. So, the first well known method of problem solving that cloud storage utilizes is "centralization." Let's get a solution in place that we know can scale to the size of the data needs of the enterprise.
(2) Second, make it easy to use:
You can't use it if you can't get it, and this is where the topic of "thin provisioning" emerges. Thin provisioning just means that it is easy to get a storage account (whether I am an individual user or an application / server) and I can get it quickly, no matter how much I need (in theory). Further, as my storage needs increase, it is easy to get more - quickly. There are issues like accounting for storage; managing growth and billing for it that also surround the notion of thin provisioning.
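A toy model helps show why thin provisioning makes accounts quick to hand out and why the accounting matters. This sketch is an assumption-laden illustration (the class, method names, and numbers are invented for the example): quotas are promises, and physical capacity is consumed only when data is written.

```python
class ThinPool:
    """Toy storage pool: quotas are promises, capacity is consumed only on write."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.quota_gb = {}  # account -> logical quota promised
        self.used_gb = {}   # account -> capacity actually written

    def provision(self, account, quota_gb):
        # Granting a quota reserves nothing, so it can be immediate.
        self.quota_gb[account] = quota_gb
        self.used_gb.setdefault(account, 0)

    def write(self, account, size_gb):
        if self.used_gb[account] + size_gb > self.quota_gb[account]:
            raise ValueError("account quota exceeded")
        if self.total_used() + size_gb > self.physical_gb:
            raise RuntimeError("pool is out of physical capacity; add storage")
        self.used_gb[account] += size_gb

    def total_used(self):
        return sum(self.used_gb.values())

    def oversubscription(self):
        """Ratio of promised capacity to physical capacity."""
        return sum(self.quota_gb.values()) / self.physical_gb


pool = ThinPool(physical_gb=1000)
pool.provision("alice", 500)
pool.provision("backup-server", 2000)
pool.write("alice", 20)
pool.write("backup-server", 300)
print(pool.total_used(), pool.oversubscription())
```

The pool has promised more than it physically holds, which is exactly why usage tracking, growth management, and billing have to accompany thin provisioning.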
Access is another big topic that surrounds ease of use. The enterprise has multiple needs here. Legacy applications, utilizing file access methods like CIFS or NFS, will want to utilize the storage cloud. New applications, written to REST Web services APIs, will also want to coexist. Finally, individual users will want access from all their device types, including PCs (Windows and Mac, Linux), the Web, and PDAs. All of this access manifests itself in interesting ways, including identity management of the credentials associated with using the service, bandwidth requirements for accessing the service from many diverse locations, and geo location of data (i.e., if you have several locations where the cloud data is kept, how do you decide which location to use?).
(3) Third, sync your files to the cloud:
Now that you have cloud storage, you ought to think about backup and sync to the cloud. These two applications are different but somewhat linked. Sync to the cloud can be used for both cloud loading (getting the data from the device to the cloud, in a background way so that the latency will not be a problem) as well as keeping a current copy in the cloud, but using the local copy on your device (the best of both worlds). Since your most current copy is in the cloud, it is your backup copy. Sync is also a solution for keeping files "synchronized" between devices and the cloud, so you always have an authoritative source of your file stored in the cloud. Of course, all this is based on having cloud access from any device, anywhere (see number two, above).
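One common way to decide what a sync client needs to send is to compare content digests. The following is a minimal one-way sketch under that assumption; it is not a description of any particular sync product, and it ignores deletions, conflicts, and multi-device merges. The file names and contents are made up.

```python
import hashlib


def digest(data):
    """Content fingerprint used to detect changes."""
    return hashlib.sha256(data).hexdigest()


def plan_sync(local_files, cloud_digests):
    """Return the names of local files that are new or changed and must be uploaded."""
    to_upload = []
    for name, data in sorted(local_files.items()):
        if cloud_digests.get(name) != digest(data):
            to_upload.append(name)
    return to_upload


local = {
    "notes.txt": b"v2 of my notes",
    "budget.xls": b"unchanged",
    "new.doc": b"draft",
}
cloud = {
    "notes.txt": digest(b"v1 of my notes"),
    "budget.xls": digest(b"unchanged"),
}
print(plan_sync(local, cloud))
```

Only the new and changed files are queued, so the upload can run in the background without resending everything.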
(4) Fourth, create new, higher impact applications with programmable storage:
Programmable (using HTTP, SOAP or REST APIs) access to storage is the next big revolution in storage. Tagging, sharing, collaboration, easy search, easy and secure access and multiple views make creating new, high impact applications easier than before. Take advantage of new functionality that is easily delivered. Create applications that rely on your data and data that is external to the enterprise. Develop these applications quickly and at lower cost. If all you want is cheaper storage, you may be able to get by without a cloud, but without this capability you are missing the revolution that is upon us.
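As an illustration of what programmable storage looks like from the application side, here is a sketch that builds (but does not send) a REST request to tag a stored object. The endpoint, URL layout, JSON fields, and token scheme are all hypothetical; a real cloud storage service defines its own API.

```python
import json
import urllib.request

# Hypothetical endpoint; a real service defines its own paths and fields.
BASE_URL = "https://storage.example.com/v1"


def build_tag_request(object_id, tags, token):
    """Build (but do not send) a request that attaches tags to a stored object."""
    body = json.dumps({"tags": sorted(tags)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/objects/{object_id}/tags",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumed bearer-token scheme, for illustration only.
            "Authorization": f"Bearer {token}",
        },
    )


req = build_tag_request("42", {"contract", "2010"}, token="example-token")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The point is that tagging, search, and sharing become ordinary web calls that any application can make, rather than operations tied to a file system mount.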
(5) Fifth, secure your cloud:
In my own survey of the industry, security is the major issue on the minds of the IT department evaluating cloud storage for the enterprise. Several different aspects of security come into play. Many of these issues are most often associated with using a multi-tenant storage cloud from a storage service provider. Nevertheless, four major security issues prevail before we even begin to consider the issues of going to the cloud at a service provider.
The four issues are: physical security, unauthorized access, data loss (disaster or device failure related) and bit rot (a subset of data loss, granted). All of these issues are no different than what you face with your traditional shared storage solutions and most of the solutions are similar. Your current IT physical security solutions apply to an enterprise hosted cloud. The identity management policies and practices associated with creating and maintaining account credentials address unauthorized access, just as they do with your current data management practices. Encryption can provide additional protection from unauthorized access. As a matter of fact, the security issues are already in play with your current storage methodology, so nothing new here, unless you move to a service provider hosted cloud (more on this later).
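Bit rot is usually caught by recording a checksum when an object is written and re-verifying it later. A minimal sketch of that idea follows, assuming SHA-256 checksums kept separately from the data; the object names and contents are invented for the example.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def find_corrupt(objects, recorded):
    """Return names whose current contents no longer match the recorded checksum."""
    return sorted(
        name for name, data in objects.items()
        if checksum(data) != recorded[name]
    )


# Record checksums at write time.
stored = {
    "archive-2008.tar": b"original archive bytes",
    "photo.jpg": b"original photo bytes",
}
recorded = {name: checksum(data) for name, data in stored.items()}

# Simulate silent corruption of one object on the media.
stored["photo.jpg"] = b"original phot0 bytes"

print(find_corrupt(stored, recorded))
```

Detection is only half the job: once a mismatch is found, the object still has to be repaired from a good replica or backup copy.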
(6) Sixth, lower the cost of storage:
Cloud storage delivers the benefits as discussed in items one through four above, while requiring similar security to current storage activities. How does it address costs? First, cloud storage solutions generally allow for using commodity hardware, very scalable file systems, and highly automated provisioning and management solutions. So, the hardware price equation of differentiation and premium pricing is disrupted. True, the software doesn't come cheap, but remember that the public cloud storage services are "making the market" and the combination of commodity hardware, environmentals, and enabling software (file system, management and middleware from one or more suppliers) is meeting the external marketplace pricing. Here is a simple model you should use (all figures expressed in dollars per GB per month):
Commodity Hardware depreciation                     $ .02
Environmentals (data center, power and cooling)       .02
Management (primarily people resources)               .02
Enabling Software                                     .03
Other                                                 .01
Total costs:                                        $ .10 (10 cents/GB/Month)
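The model is easy to turn into a small calculator so you can substitute your own numbers. The component values below are the ones listed above; the 100 TB capacity and the decimal conversion of 1 TB = 1,000 GB are assumptions made for the example.

```python
# Cost components in dollars per GB per month, taken from the model above.
COMPONENTS = {
    "commodity hardware depreciation": 0.02,
    "environmentals": 0.02,
    "management": 0.02,
    "enabling software": 0.03,
    "other": 0.01,
}


def cost_per_gb_month(components):
    """Sum the per-GB monthly cost components."""
    return sum(components.values())


def monthly_cost(components, capacity_gb):
    """Total monthly cost for a given stored capacity in GB."""
    return cost_per_gb_month(components) * capacity_gb


rate = cost_per_gb_month(COMPONENTS)
print(f"${rate:.2f} per GB per month")
# Assumes 100 TB = 100,000 GB (decimal units).
print(f"${monthly_cost(COMPONENTS, 100_000):,.2f} per month for 100 TB")
```

At the 10 cents per GB per month rate, 100 TB works out to about $10,000 per month.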
This represents a significant saving for a solution that provides all the capabilities that cloud storage delivers. What's the catch? Well, not every type of application and use case for unstructured data is ideally served by cloud storage. However, many are, and the exceptions should be dealt with as one offs. The real catch is not taking advantage of this new technology, and all the opportunities it offers, for lowering cost while delivering improved capabilities to end users and applications around the enterprise.
My next post will discuss hybrid, private and public cloud storage offerings, and where savings and security can drive significant benefits for enterprises who take advantage of the cloud storage offerings of service providers.
What are the opportunities you see in the cloud computing space,
both for OpSource and your customers, and what impact has the downturn
had on this?
It's interesting, but when people talk about cloud computing, they immediately go to the downturn and pricing - and cost being the big driver. There's no question that cloud computing is cost effective, and it's accelerating adoption many times over, but what we're really seeing is something much more fundamental - a generation of users who are entering the workforce who've been using cloud computing all along; they've grown up on the Internet, and their interface to technology has always been through the Internet.
As a result, this "Cloud Generation" has clear expectations of how technology should work:
1) it should be immediately available,
2) you do a search and get going,
3) it should be very flexible,
4) you should have ubiquitous access - anytime, anywhere,
5) sharing and collaboration - the expectation to collaborate and share anything they are working on.
This is not a generation which distinguishes between work data and home data - like my generation did. They've grown up with the concept of APIs and communities that grow around them; for instance, we see programmers who have grown up with Google and Facebook APIs, and now they expect that kind of thing in their work applications as well. So they're coming into the workforce and driving change in the workplace. They see technologies like client-server applications or hard-coded storage arrays pretty much the same way my generation saw green screens, mainframes, and mini-computers - as dated, inflexible, technology - hard to use, without nearly the power of cloud-based systems. So they have the day-to-day experience of the "consumer cloud" which they're now driving into business applications as well.
To the Cloud Generation of programmers this means anything they can interact with on the Cloud they can program to through APIs. The idea of infrastructure being an item that can be addressed as part of the application, instead of something the application lays on top of, is a radical concept. It has allowed not only for innovative applications, but also for true elastic computing making the Cloud environment even more flexible.
Great Cloud offerings have great communities around them. This is the aspect of Cloud computing that is so often missed - and even scoffed at - by the IT folks who think it's all about virtualization. One of the biggest gripes about Cloud computing is that support is done by the Community and not the vendor. While most will agree that far more proactive vendor support is necessary for Cloud computing, Community support is just as critical. For questions of configuration and usage tricks, the Community is a far better source of information than some call center employee with limited access. Often the Community devises more innovative solutions than the vendor ever could. And in addition to support, the Community can create third-party add-ins that make the Cloud even more useful.
The downturn has accelerated adoption from the top down as well.
We're seeing executives who have become enamored with this idea of the cloud - because of the ability to turn capital expenditures into operational expenses - and are pushing cloud computing into their organizations. The CEO of one of our customers went so far as to tell his technical people - "now can you finally start using the cloud so I can get the board off my back?"
So, for different reasons, we have both top-down and grass-roots support for cloud-based applications, which makes this very interesting to say the least.
Which customer segments do you see leading the way in adoption?
Obviously, our traditional focus has been on ISVs and start-ups coming into Software-as-a-Service, business applications in the cloud, and we're seeing continued adoption of cloud infrastructure by those segments, but what has been interesting is that now that we offer the ability for any company to buy and use cloud infrastructure for any type of application, we're seeing a much broader spread of usage and adoption. Beyond the enterprise we also see widespread adoption by systems integrators, consultants, and VARs - up to 40% of our customer base - all without us targeting that segment at all.
How does OpSource differentiate its cloud
offerings from other service providers?
We offer the best of the public cloud, combined with enterprise security and compliance, performance guarantees, and enterprise controls.
For instance, we offer:
easy online sign-up & purchase with infrastructure provisioning in minutes
pay by the hour and only for what you use, with no commitment (or purchase a monthly plan for a discount)
a rich online community to share and collaborate with peers; get third party add-ins, images and configurations
a web interface plus complete set of APIs
On the straight cloud, we provide a lot more robust enterprise tools than you see from more consumer-based providers like Amazon, for example.
We focus on three different areas:
1) Security and Compliance: we provide a much more secure environment, because OpSource provides every customer with a Virtual Private Cloud within the public Cloud, allowing them to determine their own degree of public Internet connectivity. We also provide:
Unique customizable security for firewalls
VPN administration of all servers
Unique username/password for each administrator
Audit logs of all environmental changes
SAS 70 audited
100% uptime SLA
2) Performance: we offer a multi-tier architecture with guaranteed latency in-between systems, sub-millisecond access time, and industry standard technology, like VMware, instead of open-source, because that's where enterprise is comfortable. Our 24/7 support also makes a difference.
3) Control: today's cloud environments are single user environments, with one user name and password, which is fine for individuals, but not so useful for the enterprise. We offer the ability to provision multiple users, do things like cross departmental billing, execute policy based control - which user can do what - and finally link all that back through an API to your existing management systems. So you can control how your users use the cloud the same way you do your corporate datacenter.
So do you see any links into these large companies where they need to use ITIL for systems management? Absolutely. OpSource has always focused on compliance as a major issue for our SaaS customers, everything from SAS 70, PCI to European Safe Harbor, and even industry-specific ones like HIPAA, or government-specific certification, but in the cloud, we think about sophisticated management techniques like federated authority and single sign-ons, and things like ITIL - while it's still in its infancy, it's shocking that most providers don't even have the ability to give their customers the critical capability to have more than one person manage the cloud for them - because they have single user accounts. So while you can institute more sophisticated IT governance regimes like ITIL with the OpSource cloud, we give IT the capability to manage who does what, and track who did what, even if they aren't ready for something like ITIL.
So IT gets to do their own provisioning? Yes. So you want to know who provisioned what, how much it costs, and we give them that visibility instantly across their entire user community. That way there are no surprises or charges they aren't aware of. It sort of reminds me of the controls I had to put in to alert me to my daughter's texting costs - so I'm aware of the charges before they get out of hand! I just blogged about this issue.
That's why you say that OpSource is what Amazon wants to be when it grows up... Absolutely.
And that's how you respond to cloud critics - the ones that say that the Cloud is not yet ready for the enterprise.
There are large parts of the cloud that are not yet ready for the enterprise. The cloud is still young, and it would be like asking that first 286 PC to run all of your corporate financials. However, a lot of these issues around enterprise adoption like security and compliance have been addressed, and are being taken care of, so as the cloud becomes more robust, we'll see increased adoption. We're seeing enterprise-level capabilities come to market that did not even exist six months ago.
We have just signed a partnership agreement under which OpSource will resell Gomez's Web performance management solution to our enterprise customers as well as use it to validate and monitor our own cloud performance service level agreements (SLAs). Through this partnership, we'll bring powerful performance monitoring to cloud computing, making it easier and more compelling than ever for enterprises to justify bringing their applications to the cloud.
Do you see infrastructure elements like storage growing now?
For true, full use of the cloud, we have to have the ability to access storage, go through the APIs to get to it, and give our customers a range of storage solutions, including cloud storage based on the specific application or need. We're giving our customers the widest range of choices.
What about agile programming? I heard you use agile methods to improve the customer experience.
Agile programming methods have helped us with not only development, but compliance and security as well. We talk to our customers to see how they are using our cloud offerings through our community, and we learn what's important to them.
We also test our offerings by having two programmers work on the same keyboard - literally - one with the user story - so they can make sure that the customer is getting the exact functionality they need.
It's agile customer service.
Can you tell us a bit about your enthusiasm for composite applications (corporate mashups) and how they help your platform?
Of all the phenomena in the cloud, we see the need for anytime-anywhere access and the idea that anything I can interact with I should also be able to program to. So when Facebook enthusiasts start working in the enterprise, they bring their enthusiasm for integration as well.
So we see things in the cloud like direct access to the infrastructure as part of the application, which allows for all sorts of flexibility and robust usage.
We see real-time reporting applications of every kind you can imagine. I myself am addicted to checking on everything that's coming out of our billing and customer systems tied into our Salesforce tabs. So I'm always checking on the business in real-time via my iPhone.
I say this a lot, but integrating SaaS is a huge issue for today's enterprise. OpSource Connect can help SaaS companies -- of any size -- overcome integration hurdles and break out of the SaaS-only box. This speeds up adoption of SaaS in larger enterprise environments, opening the door for on-demand companies to cultivate business with large systems integrators. Plus, I'd say we're the only company providing Web operations from the ground up, addressing operational infrastructure, application management, and business operations. Today, integrations are expensive and one-to-one. For instance, while you can currently integrate your application with Google Maps as a composite application, OpSource Connect lets you integrate your app with many others, using just one platform. You can integrate your application with, for example, SAP, salesforce.com, Intuit QuickBooks, NetSuite, and a host of other SaaS and legacy applications.
Everything is much more dynamic today, and programmers expect that.
Here's an interesting read on some of the issues that traditional file systems face which can now be overcome with an object-based system.
According to the author, Beth Pariseau:
Unstructured
data is expected to far outpace the growth of structured data over the
next three years. According to the "IDC Enterprise Disk Storage
Consumption Model" report released last fall, while transactional data
is projected to grow at a compound annual growth rate (CAGR) of 21.8%,
it's far outpaced by a 61.7% CAGR predicted for unstructured data.
This is a direct result of the digital content explosion.
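To make the gap between those two growth rates concrete, here is a small sketch that compounds each CAGR over the three-year horizon mentioned above. The function name is illustrative; the only inputs are the two percentages quoted from the IDC report.

```python
def growth_factor(cagr: float, years: int) -> float:
    """Return the multiple by which a quantity grows at a given CAGR."""
    return (1.0 + cagr) ** years


# CAGR figures quoted above: 21.8% (transactional) and 61.7% (unstructured).
transactional = growth_factor(0.218, 3)
unstructured = growth_factor(0.617, 3)

print(f"transactional data after 3 years: x{transactional:.2f}")
print(f"unstructured data after 3 years:  x{unstructured:.2f}")
```

At these rates, transactional data does not quite double in three years, while unstructured data grows to more than four times its starting size.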
Robin Harris, senior analyst at StorageMojo observes:
"There
are going to be extreme amounts of data as things like digital video
and mobile networks grow; in five years, pretty much every phone will
be 'smart,'...All of
us storage geeks agree on that, and different people are beginning to
visualize what that kind of growth needs in terms of storage
infrastructure."
The article makes the case to "Think APIs, not files." In essence, the point is as follows (as explained by Harris):
"File
systems make less sense over time as the amount of data grows. Architecturally, it makes more sense for
each file to have a unique 128-bit ID and use an Internet-like system
for locating that file; a URL points to an address and there are files
at that address, and object-based storage interfaces are essentially
operating on the same principle."
The result, writes Pariseau, is that "with
an object ID replacing a file name, more extensive data can accompany
an object than the simple 'created,' 'modified' or 'saved on' fields
available in traditional file systems. Thus, detailed policies can be
applied to objects for more efficient and automated management. Without
NFS or CIFS to serve up files to applications, object-based storage
systems need to replace that layer of abstraction between raw blocks of
data on disk and files that applications can recognize. Today's
object-based systems use standard APIs such as Representational State
Transfer (REST) and Simple Object Access Protocol (SOAP), or
proprietary APIs to tell applications how to store and retrieve object
IDs."
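The idea in these quotes can be sketched in a few lines: each object gets a 128-bit ID instead of a file name, carries arbitrary metadata, and is selected for policy purposes by that metadata. This is a toy in-memory illustration only; the class names, the metadata keys, and the `select` method are assumptions made up for the example and do not describe Mezeo's or any vendor's actual API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy in-memory object store keyed by 128-bit IDs (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # A UUID is a 128-bit value; .hex renders it as 32 hex characters.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id].data

    def select(self, **criteria) -> list:
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid
            for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
photo_id = store.put(b"jpeg bytes", content_type="image/jpeg", retention="7y")
note_id = store.put(b"meeting notes", content_type="text/plain", retention="1y")

# A policy engine could act on every object carrying a given retention tag.
long_lived = store.select(retention="7y")
```

Because the application only ever holds an ID, the store is free to move or replicate the underlying bytes without the application noticing.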
One of our key decisions when we designed Mezeo was the adoption of object-based architecture for cloud storage. Mezeo can use traditional file systems as object based systems to deliver cloud storage, and can also expose cloud storage as a traditional file system (even though it has objects underneath the covers) or as an object system. This reflects our view that there will be a prolonged period of co-existence followed by a migration to object based systems.
If you'd like to learn more about how Mezeo offers an agnostic storage services platform for storage service providers (SSP), take a look at this paper (registration required) by the same Robin Harris: Building a scalable shared file infrastructure. The paper gives service providers an introduction to:
Cloud storage applications and customer drivers
Mezeo's storage architecture and options
Basic shared file storage reference designs
In the paper, Harris says that there are multiple ways to build highly scalable storage for cloud storage applications. He tells us how SSPs can differentiate their offerings:
The Mezeo platform allows the special features of the storage to be delivered to customers, while giving SSPs a powerful platform on which to build a business. Understanding what storage choices will better meet target market needs is a critical success factor. SSPs can differentiate their cloud services by careful selection of back end storage systems. The Mezeo platform gives SSPs great flexibility. Understanding how to use that flexibility will be key to growing a successful cloud storage service business.
Harris also presents five reference configurations (see diagrams below) in the paper, which vary in performance, availability, scalability, self-management and, of course, cost.
As the industry announcements on Cloud Storage APIs keep coming, the confusion surrounding what they mean keeps growing.
We have the Amazon S3 APIs, Eucalyptus APIs, Rackspace Cloud Files APIs, Mezeo APIs, Nirvanix APIs, Simple Cloud API, along with the standards proposed by the Storage Networking Industry Association (SNIA) Cloud Storage Technical Work Group, and more.
So what should you do or think about all this? What impact do these Cloud Storage APIs have on your decision-making? Just how important are they, and what's next?
Here's some information to aid your understanding of this emerging and important technology. Let's begin by answering two basic questions:
What is a Cloud Storage Application Programming Interface (API)?
A Cloud Storage Application Programming Interface (API) is a method for access to and utilization of a cloud storage system. The most common of these are REST (REpresentational State Transfer) APIs, although there are others, which are based on SOAP (Simple Object Access Protocol). All of these are associated with establishing requests for service via the Internet.
What is REST? REST is a concept introduced in the doctoral dissertation of Roy Fielding, and is widely recognized as an approach to "quality" scalable API design. The actual API design and capabilities are very dependent on the actual capabilities of the underlying Cloud Storage System.
One of the most important REST capabilities is that it is a "stateless" architecture. This means that everything needed to complete the request to the storage cloud is contained in the request, so that a session between the requestor and the storage cloud is not required. Why is this important? The Internet is highly latent: it has an unpredictable response time, and it is generally not particularly fast when compared to a local area network (LAN). Once you get a request, there is no guarantee that you can ask a "qualifying question" of the requestor in a reasonable time period. So, REST is an approach that has very high affinity to the way the Internet works. Traditional file storage access methods that use NFS (Network File System) or CIFS (Common Internet File System) do not work over the Internet, because of latency.
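As a rough illustration of what "stateless" means in practice, the sketch below builds a single HTTP request that carries everything the storage cloud needs: the resource URL, the credentials, the content type, and the data itself. The endpoint, the URL layout, and the bearer-token scheme are all assumptions invented for the example, not any particular vendor's API, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_put_request(base_url: str, container: str, name: str,
                      data: bytes, token: str) -> Request:
    """Build a self-contained PUT request for one object (hypothetical API)."""
    url = f"{base_url}/containers/{container}/objects/{name}"
    return Request(
        url,
        data=data,
        method="PUT",
        headers={
            # Credentials travel with every request; no login session exists.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )


# storage.example.com and the token are placeholders for illustration.
req = build_put_request(
    "https://storage.example.com/v1", "photos", "cat.jpg",
    b"\xff\xd8", "example-token",
)
print(req.get_method(), req.full_url)
```

Since no earlier exchange is assumed, any server in the storage cloud can handle this request, and a retry after a timeout is simply the same request sent again.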
One other thing we should clear up: Cloud Storage is for files, which some refer to as objects, and others call unstructured data. Think about the "files" stored on your PC, like pictures, spreadsheets and documents. These have an extraordinary variability, thus "unstructured". The other kind of data is "block" or "structured" data. Think database data, data that feeds transactional systems that require a certain "guaranteed" or low-latency performance. Cloud Storage is not for this use case. IDC estimates that approximately 70% of the machine stored data in the world is unstructured, and this is also the fastest growing data type.
So, Cloud Storage is storage for files that is easily accessed via the Internet. This does not mean you cannot access Cloud Storage on a private network or LAN, which may also provide access to a storage cloud by other approaches, like NFS or CIFS. It does mean that the primary and preferred access is by a REST API. (Here are other terms you will see, RESTful, or RESTlike or RESTstyle, which is geekspeak for how closely the API conforms to the REST approach.)
Today, there are multiple definitions for Cloud Storage, and the one I prefer is "File Storage accessed through Web Services APIs over a network". This represents the key attributes of file storage that is cloud storage, versus other types of file storage. Other key qualities of a storage cloud are:
multi-tenant support (use by more than one unrelated user)
geo location and geo replication
seamless and real time provisioning of accounts
availability of "practically" unlimited amounts of storage "on-demand"
"pay for use", which means that your payment is for actual storage used, over some time frame, usually a month.
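To show how "pay for use" differs from paying for provisioned capacity, here is a minimal billing sketch. It assumes the provider charges for the average amount stored over the month at a flat rate per GB; both the averaging method and the rate are assumptions for illustration, since actual providers meter and price in different ways.

```python
def monthly_charge(daily_gb: list, rate_per_gb_month: float) -> float:
    """Charge for the average GB stored over the month (assumed model)."""
    average_gb = sum(daily_gb) / len(daily_gb)
    return round(average_gb * rate_per_gb_month, 2)


# Hypothetical month: 100 GB for the first 15 days, 200 GB for the last 15.
usage = [100.0] * 15 + [200.0] * 15
charge = monthly_charge(usage, rate_per_gb_month=0.15)
print(f"charge for the month: ${charge:.2f}")
```

Under this model the customer pays for an average of 150 GB, not for the 200 GB peak and not for any capacity reserved in advance.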
There are many who are still arguing about what I have defined above, but what I've said is generally accepted by the industry. If it is a vendor doing the arguing, I would suggest you check under their hood; usually you will find that they do not offer whichever of the above features they are trying to argue out of the definition.
Also, traditional storage vendors continue to proclaim the importance of local network access (like NFS, CIFS or iSCSI) for the purpose of Cloud Storage access by applications that today can only access via the older protocols. This requires that the application making the request be on the same local network (think same data center) as the storage cloud. Their reason for this view is that they are only just beginning to see application demand for storage cloud access via REST APIs, versus their traditional business model which serves an enterprise user with their own data center.
This is why Cloud Storage has generally emerged as a service offering in the IT Service Provider (also known as the Web hosting industry) space first. In this space, there is no doubting the importance and future of REST API access to storage clouds; it is only viewed as an adoption speed issue. Note that within the data center, access to storage using an HTTP based protocol is not necessarily any slower than one of the more traditional protocols. API access has been labeled as being a slower form of access than NFS and CIFS. This view is largely due to the fact that it "may" be accessed over the Internet. In most cases, it is the network that adds the latency, not the means of access. Make no mistake, traditional storage vendors see this coming, and they will make offerings available in the near future.
REST APIs are language neutral and therefore can be leveraged, very easily, by developers using any development language they choose. Resources within the system may be acted on through a URL. So, an API is not a "programming language" it is the way a programming language is used to access a storage cloud. This is part of the basic understanding of APIs that is required to discuss the dreaded "vendor lock in" and upcoming "cloud lock in" discussions and understand the issues that surround these assertions.
REST APIs are also about changing the state of a resource through representations of that resource. They are not about calling web service methods in a functional sense. The key differences between different Cloud Storage APIs are the URLs defining the resources and the format of the representations.
The Cloud Storage space is very young and everyone has their opinions on how things should be represented and accessed. Efforts are underway by organizations like SNIA, with their Cloud Data Management Interface (CDMI), to standardize both the resource structure and the representations. However, standards are not developed overnight and customers are demanding programmatic access to Cloud Storage now.
Current Cloud Storage vendors have produced a basic set of APIs that are accomplishing fairly similar things, and other APIs that expose the underlying unique functionality of the Cloud Storage platform supplying the storage cloud. You should expect that, over time, most storage clouds will provide the basic functions in somewhat similar ways, and further that additional advanced functions will be adopted and expected to be in every storage cloud offering.
Finally, you should look for a taxonomy of APIs, that includes basic file functions, advanced functions, Provisioning APIs, Billing APIs, and Management APIs. Storage clouds that become successful will offer all these capabilities, to increase the efficiency of their use.
Several efforts have been made to simplify the transition between vendors by providing an abstraction layer on top of the vendor's APIs. In this approach, a program library is created, for use in the application that needs cloud storage access, and this API translates (for the given program language) a single API into the API that is specific to a Cloud Storage offering. So, the application, which is using this library, writes their APIs once, and achieves portability between storage clouds that are supported by this approach.
This approach has been largely programming language specific and may take advantage of the language it was designed for. Good examples of this are jClouds, an open source cloud storage abstraction library written in Java, and Simple Cloud API, a collaboration of vendors including Microsoft, Rackspace, Nirvanix, IBM and Zend which provides a simplified Cloud Storage interface for PHP developers. While extremely useful for developers, these abstractions tend to expose the lowest common denominator relating to Cloud Storage functionality and may omit critical features, for example only providing namespace object access as opposed to ID access.
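The abstraction-layer approach, and its lowest-common-denominator drawback, can be sketched as follows. The two "vendor" clients here are invented stand-ins with deliberately different method names; they are not the real jClouds, Simple Cloud API, or any vendor SDK. The application codes against one common interface, and an adapter per vendor does the translation.

```python
from abc import ABC, abstractmethod


class VendorAClient:
    """Hypothetical vendor SDK: bucket/key access only."""

    def __init__(self):
        self._blobs = {}

    def upload_blob(self, bucket, key, payload):
        self._blobs[(bucket, key)] = payload

    def download_blob(self, bucket, key):
        return self._blobs[(bucket, key)]


class VendorBClient:
    """Hypothetical vendor SDK: path access plus access by object ID."""

    def __init__(self):
        self._by_id = {}
        self._paths = {}

    def store(self, path, content):
        object_id = f"obj-{len(self._by_id) + 1}"
        self._by_id[object_id] = content
        self._paths[path] = object_id
        return object_id

    def fetch_by_path(self, path):
        return self._by_id[self._paths[path]]

    def fetch_by_id(self, object_id):
        return self._by_id[object_id]


class CloudStorage(ABC):
    """Common interface: only what every vendor can support."""

    @abstractmethod
    def put(self, container: str, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, container: str, name: str) -> bytes: ...


class VendorAStorage(CloudStorage):
    def __init__(self, client: VendorAClient):
        self._client = client

    def put(self, container, name, data):
        self._client.upload_blob(container, name, data)

    def get(self, container, name):
        return self._client.download_blob(container, name)


class VendorBStorage(CloudStorage):
    def __init__(self, client: VendorBClient):
        self._client = client

    def put(self, container, name, data):
        # The object ID returned by vendor B is discarded: the common
        # interface has no way to expose ID-based access.
        self._client.store(f"{container}/{name}", data)

    def get(self, container, name):
        return self._client.fetch_by_path(f"{container}/{name}")


def archive_report(storage: CloudStorage, data: bytes) -> bytes:
    """Application code written once against the common interface."""
    storage.put("reports", "q3.txt", data)
    return storage.get("reports", "q3.txt")


results = [
    archive_report(VendorAStorage(VendorAClient()), b"Q3 numbers"),
    archive_report(VendorBStorage(VendorBClient()), b"Q3 numbers"),
]
```

`archive_report` runs unchanged against either backend, which is the portability benefit. The cost shows in `VendorBStorage.put`: vendor B's ID-based access exists but is unreachable through the shared interface.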
So, let's discuss lock-in, the term used to express concern that once a vendor has gotten you to exploit their architecture and technology, they will recognize that you are committed to them and cannot easily move away. As a result, they will then raise their prices and take advantage of your lock in status, keeping their price just below the amount that would encourage conversion away from their technology and towards a more "open" set of capabilities. Let's look at all the "dreaded" examples that have been raised around cloud storage as a reason to slow its adoption:
1. API lock in, which means your interaction with a storage cloud uses the APIs of that storage cloud, and suggests that you cannot easily move to another provider's cloud with their own, different APIs.
2. Vendor lock in, which means that because your application was developed against specific APIs, you are condemned to use only a cloud from a specific supplier.
3. Device lock in, meaning that you developed a cloud storage based program utilizing the APIs of that specific cloud, for a specific device (generally a PDA) that has specific functionality. This is double lock in, both the device programming methodology and the API selection.
4. Browser lock in, meaning that programming to specific APIs can also be rendered unique based on the Web browser that is selected.
5. Programming language lock in, which means that you have written the APIs in a language like Python, or Java, or .NET, or whatever.
6. API wrapper lock in, which means that you incorporated libraries into your application that allow your application to write generic APIs, which are then translated by these libraries to the correct API for the desired storage cloud (this is what Simple Cloud API is).
So, as you can see here, utilizing cloud storage could ultimately have you locked in on at least six levels!
With this much opportunity for vendor abuse, why are developers rushing to write Web based applications that utilize cloud storage services via API access? Are they simply uncontrolled, unthinking rebels who will shortly learn the error of their ways? Have they made a fatal error? Or do they know something you don't?
First, learn about Cloud Storage APIs. What they do is make storage programmable, and they abstract storage from the application. They offer advanced functionality (the programmable word) that makes it faster and easier to write the applications that are scalable versus the traditional storage access approaches. When you add these two capabilities to the storage cloud offering of low cost, availability in multiple locations, seamless provisioning, ease of adding additional storage, and the pay for use model, the case for the cloud has become compelling.
Where are we seeing early adoption: at service providers, because they host Web based applications and SaaS (usually Web based) applications, and this is where the developers who recognize the opportunity are focused.
What is coming: the introduction of this technology into the enterprise, complete with the adoption of the RESTful API technology. This will ultimately lead to a level of cooperation between service providers and the enterprise that has long been predicted. Enterprises will move to an IT modeled on an OPEX model, and expect their applications to be provisioned and interacting with service provider clouds, via APIs. IT Service Providers are racing to build the clouds to provide for this emerging business opportunity.
So, what about the lock in mentioned above? Sit down with your developer; they will show you why they don't feel "locked in". They will show you that you can quickly recraft your current APIs, in the programming language of your choice, to utilize the new APIs of the desired cloud. For this reason, Simple Cloud API will likely be a short term measure, which precedes base case APIs that are extremely similar, and goes through a market led process to identify "best practice" APIs for both base case and advanced function, as well as all the other API led capabilities as mentioned above. In short, vendor lock in is not the problem for this technology that it has been for others. Also, the ingenuity and resourcefulness of all the suppliers, standards groups, and market adoption scenarios will continue to mute your ability to be lock in free.
Your real challenge is not lock-in, but rather how to adopt this new set of capabilities, and solve problems and create opportunities with your IT solutions as rapidly as possible. Standing on the sidelines waiting for this one to resolve will keep you out of a great opportunity, because we still have several meaningful years of rapid change associated with this technology adoption cycle.