Archive for the ‘Uncategorized’ Category

The API is not the Cloud

Wednesday, June 1st, 2011

“If we’re not using the APIs, we aren’t using the cloud.” That’s what he said, and that’s what I can convince myself to believe every other Tuesday. (That’s when I like to exercise my self-skepticism.)

What’s the context? I was involved in a discussion with a prospect implementing a private cloud, and he was dead-set against using the cloud either as a file server or as block storage. And as I said, I can see his point of view, every once in a while. After all, block storage protocols like SCSI or ATA are somewhat limited in scope, and web APIs offer interesting capabilities, like tagging and versioning.

But the more I think about it, the more I realize that the provocative declaration “REST API = cloud” tells only half of the cloud story. Sure, it’s got the more public history to it; after all, the cloud only began to emerge as a buzz-worthy concept after big Web vendors like Amazon started providing programmatic interfaces. And there’s still plenty of confusion and uncertainty about what the term “cloud” even means, so putting a stake in the ground seems like it might be a valid way to break from the quaint old ways. Plugging a computer into a disk drive! That’s so 1990s! ¹

In truth, the notions that the cloud embodies arose from two different camps, with two fundamentally separate sets of concerns. On the one side, you have the cloud as a natural extension of the WWW: a new way for people, communities, and companies to share information and work alongside each other, thousands of miles apart. That’s the kind of cloud that’s exemplified by companies like Dropbox. On the other side, you have the cloud as a natural extension of the utility computing environment: an elastic resource that grows when you need it, shrinks when you don’t, and leverages economies of scale to drastically reduce overheads. The best exemplar of the utility storage cloud, I’d say, is us.

There is a substantial distance between these two camps. Issues that are a minor concern for sharers (communitarians?) are absolutely unacceptable to utilitarians. Likewise, when we take security as a primary design concern, we deliberately lop off whole branches of possible sharing features. Sure, we’re working on novel ways to leverage more of the massively scalable back-end that the cloud provides, but we always view the goal as “plug it into the wall and forget about it”. You don’t have to buy a new generator every time you get a new dishwasher; why should you need a new array every time you get a new server?

I won’t spend much time rehashing the differences between sharers and utilitarians. There are valid reasons to integrate both models into businesses and personal lives. There’s no reason they can’t coexist and even cooperate. But to declare either model as the definitive cloud is short-sighted and a little silly.

Except on every other Tuesday.

¹Here’s an easy way to tell that you’re talking to an old storage guy: we don’t plug disk drives into computers, we plug computers into disk drives.

Video: What’s New in CloudArray 2.5

Tuesday, May 24th, 2011

We recently added a new video tour on our website that describes all that’s new in our 2.5 release of CloudArray.  CloudArray version 2.5 extends our software with exciting new features and adds two new appliances that round out the top of our model line.

Watch the video below to see all the details:

…and don’t forget to visit our CloudArray page to learn more.

Cloud storage: Why the whole is greater than the sum of parts, Part II

Friday, May 20th, 2011

Nicos was giving a talk recently, and made a statement along these lines: “With the hybrid data center, the whole is greater than the sum of parts. If your data is stored locally and augmented with cloud storage, then you’ve got a system that’s much more reliable than either one on its own.” (He uses a lot more words in Part I of this series.)

Now, since I’m a geek (no, really, it’s true!) I can’t just take an adage at face value. And I’m thinking, “Well, of course not, if you’re talking about reliability. Really, the whole is much more dependent on the product of its parts.” And so I wrote down this equation:

[Equation: parts_product]

Where did this equation come from? ¹

In our hybrid data center, to lose a chunk of data, you’ve got to have simultaneous failures in both the local data center and in the cloud. If you’re using a cheap SATA drive locally, with something like a 99.9% uptime (three 9s), but you’re using a cloud provider that advertises 99.999999999% (eleven 9s) durability, what does the whole system look like? Plug each part into the equation, and you get 99.999999999999% (fourteen 9s) durability: much better than the sum of its parts.
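As a rough check of that arithmetic, here is a minimal Python sketch. It assumes the local and cloud failures are independent and treats each quoted percentage as the probability of not losing data; the function name `hybrid_durability` is illustrative only and not part of any product.

```python
from decimal import Decimal, getcontext

# Enough precision to hold fourteen 9s without floating-point rounding.
getcontext().prec = 30


def hybrid_durability(local: Decimal, cloud: Decimal) -> Decimal:
    """Durability of a system that loses data only if BOTH copies fail.

    Assumes the local and cloud failures are independent events.
    """
    p_local_fail = 1 - local
    p_cloud_fail = 1 - cloud
    # Both copies must fail at once, so the failure probabilities multiply.
    return 1 - p_local_fail * p_cloud_fail


local = Decimal("0.999")          # three 9s
cloud = Decimal("0.99999999999")  # eleven 9s
combined = hybrid_durability(local, cloud)
print(f"Combined durability: {combined}")
```

The failure probabilities are 10^-3 and 10^-11, so their product is 10^-14, which is where the fourteen 9s come from.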

That equation also points to a reason that I am uncomfortable with the marketing category of “cloud gateway”. A true gateway product wouldn’t have the same multiplicative effect. At best, it would simply pass on the reliability of the cloud, but at worst it would introduce new failure dependencies. The whole actually would be the sum of its parts:

[Equation: parts_sum]

The hybrid data center, with local and cloud storage working together to ensure greater data availability and durability, is the target that we’ve always been working towards with CloudArray, even if that’s not what we’ve always called it.

And you know what’s really cool? That’s what we’ve done.

¹ Where did this equation come from? Well, we’re used to talking about uptime and reliability in terms of 9s, e.g. “this array has five 9s of reliability, with 99.999% uptime!” That is, at its heart, a probabilistic statement: the probability that the array will not experience a failure is 0.99999, and conversely the probability of failure is 0.00001.

Now, one of the principles of probability theory is this: if you have two independent events that might occur, and you want to measure the probability that either one happened, you add them together (strictly, you then subtract the small chance that both happened, which is negligible when failures are rare). But on the other hand, if you want to measure the probability that both events happened, you’ve got to multiply them.
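Written out for two independent failure events A and B, the two rules look roughly like this. The symbols are illustrative shorthand, not the notation of the original equations, and the last line simply plugs in the three-9s and eleven-9s figures from the example in the post.

```latex
\begin{align*}
P(A \cap B) &= P(A)\,P(B) \\
P(A \cup B) &= P(A) + P(B) - P(A)\,P(B) \;\approx\; P(A) + P(B)
  \quad \text{when both are small} \\
P_{\text{loss}}(\text{hybrid}) &= (1 - 0.999)\,(1 - 0.99999999999)
  = 10^{-3} \cdot 10^{-11} = 10^{-14}
\end{align*}
```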

Your data can be safe even if a cloud provider evaporates

Tuesday, May 3rd, 2011

I’m sure you’ve seen the recent negative press on cloud providers.  There have been several newsworthy events in the past few weeks and I wanted to take the opportunity, in my inaugural post on the TwinStrata blog, to add a bit of perspective.

Perhaps overlooked in all the discussions of the Amazon outage was the fact that Amazon’s cloud storage services (S3 and RRS) were not affected: no data loss, no customer impact.  One of the fundamental characteristics of S3 and RRS, the one that makes them such a viable target for offsite storage, is that there are multiple, geographically dispersed copies of the data.  A single site outage, such as the one in Virginia, isn’t sufficient to bring the entire system offline.  The bottom line, for cloud storage users like TwinStrata customers, is that the redundancy of cloud storage helped them weather the outage without a hint of a failure.

Lucas Mearian stated in “What happens to data when your cloud provider evaporates” that there is no way to directly migrate customer data from the cloud in the case of a major outage. While the statement regarding “directly migrate customer data” is true, there are products available that can do so indirectly. While it may not be possible to migrate data directly from Vendor A to Vendor B, it is certainly possible to move that same data via an intermediary.  By using an intermediary, the whole point regarding data outages and data mobility becomes moot, since you have the ability to select multiple providers and/or move data from provider to provider or location to location.

Cloud is just the latest delivery model for outsourcing.  Thus the term “CloudSourcing”, which I rather like.  There’s no magic here, folks.  It’s just a delivery model.  Yes, it’s a Utility Model, but in IT we’ve been talking about a utility model for close to a decade.  A rose by any other name…  Grid, virtualization, outsourcing…  Cloud is evolutionary, NOT revolutionary, and it’s here to stay.  The trick is to keep your eye on the shifting sands.  Use the technology where it makes sense and don’t use it where it doesn’t.  There are many compelling reasons for using Cloud technology, regardless of how you define “cloud”, and I would argue that it is a question of when, not if, you will adopt it.

It’s inevitable that there’s fallout in the Cloud space.  Remember what I said about shifting sands?  There’s always fallout when technology is changing and, in IT, change is a constant.  This is a good thing: a sort of technology Darwinism.  It’s a simple fact that there will always be casualties in an evolutionary model.  That does not mean the sky is falling.  It means that another competitor did a better job or didn’t make some particular mistake.  It means that the survivor is stronger than the fallen, which is good for you.

Hype and exaggeration can be bothersome, but the overall message is a good one.  The Cloud is maturing.  The Utility delivery model is becoming reality.  Instead of being slaves to our equipment, we in IT can spend a bit more time solving the business problems our equipment and services are designed to address.  Instead of worrying about how much storage I need to allocate to this LUN, or how much memory and how many processors I need for this production server; instead of worrying if a server I build today will last three years; instead of all that, I can focus on getting the job done.  This is the value that cloud brings to your business.

Cloud Storage Arithmetic: 80% Faster Recovery

Thursday, March 31st, 2011

The topics of ROI and cost savings frequently come up when discussing moving a portion of data storage infrastructure to the cloud. It is good practice to compare the costs of purchasing dedicated infrastructure against the pay-as-you-go cloud model and find the “break-even point” at which the initial costs of investing in the cloud are outweighed by month-over-month cost savings. But that math does not capture the operational improvement businesses can experience from moving to cloud storage.
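For readers who want to run the break-even comparison themselves, it can be sketched in a few lines of Python. The figures below are made-up placeholders rather than real pricing, `break_even_month` is a hypothetical helper, and the sketch assumes a single up-front cost followed by the same saving every month.

```python
import math


def break_even_month(initial_cost: float, monthly_savings: float) -> int:
    """Return the first month in which cumulative savings cover the initial cost.

    Assumes the saving is identical every month. Raises ValueError when there
    is no saving, because the investment would then never pay for itself.
    """
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive to break even")
    return math.ceil(initial_cost / monthly_savings)


# Hypothetical figures: 12,000 to get started, 1,500 saved per month
# compared with running dedicated infrastructure.
months = break_even_month(initial_cost=12_000, monthly_savings=1_500)
print(f"Break-even after {months} months")
```

A calculation like this covers the cost side only; it says nothing about recovery times or manual effort, which is the point the rest of this post makes.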

Let’s take our recent customer case study, where AFGE, the largest federal employee union in the United States, was able to reduce VMware off-site restore times by 80% versus tape using a combination of CloudArray software and Veeam Backup software.  How exactly should a business value such a substantial operational improvement in off-site backup? Well, here’s how AFGE values it:

“Now that we know the backup data is successfully and securely going offsite, we can rest easily. With the deployment of CloudArray, we were able to cut our storage costs, reduce data recovery times from one week to one day, and eliminate much of the manual work of handling all those tapes resulting in a savings of one quarter of a FTE,” said Taylor Higley, IT director, AFGE.

As you can see, there is a lot to be said about improvements in operational efficiency. While cost savings is always a relevant consideration in moving to the cloud, so is peace of mind in protecting/recovering valuable data.

Find out how your business can benefit from improved IT operational efficiency. Learn more about CloudArray.

How to Back Up Your Cloud Storage

Tuesday, March 8th, 2011

Protecting your data in the cloud with a perspective on the Gmail outage

Recently, Howard Marks posted an article on Network Computing entitled “Can Cloud Snapshots Replace Backup?” Howard’s key implication in the article is that cloud storage gateways, onramps or enablers that reside on-premise can replace conventional backup with snapshots for primary data stored in the cloud. However, the article leaves open a point of exposure around the reliability of the cloud storage provider if both the primary copy of data and snapshots are in the cloud. In that case, the storage provider is responsible for the safekeeping of both your primary and backup data.

Let’s take a closer look at what may possibly go wrong with a cloud storage provider and your primary data in the cloud.

  1. For starters, a provider could experience data loss. Better providers offer redundancy and multi-datacenter replication that reduce or eliminate the risk of disk or data center failures. For instance, Amazon claims 99.999999999% durability of data stored on S3.  Assuming the cloud provider you choose maintains best practices around redundancy and data protection, the odds of data loss due to hardware or site failures are rather low.
  2. The more likely scenario is that a provider suffers an outage which prevents data access. To be fair, better providers offer 99.9% or even 99.99% availability, but none offer a 100% guarantee against outages, so a cloud provider outage could range from a temporary annoyance to a painful business outage. More 9’s of availability help, but keep in mind that your network availability is also a limiting factor.
  3. A third, though less likely, case is that the cloud storage provider goes out of business. Again this would depend highly on the quality of the storage provider, which may leave users a migration path, such as when EMC Atmos Online stopped supporting production customers. Whether or not there is a migration path, this case should be part of any risk analysis when it comes to storing business critical data in the cloud.
  4. Finally, the world is still buzzing from thousands of Gmail users losing access to email accounts in late February, an outage that was blamed on a bad software update. The words of Google vice president of engineering and site reliability Ben Treynor are quite telling on how Google is recovering these customers: “To protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs.” The sobering message is that just because data is stored in the cloud, it is not exempt from software bugs or even human error, and may still require an offline storage tier for recovery.

Although snapshots in the cloud can functionally replace backup, how can we address the paranoid system administrators who are now white-knuckled at the thought of the four aforementioned risks? SLAs or a refund of monthly fees may be a small consolation in the face of lost data.

Is there a foolproof way to eliminate any business continuity dependency on the cloud provider? It turns out there are in fact a few:

  • Keep a local data copy. This is almost too simple, but some cloud storage enablers allow you to keep full local copies of your data along with snapshots in the cloud. Some, like CloudArray, even allow you to grow a local cache to ensure entire data sets reside on-premise. Additional local storage tends to be cheap, while off-site storage is very expensive, making this a compelling way to continue operations even if a cloud provider suffers an outage.
  • Use multiple cloud storage providers. Some cloud storage enablers support multiple cloud storage providers. If so, replicating across providers is relatively simple, albeit at double the capacity and bandwidth costs. The additional cost may be a bit much to swallow, reducing the viability of this alternative.
  • Periodically copy data/snapshots off-cloud. This may sound familiar to the backup administrator. Recently, Marcel van den Berg published design guidelines for Veeam backup and stated “As a number one rule I would advise to store the backup on a different storage platform than the storage platform which is being protected.” The notion of where a storage platform in the cloud begins or ends is admittedly vague. It’s not clear that storing backups in the same cloud observes this fundamental rule. Enter “off-cloud” copies or backups, which can reside on-premise or on a separate cloud provider, and the problem can be solved. The notion of having primary data in the cloud and backups local may sound strange at first, but it is actually a very viable option that is supported by cloud storage enablers like CloudArray and provides protection against both on-premise and cloud disasters.
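The three options above share one test: at least one usable copy must live somewhere the primary’s provider cannot take down. A minimal sketch of that check follows. The `Copy` type, the location names, and the `survives_provider_failure` helper are all hypothetical, and treating each provider as a single failure domain is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set: primary, snapshot, or backup."""
    kind: str      # e.g. "primary", "snapshot", "backup"
    location: str  # e.g. "on-premise", "cloud-a", "cloud-b"


def survives_provider_failure(copies: list[Copy], failed_location: str) -> bool:
    """True if at least one copy lives outside the failed location."""
    return any(copy.location != failed_location for copy in copies)


# Primary data and snapshots in the same cloud: nothing survives its failure.
same_cloud = [Copy("primary", "cloud-a"), Copy("snapshot", "cloud-a")]

# Same setup plus a periodic off-cloud copy kept on-premise.
with_off_cloud = same_cloud + [Copy("backup", "on-premise")]

print(survives_provider_failure(same_cloud, "cloud-a"))      # False
print(survives_provider_failure(with_off_cloud, "cloud-a"))  # True
```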

With these alternatives, it is viable to store primary data in the cloud while minimizing risks of cloud provider failures, software reliability and even human error.

No data center is perfect, which is what makes off-site backup and disaster recovery (DR) planning a requirement for most organizations. Likewise, storing primary data in the cloud is no exemption from proper DR planning and best practices.

Bottom line: If you are considering using a cloud storage enabler, gateway or onramp to store primary data in the cloud, consider working with a product that minimizes your risk and a company that has already thought through and addressed the possible risks and DR contingency options.

Do you use cloud snapshots for backup? Let us know…

6 Key Features of Cloud Storage Gateways (On-ramps or Enablers)

Monday, February 28th, 2011

Are you considering cloud storage for your business?  There are many reasons you should.  Using innovative cloud technology, IT is solving data storage problems in new ways. Whether it’s for off-site data protection, disaster recovery or just storage capacity expansion, the pay-as-you go model pioneered by a number of cloud storage providers can be very compelling.

Rather than use cloud storage directly by writing to custom APIs, building your own security policies and architecting a performance framework to meet application needs, you may find that on-premise cloud storage software or hardware (i.e. gateways, on-ramps, enablers) makes integration simpler.  Purchasing a product that handles security, performance, data reduction and plug-and-play integration can significantly accelerate and simplify deployment.

With a handful of gateway products already on the market that can connect your on-premise environment to cloud storage, a natural question may (or should) be “what is the difference between these products?” above and beyond the aforementioned functionality.

To answer this, we’ve put together a list of 6 differentiating features you should consider when choosing a cloud storage gateway:

1) Dynamic caching policies to meet application needs:  A monolithic cloud storage cache may not be able to handle the performance needs for all applications. A backup application may benefit from a cache consisting of low-cost storage optimized for large sequential access, while an NTFS file system may benefit more from an SSD-based cache, optimized for smaller, more randomized access. Each application may require more or less cache over time. Having application-specific caching policies that are dynamic means you can meet needs of different applications using a single solution.

2) Option to replicate a local copy to the cloud: Some vendors argue that having a full local copy defeats the purpose of cloud storage – not at all true! Imagine replacing a real-time replicated secondary site requiring hardware, infrastructure and maintenance costs with a pay-as-you-go cloud! Or imagine not having a secondary site to begin with and now finding a 2-site replication solution within easy reach. This is a very compelling business proposition, particularly for transactional applications that require a full local copy for latency reasons.

3) In-cloud snapshots: Snapshots are rapidly becoming a key part of modernized backup and, when using the cloud, it is important to find out whether a gateway solution offers snapshots. If yes, are the snapshots copy-on-write and on-premise, meaning potential bandwidth thrashing between the local site and the cloud? Or are the snapshots in-cloud and redirect-on-write, meaning no bandwidth or performance penalty and ready availability in case of disaster? If you have the option of the latter, you may have gathered that it is far superior.
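The difference between the two snapshot styles is easier to see in a toy model. The Python sketch below is a deliberately simplified illustration, not how any particular gateway implements snapshots: `CowVolume` and `RowVolume` are hypothetical classes, every block read or write counts as one I/O, and the redirect-on-write volume is modeled as an append-only store.

```python
class CowVolume:
    """Toy copy-on-write volume: old data is copied out before an overwrite."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data, overwritten in place
        self.preserved = {}         # block index -> data as of the snapshot
        self.snapshot_taken = False
        self.io_ops = 0             # block reads + writes caused by host writes

    def take_snapshot(self):
        self.preserved = {}
        self.snapshot_taken = True

    def write(self, index, data):
        if self.snapshot_taken and index not in self.preserved:
            # First overwrite since the snapshot: read the old block and
            # write it to the snapshot area (two extra I/Os).
            self.preserved[index] = self.blocks[index]
            self.io_ops += 2
        self.blocks[index] = data
        self.io_ops += 1

    def read_snapshot(self, index):
        return self.preserved.get(index, self.blocks[index])


class RowVolume:
    """Toy redirect-on-write volume: new data lands in a fresh location."""

    def __init__(self, blocks):
        self.store = list(blocks)                     # append-only block store
        self.live_map = list(range(len(self.store)))  # logical block -> slot
        self.snapshot_map = None
        self.io_ops = 0

    def take_snapshot(self):
        # Freeze a copy of the block map; no data blocks are copied.
        self.snapshot_map = list(self.live_map)

    def write(self, index, data):
        self.store.append(data)  # one write, to a new slot
        self.live_map[index] = len(self.store) - 1
        self.io_ops += 1

    def read_snapshot(self, index):
        return self.store[self.snapshot_map[index]]


cow = CowVolume(["a", "b", "c"])
row = RowVolume(["a", "b", "c"])
for volume in (cow, row):
    volume.take_snapshot()
    volume.write(0, "x")
    volume.write(0, "y")  # second overwrite of the same block
    volume.write(2, "z")

cow_view = [cow.read_snapshot(i) for i in range(3)]
row_view = [row.read_snapshot(i) for i in range(3)]
print("copy-on-write I/Os:    ", cow.io_ops)
print("redirect-on-write I/Os:", row.io_ops)
print("snapshots agree:", cow_view == row_view)
```

Both volumes preserve the same point-in-time view, but the copy-on-write model pays two extra I/Os the first time each block is overwritten. When the live data and the snapshot area sit on opposite sides of a WAN link, those extra transfers may be what shows up as bandwidth thrashing.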

4) Block and file-level access: It’s amazing to hear arguments from vendors trying to convince users that file access is better than block access for cloud gateways. The reality is that there are advantages to file access and advantages to block access. Supporting both means supporting the widest variety of operating systems, file systems and applications, and then there is no longer any argument. Hint: having native block access (like iSCSI) means you can support both.

5) “Zero-friction” entry point to cloud storage: Deploying cloud storage should not mean continuing to spend additional CapEx/OpEx associated with traditional storage infrastructure and incurring the same 3-year upgrade cycles.  Sure, there are advantages to optimized hardware appliances for accessing cloud storage, but only when needs and budget dictate. A choice of software, hardware and subscription models, with upgrade paths between each, is the ideal way to start using cloud storage with minimal risk and cost and the ability to grow.

6) In-cloud disaster recovery and Compute-Anywhere capability: Once your data is in the cloud, you can access it anywhere, but how about in the cloud? Why not be able to leverage unlimited pay-as-you-go cloud compute cycles for disaster recovery or testing? Beyond disaster recovery, your data or snapshots of data can and should “work” for you in the cloud. You can even leverage Big Data without dedicated processing resources by using cloud compute. Think about a vision of a hybrid data center and how this capability can enhance IT.

In summary, not all cloud gateways, on-ramps, or enablers are equal, and it takes looking beyond the similarities in features to understand whether they will meet the needs specific to your environment and grow to meet your future needs. It pays to look under the covers before purchasing…

Perhaps you have found a cloud storage solution that has all of these features. If you haven’t, we suggest you consider a cloud storage solution that does.