By Greg Roody
In Part 1 of this posting, I discussed considerations around portability and security of a Cloud SAN software product. In this post, I’ll discuss the related issues of performance and consistency.
3. Performance and Consistency
Performance is one of the subjects I normally answer with the phrase “it depends”. You usually can’t get around it: there are so many different factors to consider that it’s hard to answer a simple question with a single answer. In cases involving replication, it gets even worse.
After you have compressed, deduped, and encrypted your local data, you still have to move it over a physical network to its eventual home. If you aren’t using a hybrid Cloud configuration, that can mean long latencies for your local application server. With no local cache, every write has to go out to the Cloud Service Provider, and an acknowledgement has to come back before the next write can occur. Reads would suffer a similar fate. In the worst-case scenario, that could amount to hundreds of milliseconds of lag time between I/Os. That’s not very good for local application performance.
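To put rough numbers on that, here is a minimal sketch of the ceiling that serialized, acknowledged writes impose. The function name and the sample round-trip times are assumptions for illustration (only the “hundreds of milliseconds” case comes from the discussion above); real links will vary.

```python
def max_sync_iops(round_trip_ms: float) -> float:
    """Upper bound on I/Os per second when each I/O must wait for
    the previous one's acknowledgement (one round trip per I/O)."""
    return 1000.0 / round_trip_ms


# Hypothetical round-trip times: local array, nearby provider, distant provider.
for rtt_ms in (0.5, 20, 200):
    print(f"{rtt_ms:>6} ms round trip -> at most {max_sync_iops(rtt_ms):.0f} I/Os per second")
```

At a 200 ms round trip, a strictly serialized application cannot issue more than five I/Os per second, regardless of how much bandwidth the link has.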
The solution to this problem is to make certain that your Cloud SAN software has a local cache, not only to buffer reads and writes but also to perform more advanced caching functions like LRU/MRU page management and pre-fetch. By storing the most frequently used data in a local cache, write (and read) performance is greatly improved. In fact, if the software supports a variable caching policy, the entire target volume can be cached locally, creating a second local copy of your data that is then asynchronously replicated to your Cloud Storage Provider. With variable caching, you can even establish QoS criteria for each volume you replicate.
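As a rough illustration of the idea, the sketch below shows a toy write-back cache with LRU eviction. It is not how any particular Cloud SAN product works; the class name, the method names, and the use of a plain dict to stand in for the cloud volume are all assumptions made for the example.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy LRU block cache with write-back. Hypothetical names throughout."""

    def __init__(self, capacity, backend):
        self.capacity = capacity      # max number of blocks held locally
        self.backend = backend        # dict standing in for the cloud volume
        self.blocks = OrderedDict()   # block_id -> data, oldest first
        self.dirty = set()            # blocks written locally, not yet replicated

    def write(self, block_id, data):
        # Writes complete locally; replication happens later.
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        self.dirty.add(block_id)
        self._evict()

    def read(self, block_id):
        if block_id in self.blocks:           # cache hit: local speed
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        data = self.backend[block_id]         # cache miss: slow remote path
        self.blocks[block_id] = data
        self._evict()
        return data

    def _evict(self):
        # Drop least recently used blocks, flushing dirty ones first.
        while len(self.blocks) > self.capacity:
            victim, data = self.blocks.popitem(last=False)
            if victim in self.dirty:
                self.backend[victim] = data
                self.dirty.discard(victim)

    def flush(self):
        # Stand-in for asynchronous replication of all dirty blocks.
        for block_id in sorted(self.dirty):
            self.backend[block_id] = self.blocks[block_id]
        self.dirty.clear()
```

Setting the capacity to the size of the whole volume corresponds to the fully cached case described above: nothing is ever evicted, and the cloud copy is updated only by the background flush.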
Another consideration is what happens to performance when the network link to your remote Cloud Storage Provider drops. In the case of a non-hybrid solution, or when you are only partially caching your writes, your applications will likely slow down significantly and eventually stop responding while they wait for the network connection to be restored. Applications might simply pause, or they might time out. Fully cached volumes would simply keep operating normally, but the dirty page pool would continue to grow, and with it the amount of data waiting to be replicated.
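A back-of-the-envelope sketch of that growth, and of how long the backlog takes to drain once the link returns, might look like the following. The function names and the sample rates are assumptions for illustration only, and the model ignores compression, deduplication, and rewrites of the same block.

```python
def outage_backlog_mb(write_rate_mb_s: float, outage_s: float) -> float:
    """Dirty data accumulated in the local cache while the link is down."""
    return write_rate_mb_s * outage_s


def drain_time_s(backlog_mb: float, link_mb_s: float, write_rate_mb_s: float) -> float:
    """Time to catch up after the link returns, while new writes keep arriving."""
    spare = link_mb_s - write_rate_mb_s
    if spare <= 0:
        return float("inf")  # the link can never catch up
    return backlog_mb / spare


# Hypothetical example: 2 MB/s of new writes, a 10 minute outage, a 10 MB/s link.
backlog = outage_backlog_mb(2, 600)
print(backlog, drain_time_s(backlog, 10, 2))
```

The point of the second function is that catch-up depends on the spare bandwidth, not the raw link speed: if the link only just keeps pace with steady-state writes, the backlog never shrinks.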
By using fully cached volumes, you can actually achieve local performance levels for data writes that are being replicated to offsite storage.
So what about the consistency of all that data? How is consistency maintained during normal write operations and during bulk updates?
Remember that the local cache stores all writes before they are asynchronously replicated to the cloud. However, unless the Cloud Storage pool itself is protected during these updates, it’s possible that some writes will be received out of sequence due to network issues or that some data will be lost in transit. That protection can be provided in a number of ways: either by strict write ordering (which is data intensive) or by batching up the writes and sending them off as a consistent group, which saves bandwidth. In either case, it’s important that the Cloud SAN software manages the updates so that it can either guarantee the full ordered update or roll back incomplete or out-of-sequence updates. This capability should exist independently of the cache function within the Cloud SAN software (the Cloud copy must be protected against cache failure or network errors).
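The batching approach can be sketched roughly as follows. This is only an illustration of the all-or-nothing idea, not any vendor’s protocol: the class, the sequence number, and the expected write count are hypothetical, and a real system would also verify checksums and persist its staging area.

```python
class CloudVolume:
    """Toy remote volume that applies write batches all-or-nothing.
    All names here are hypothetical."""

    def __init__(self):
        self.blocks = {}        # committed, consistent state
        self.last_applied = 0   # sequence number of the last committed batch

    def apply_batch(self, seq, expected_count, writes):
        """Apply a batch of (block_id, data) writes, or reject it untouched."""
        if seq != self.last_applied + 1:
            return False        # out of sequence: refuse the batch
        if len(writes) != expected_count:
            return False        # incomplete batch: data was lost in transit
        staged = dict(self.blocks)            # work on a copy, not the live state
        for block_id, data in writes:
            staged[block_id] = data
        self.blocks = staged                  # single switch-over = commit
        self.last_applied = seq
        return True


vol = CloudVolume()
print(vol.apply_batch(1, 2, [(0, b"x"), (1, b"y")]))  # complete and in order
print(vol.apply_batch(3, 1, [(0, b"z")]))             # skipped a sequence number
```

Because a rejected batch never touches the committed state, the sender can simply retransmit it; the cloud copy is always at the boundary of some complete batch.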
The level of consistency this provides is sometimes called “crash consistent”. A catastrophic failure may still lose a small amount of data (whatever was in transit or still resident in cache when the host was lost), but the Cloud resident data volume will be fully consistent.
So to summarize, two critical attributes of Cloud SAN software are a variable, intelligent local caching policy to boost performance and the ability to ensure consistency at all points in the data path: for local writes, for updates to the cloud, and during recovery of failed updates.
Tags: Cloud Storage, CloudArray, consistency, performance


