Thin provisioning is now considered mainstream in storage. Every vendor offers thin provisioning as part of the core functionality on its arrays. In fact, most vendors have taken thin provisioning to the next level by offering intelligence in their arrays. This intelligence can detect when files are deleted and when free space can be reclaimed.
By coupling thin provisioning with automatic storage tiering, online data deduplication and compression technologies, vendors are taking storage systems to unprecedented levels of optimization for data at rest. In fact, the aggressive adoption of server virtualization in data centers has in turn driven the adoption of all four of these technologies in tandem -- as companies in almost every industry seek to lower fully burdened, per-gigabyte costs as much as they can.
Companies in the health care industry are faced with a peculiar dichotomy. On one hand, they are faced with explosive data growth driven by socio-political developments such as the federal HITECH Act and state data retention regulations. On the other hand, they are faced with ever increasing pressure to cut costs everywhere -- and their IT organizations are not spared from such measures.
A tiered storage architecture is one way for health care providers to reduce their per-gigabyte storage costs. One can also assume that, as health care IT seeks to find ways to cut costs everywhere, it will be forced to aggressively adopt primary storage optimization technologies such as thin provisioning.
Thin provisioning addresses storage underutilization
On the face of it, thin provisioning seems to be a boon to IT organizations inundated with a constant barrage of storage provisioning requests, which can range from needing more storage for new systems to expanding storage for existing systems. Provision the storage and be done with it, as storage administrators would like to think.
The ability to over-allocate or over-provision storage is alluring. Storage administrators will take anything to make storage provisioning simple, seamless and transparent, especially in virtualized environments. Thin provisioning allows storage to be allocated in large chunks without the waste that would have resulted had this provisioning been done in a standard "thick" manner.
Thin provisioning also masks another challenge that often plagues IT departments everywhere -- the systemic underutilization of allocated storage. Sloppy or suboptimal practices on the server side mean that file systems never reach healthy utilization levels (healthy as defined by most industry best practices).
Some experts say that, on average, the utilized-to-allocated ratio never exceeds 20%, after factoring in array and systems overhead. Others suggest it may be as high as 50%. Even so, that's still a huge waste, considering that storage is not cheap and that it is difficult to recover storage that is not being utilized on the server. Therefore, in places where storage chargeback is not implemented or the pain of having to pay for additional storage is not felt, asking for more storage is often an easier way out of this massive underutilization challenge.
Thin provisioning also shifts storage management burden
Enter thin provisioning. Storage administrators can now "reclaim" the unutilized storage on the servers. Even though the file system may only utilize 20% of the storage assigned to it, thin provisioning fools the system into thinking it really has 100% of its allocated capacity. This means that the remaining 80% is part of a free pool that is shared by many other servers.
This magically makes globs of free space available to storage administrators, who can then use it to over-allocate storage to multiple servers, which may or may not all require that space at the same time. Similar to how banks handle deposits, this system is based on taking a calculated risk -- one that pays off only if the subscription ratio is maintained and monitored and, more importantly, additional capacity is added in a timely manner to prevent a system overload.
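To make the subscription ratio concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `ThinVolume` class, the `pool_summary` function, the volume names and all of the capacity figures are hypothetical and do not come from any vendor's tooling. Each sample volume is written to 20% of its provisioned size, echoing the utilization figure cited earlier.

```python
from dataclasses import dataclass


@dataclass
class ThinVolume:
    """Hypothetical record for one thinly provisioned volume."""
    name: str
    provisioned_gb: float  # capacity the server believes it has
    written_gb: float      # physical capacity actually consumed


def pool_summary(physical_gb, volumes):
    """Summarize how heavily a thin pool is subscribed and used."""
    provisioned = sum(v.provisioned_gb for v in volumes)
    written = sum(v.written_gb for v in volumes)
    return {
        # Above 1.0 means more capacity was promised than physically exists.
        "subscription_ratio": provisioned / physical_gb,
        "physical_used_pct": 100.0 * written / physical_gb,
        "free_gb": physical_gb - written,
    }


# Illustrative figures only: three volumes sharing a 5,000 GB pool.
volumes = [
    ThinVolume("ehr-db", 4000, 800),
    ThinVolume("pacs-archive", 6000, 1200),
    ThinVolume("web-tier", 2000, 400),
]
summary = pool_summary(5000, volumes)
print(summary)
```

In this made-up pool, 12,000 GB has been promised against 5,000 GB of physical disk, a subscription ratio of 2.4, while less than half of the physical capacity is actually in use. That gap is the "deposit" the administrator is betting will not be withdrawn all at once.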
Most storage departments are used to a "provision it and forget it" method of storage management. Health care is no different. Thin provisioning, though, forces storage administrators to take a more proactive management approach to make sure they don't run out of storage capacity.
Creating an enormous storage entity, for good or ill
Storage administrators will also recognize that the scale of impact when such an overload occurs is much wider and can be felt across the organization. Many customers choose to create thin pools that span several hundred terabytes and service servers across multiple mission-critical environments.
Moreover, unlike system administrators who implicitly know that additional storage is always a request away, storage administrators may have to go through lengthy requisition cycles that can take months to procure additional storage capacity.
With thin provisioning, storage administrators have to constantly monitor the capacity of the free pool and proactively add capacity before it reaches critical levels.
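One way to reason about "proactively" is to compare how long the pool has left with how long a purchase takes. The Python sketch below does that under simple assumptions: growth is treated as linear, the 80% critical level is an arbitrary example value, and the function names and figures are hypothetical rather than taken from any product.

```python
def days_until_threshold(physical_gb, used_gb, growth_gb_per_day,
                         critical_pct=80.0):
    """Estimate days until physical usage reaches the critical level.

    Assumes linear growth, which is a simplification.
    """
    critical_gb = physical_gb * critical_pct / 100.0
    if used_gb >= critical_gb:
        return 0.0           # already at or past the critical level
    if growth_gb_per_day <= 0:
        return float("inf")  # pool is not growing
    return (critical_gb - used_gb) / growth_gb_per_day


def must_order_now(days_left, procurement_lead_days):
    """True when the remaining runway no longer covers the requisition cycle."""
    return days_left <= procurement_lead_days


# Illustrative figures only: 5,000 GB pool, 2,400 GB used, growing 20 GB a day.
days_left = days_until_threshold(5000, 2400, 20)
order = must_order_now(days_left, procurement_lead_days=90)
print(days_left, order)
```

With these sample numbers the pool has about 80 days before it reaches the critical level, but the assumed requisition cycle is 90 days, so the order is already late even though the pool is less than half full.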
In addition, thin provisioning can overload storage frames from a performance perspective. In the traditional world of storage, physical storage resources such as disks and RAID groups (and the underlying input/output operations per second, or IOPS, they can supply) are often containerized, in the sense that these resources are shared by fewer consumers -- servers, volumes, file systems and so on. This means an imbalance on such storage resources is confined to those consumers and can be isolated relatively quickly by moving the offending consumers to other, less utilized resources.
In the thin provisioning world, multiple physical storage resources are pooled to create one enormous entity, both from a physical capacity and performance perspective. All IOPS are treated in a shared manner -- that is, all "consumers" share a giant pool of IOPS and capacity.
Similar to the capacity argument, consumers that share this pool are expected to consume the pooled performance in a balanced manner. Even one rogue consumer -- a server, virtual machine or application using up a file system -- can cause a performance tidal wave that is felt by all of the other consumers sharing the pool.
Isolating the offending consumer is now even more challenging, unless storage administrators are equipped with tools that measure performance at the granular level and keep standby or reserve capacity where such "consumer" resources can be isolated. Most storage departments will shy away from keeping this capacity around, as it drives down storage utilization and increases storage costs.
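As a rough illustration of what measuring at the granular level can look like, the Python sketch below flags any consumer that takes more than a chosen share of the pool's total IOPS. The function name, the 50% limit, the consumer names and the IOPS figures are all hypothetical, and a real array would supply these measurements through its own reporting tools.

```python
def find_heavy_consumers(iops_by_consumer, share_limit=0.5):
    """Return consumers whose share of total pool IOPS exceeds share_limit."""
    total = sum(iops_by_consumer.values())
    if total == 0:
        return []  # idle pool, nothing to flag
    return sorted(
        name
        for name, iops in iops_by_consumer.items()
        if iops / total > share_limit
    )


# Illustrative sample: one virtual machine dominates the shared pool.
sample = {"ehr-db": 3000, "web-tier": 1500, "batch-report-vm": 13500}
heavy = find_heavy_consumers(sample)
print(heavy)
```

Here the hypothetical batch-report-vm accounts for three quarters of all IOPS and is the only consumer flagged. A share-based test like this is deliberately crude; it identifies a candidate for isolation but says nothing about where to move it.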
Letting everyone into the thin storage pool
What these challenges imply is that, while thin provisioning is a promising proposition for maximizing storage utilization, it cannot be implemented in an ad hoc manner. Deploying thin provisioning requires perhaps a more in-depth look at the capacity and performance requirements of the various consumers of storage, whether they are physical or virtual. Storage administrators cannot provision storage in a thin pool without examining its impact on the overall health of the pool. It forces them to examine the big picture, and take steps proactively to maintain the health of the overall environment.
Fortunately, vendors have vastly improved their storage monitoring, alerting and reporting capabilities, arming storage administrators with valuable information that can be used both to pre-emptively identify candidates for thin provisioning and to manage them once they have been deployed on thin storage, taking necessary precautions in a timely manner. Built-in alerts, for example, can tell a storage administrator when a certain storage capacity threshold has been reached. The admin will not arrive at the office one morning to suddenly find that there's no water left in the pool.
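The logic behind such a capacity alert is simple enough to sketch in a few lines of Python. This is not any vendor's implementation: the `alert_level` function and the 70% and 85% thresholds are assumptions chosen only to show the idea of escalating warnings as a pool fills.

```python
def alert_level(used_gb, physical_gb, warning_pct=70.0, critical_pct=85.0):
    """Classify pool usage against two hypothetical capacity thresholds."""
    used_pct = 100.0 * used_gb / physical_gb
    if used_pct >= critical_pct:
        return "critical"
    if used_pct >= warning_pct:
        return "warning"
    return "ok"


# Illustrative readings for a 5,000 GB pool at three points in time.
levels = [alert_level(used, 5000) for used in (2400, 3600, 4500)]
print(levels)
```

The three sample readings (48%, 72% and 90% full) map to "ok", "warning" and "critical" in turn. Where to set the thresholds depends on how fast the pool grows and how long it takes to add capacity.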
Most applications in the health care industry play nicely with data storage optimization technologies such as thin provisioning. Databases in particular -- specifically those that drive the back end for clinical applications, electronic health record (EHR) management systems and patient management systems -- are worthy of being deployed in a thin-aware manner.
Of course, the big daddies of all thin-aware applications are none other than server virtualization platforms. By deploying the Web and middleware tiers on virtual servers, which are in turn hosted on thinly provisioned (and deduplicated) storage, health care organizations can make solid gains in all layers of a multi-tier application stack.
Thin provisioning is all about identifying the problem of under-utilization of storage resources and deploying a solution to overcome this systemic issue. It is one of the cogs of primary storage optimization, along with automated storage tiering, deduplication and compression. Keeping these cogs well-oiled is one of the fundamental requirements for their deployment.
This point should hit close to home in the health care industry, which is constantly advocating that patients look at their health in the bigger picture by focusing more on prevention than post-incident treatment. After all, prevention is often better than the cure.
Ashish Nadkarni is senior analyst and consultant at Taneja Group in Hopkinton, Mass. He has over 16 years of experience in IT infrastructure management, operations and consulting.