For organizations considering server virtualization, underlying storage infrastructure is a highly important component. While it's true that your chosen storage mechanism must have adequate space and be secure, there are also a number of other considerations specific to storage for server virtualization. This article will explore some of the more important factors.
Although not an absolute requirement, organizations looking to virtualize their servers should consider investing in shared storage, which is a storage pool accessible from multiple host servers. Shared storage is important because, at the present time, it is required for Windows Failover Clustering and for live migrations.
Live migration technology allows a virtual machine to be moved from one host server to another without taking the virtual machine offline. This benefits organizations such as hospitals that need to keep their servers online at all times, as it allows individual hosts to be taken offline for maintenance without having to take down the virtual machines in the process. The virtual machine can simply be moved to a different host for the duration of the maintenance.
It is worth noting that the version of Hyper-V that will ship with Windows Server 8 will allow live migrations without the requirement for shared storage. For the time being, however, both Hyper-V and VMware depend upon shared storage for live migrations (VMware calls its live migration feature vMotion).
Look at requirements when considering shared storage for server virtualization
As previously mentioned, shared storage is disk storage that is simultaneously accessible to multiple hosts. In spite of this seemingly simple definition, there are typically a number of requirements around the shared storage architecture. Of course, the actual requirements vary depending upon which virtualization product you are working with.
The very nature of shared storage rules out using direct attached storage for server virtualization. Typically, a virtualization host that makes use of shared storage will have one internal hard disk that it uses to boot the hypervisor, but everything else is stored on shared storage.
As a general rule, network attached storage (NAS) is not suitable for use as shared storage either, because of the way connections are made to NAS devices.
Normally, servers connecting to NAS devices link to a file share using Server Message Block (SMB). However, hypervisors generally require shared storage to appear as a local resource even if the storage is actually located on a remote server. Hence the only way to get away with using a NAS device is to find one that supports iSCSI connectivity. iSCSI presents the remote storage to the host as a block device, which provides the illusion of dedicated, locally attached storage.
Although iSCSI connectivity is perfectly adequate in small and medium-sized organizations, larger organizations typically require a storage infrastructure that offers better performance. This often means setting up a storage area network (SAN) and connecting the virtualization host servers to the SAN via Fibre Channel.
Maximizing performance of storage for server virtualization use
The type of network interface (iSCSI or Fibre Channel) used to connect to a storage device is not the only factor determining storage performance. The number of read and write operations that the storage pool can perform each second, usually referred to as IOPS, is a major determining factor of storage performance.
The best way to achieve a high number of IOPS is to use a RAID array consisting of as many disks as possible. For example, suppose that you needed to create a shared storage pool of 1 TB. In such a situation, you could achieve better performance by striping the data across 10 different 100 GB drives than you could from using a single 1 TB drive. (Of course, this is a generalization. The actual level of performance that you will get from your storage array depends heavily on how the array is configured.)
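The arithmetic behind that example can be sketched in a few lines of Python. This is an idealized upper bound only: it assumes every drive sustains the same number of IOPS and that the workload spreads evenly across the stripe. The figure of 150 IOPS per drive is an illustrative assumption, not a measurement, and the function name is hypothetical.

```python
def aggregate_iops(drive_count: int, iops_per_drive: int) -> int:
    """Idealized upper bound: striped drives service requests in parallel."""
    return drive_count * iops_per_drive


# Assumed per-drive figure, for illustration only.
IOPS_PER_DRIVE = 150

single_drive = aggregate_iops(1, IOPS_PER_DRIVE)    # one 1 TB drive
striped_pool = aggregate_iops(10, IOPS_PER_DRIVE)   # ten 100 GB drives

print(f"Single 1 TB drive: {single_drive} IOPS")
print(f"Ten 100 GB drives: {striped_pool} IOPS")
```

Both layouts offer the same 1 TB of capacity, but in this simplified model the striped pool has ten times as many spindles available to service requests. Real arrays fall short of this bound because of controller overhead and uneven access patterns.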
In a virtual data center, storage performance has to be balanced with fault tolerance. Remember that your storage pool will be supporting numerous virtual machines. If the storage pool fails then all of your virtual machines will also fail.
The best type of RAID array to use in a virtualized environment is usually RAID 10, which stripes data across mirrored pairs of disks. This type of array offers the performance of a stripe set with the redundancy of mirroring.
RAID 5 can provide fault tolerance at a lower overall cost, but RAID 5 arrays do not perform as well as RAID 10 arrays because parity data, which is distributed across the disks in the array, must be recalculated and rewritten on every write.
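To make the cost and performance trade-off between RAID 10 and RAID 5 concrete, here is a minimal sketch. It relies on the commonly cited rule-of-thumb write penalties (2 back-end I/Os per write for RAID 10, 4 for RAID 5), and the disk count, disk size, per-disk IOPS and 50 percent write mix are all illustrative assumptions. The function names are hypothetical, and real controllers with caching will behave differently.

```python
# Rule-of-thumb back-end I/Os generated by one front-end write.
WRITE_PENALTY = {"raid10": 2, "raid5": 4}


def usable_capacity_gb(level: str, disks: int, disk_gb: int) -> int:
    """Usable space after mirroring (RAID 10) or parity (RAID 5) overhead."""
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disks // 2) * disk_gb
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    raise ValueError(f"unsupported RAID level: {level}")


def effective_iops(level: str, disks: int, iops_per_disk: int,
                   write_fraction: float) -> float:
    """Front-end IOPS once each write is charged its back-end penalty."""
    raw = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    return raw / (read_fraction + write_fraction * WRITE_PENALTY[level])


for level in ("raid10", "raid5"):
    capacity = usable_capacity_gb(level, disks=8, disk_gb=500)
    iops = effective_iops(level, disks=8, iops_per_disk=150, write_fraction=0.5)
    print(f"{level}: {capacity} GB usable, about {iops:.0f} IOPS")
```

Under these assumptions the RAID 5 array yields more usable space from the same eight disks, while the RAID 10 array delivers noticeably more IOPS for a write-heavy workload, which is the trade-off described above.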
Deploying thin provisioning storage for server virtualization environments
Thin provisioning is one more concept to be familiar with when planning storage for server virtualization. When you allocate storage space to a virtual machine, that storage space is said to be either thinly provisioned or thickly provisioned.
Thick provisioning can be thought of as reserving physical disk space for the virtual machine. For instance, if you use thick provisioning to create a 500 GB virtual hard disk, then the hypervisor will immediately claim 500 GB of physical disk space for use with the virtual machine.
When you use thin provisioning on a virtual machine, on the other hand, you are essentially specifying the maximum amount of disk space that the virtual hard drive should use.
For example, suppose that you thinly provision a virtual machine with 500 GB of disk space. The virtual machine will initially claim far less than 500 GB of disk space (usually less than 1 GB). The virtual machine will provide the illusion that it has 500 GB of disk space, but the actual amount of physical disk space that the virtual machine uses will start out very small and gradually expand as you add data to the virtual hard disk.
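The difference between the two allocation styles can be modeled with a small Python class. This is a simplified illustration rather than how any particular hypervisor implements virtual disks: the `VirtualDisk` class and its fields are hypothetical, and the model ignores the small amount of metadata that a real thin disk consumes at creation.

```python
from dataclasses import dataclass


@dataclass
class VirtualDisk:
    max_gb: int          # size the guest operating system sees
    thin: bool           # True for thin provisioning, False for thick
    written_gb: int = 0  # data actually written by the guest

    def write(self, gb: int) -> None:
        """Record new data, refusing to exceed the provisioned size."""
        if self.written_gb + gb > self.max_gb:
            raise ValueError("virtual disk is full")
        self.written_gb += gb

    @property
    def physical_gb(self) -> int:
        """Physical space claimed: grows with data if thin, fixed if thick."""
        return self.written_gb if self.thin else self.max_gb


thick_disk = VirtualDisk(max_gb=500, thin=False)
thin_disk = VirtualDisk(max_gb=500, thin=True)

thick_disk.write(40)
thin_disk.write(40)

print(f"Thick disk claims {thick_disk.physical_gb} GB")
print(f"Thin disk claims {thin_disk.physical_gb} GB")
```

Both disks report 500 GB to the guest, but after 40 GB of writes the thick disk still holds its full 500 GB reservation while the thin disk has claimed only what was written.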
Thin provisioning and thick provisioning each have their advantages. The main advantage of thin provisioning storage for server virtualization is that it allows you to overcommit your physical disk space. You can provision your virtual machines with as much space as you want, so long as the aggregate amount of space that is actually in use does not exceed your physical disk capacity. Of course, the downside is that, if you overprovision your physical disk resources, you could run the storage pool out of space unless you carefully monitor your disk usage.
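The monitoring this calls for amounts to tracking two ratios: how far the pool is overcommitted, and how much of the physical capacity is actually in use. The following is a minimal sketch of that check. The `pool_report` function, the 80 percent warning threshold and the sample figures are all assumptions chosen for illustration, and in practice the numbers would come from your hypervisor or storage management tools.

```python
def pool_report(physical_gb: int, provisioned_gb: list[int],
                used_gb: list[int], warn_at: float = 0.8) -> dict:
    """Summarize overcommitment and real utilization of a storage pool."""
    total_provisioned = sum(provisioned_gb)
    total_used = sum(used_gb)
    utilization = total_used / physical_gb
    return {
        # Above 1.0 means more space has been promised than physically exists.
        "overcommit_ratio": total_provisioned / physical_gb,
        # Share of physical capacity actually consumed by written data.
        "utilization": utilization,
        # Flag the pool before it runs out of physical space.
        "warning": utilization >= warn_at,
    }


# Three thinly provisioned 500 GB disks on a 1,000 GB pool (assumed figures).
report = pool_report(
    physical_gb=1000,
    provisioned_gb=[500, 500, 500],
    used_gb=[200, 250, 300],
)
print(report)
```

In this sample the pool is overcommitted by half again its physical size yet remains below the warning threshold. If the virtual machines keep writing data, the warning trips well before the pool is exhausted, leaving time to add capacity.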
Another advantage to thin provisioning is that thinly provisioned virtual hard disks can be created very quickly. In contrast, a thick provisioned virtual hard disk can take a while to create, as the hypervisor must claim all of the physical space that has been allocated to the virtual machine.
However, thick provisioning also has the distinct advantage of offering better performance than thinly provisioned storage. Using thick provisioning helps avoid physical disk fragmentation. This improves performance -- as does the fact that the virtual machine does not have to deal with the overhead associated with dynamically expanding the virtual hard disk file.
As you can see, there are a number of considerations that must be taken into account with regard to virtual machine storage. It is critically important to design storage infrastructure to provide high performance while also maintaining fault tolerance.
Brien M. Posey, MCSE, is a Microsoft Most Valuable Professional for his work with Windows 2000 Server and IIS. He has served as CIO for a nationwide chain of hospitals and was once in charge of IT security for Fort Knox. Write to him at firstname.lastname@example.org or contact @SearchHealthIT on Twitter.