About oversubscription
The airline industry has a concept called "load management": besides figuring out how to charge us last-minute travelers quadruple the rate of leisure travelers, airlines have also gotten pretty good at oversubscribing seats.
Certain features of server (and storage) virtualization allow us not only to oversubscribe our resources, but to do it without offering a $300 travel voucher when we're oversold. What we can do is analogous to having two or more people sit in the same seat at the same time (comfortably), or forcing one person to give up a part of his or her body that isn't being used, to make room for someone else.
Oversubscription, basically, is when the sum of all allocated resources is greater than what is actually available. In the case of memory, for example, you may have 20 VMs, each with 1 GB of allocated memory (for a total of 20 GB), that together consume only 10 GB of physical memory.
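To make the definition concrete, here is a minimal Python sketch of the arithmetic in the example above. The function name `overcommit_ratio` is purely illustrative and does not come from any VMware tool.

```python
def overcommit_ratio(allocated_gb, physical_gb):
    """Sum of allocated memory divided by the physical memory actually used."""
    return sum(allocated_gb) / physical_gb


# The example from the text: 20 VMs with 1 GB each, backed by 10 GB of physical memory.
vm_allocations_gb = [1] * 20
physical_gb = 10

ratio = overcommit_ratio(vm_allocations_gb, physical_gb)
print(f"allocated={sum(vm_allocations_gb)} GB, physical={physical_gb} GB, ratio={ratio:.1f}")
```

Any ratio above 1.0 means the resource is oversubscribed.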
Oversubscribing Memory
Memory oversubscription (or overcommit) in a hypervisor can come from four main sources:
- Powered-off VMs - many of our VMs may be "transient" and not always powered on. VMware Lab Manager is one example that makes heavy use of transient VMs.
- Transparent Page Sharing - this is unique to VMware, and is a low-overhead way of oversubscribing memory. Common pages (or zero-pages) in VMs are stored in physical memory only once.
- Balloon Driver - another VMware technology, built into VMware Tools. The balloon driver "tricks" a VM into giving up memory that it doesn't actually need.
- Swap - data is taken out of physical memory and sent to disk storage. Swap isn't necessarily a bad thing: for example, certain parts of a VM's memory are used once at boot time and never accessed again.
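The page sharing idea in the list above can be sketched in a few lines of Python. This is a toy model under my own assumptions, not VMware's implementation: pages are plain byte strings, identical pages are detected by hashing their content, and the names `PAGE_SIZE` and `count_pages` are made up for the illustration. A real hypervisor would also have to handle writes to a shared page (copy-on-write), which this sketch leaves out.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for the illustration


def count_pages(vms):
    """Return (logical_pages, physical_pages) when identical pages are stored once.

    `vms` is a list of VMs, each a list of page contents (bytes).
    """
    store = {}   # content hash -> single stored copy of the page
    logical = 0
    for pages in vms:
        for page in pages:
            logical += 1
            digest = hashlib.sha256(page).digest()
            store.setdefault(digest, page)  # keep only the first copy
    return logical, len(store)


zero_page = bytes(PAGE_SIZE)          # an all-zero page
os_page = b"\x01" * PAGE_SIZE         # stands in for a common guest OS page
vm_a = [zero_page, zero_page, os_page, b"\x02" * PAGE_SIZE]
vm_b = [zero_page, os_page, os_page, b"\x03" * PAGE_SIZE]

logical, physical = count_pages([vm_a, vm_b])
print(f"{logical} guest pages backed by {physical} physical pages")
```

In this toy case, 8 guest pages need only 4 physical pages, because the zero page and the common OS page are each stored once.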
So, let's get to the financials:
The above spreadsheet shows the cost-per-VM difference between VMware ESX Server 3.x (with the console OS), 3i (without the console OS), and a hypervisor without memory overcommit capabilities (e.g. Xen-based or Microsoft Hyper-V). It turns out that the difference in software license costs is more than offset by the lower memory requirement per VM, and the cost per VM on an oversubscribed host is 60% of that on a non-oversubscribed host.
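For readers who want to plug in their own numbers, the shape of the calculation can be sketched as below. Every figure here is a hypothetical placeholder, not a value from the spreadsheet, and the model assumes memory is the only resource that limits how many VMs fit on a host.

```python
def cost_per_vm(hardware_cost, license_cost, host_memory_gb,
                memory_per_vm_gb, overcommit_ratio=1.0):
    """Host cost divided by the number of VMs that fit, assuming memory is the limit."""
    vms_per_host = int(host_memory_gb * overcommit_ratio // memory_per_vm_gb)
    return (hardware_cost + license_cost) / vms_per_host


# Hypothetical inputs, chosen only to show the mechanics.
with_overcommit = cost_per_vm(hardware_cost=10_000, license_cost=5_000,
                              host_memory_gb=32, memory_per_vm_gb=1,
                              overcommit_ratio=2.0)
without_overcommit = cost_per_vm(hardware_cost=10_000, license_cost=0,
                                 host_memory_gb=32, memory_per_vm_gb=1)

print(f"with overcommit:    ${with_overcommit:,.2f} per VM")
print(f"without overcommit: ${without_overcommit:,.2f} per VM")
```

With these made-up inputs the host with the more expensive license still comes out cheaper per VM, because it fits twice as many VMs in the same memory.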
To validate this against your own VI3 environment, go to the "Hosts and Clusters" view in VirtualCenter, select a host, and go to the Performance tab. Click "Change Chart Options...", and pick "Memory...Real Time...". The "Memory Granted" (the sum of all memory that the VMs "think" they have) divided by the "Memory Consumed" (the actual physical memory used by the host) gives you a rough idea of the memory overcommit rate.
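If you copy a few readings off that chart, the division can be averaged over several samples. The sketch below assumes you have noted the values by hand; the sample numbers are invented, and the only requirement is that both counters use the same unit.

```python
from statistics import mean


def overcommit_rate(granted, consumed):
    """Memory Granted divided by Memory Consumed for one reading."""
    if consumed <= 0:
        raise ValueError("consumed memory must be positive")
    return granted / consumed


# Hypothetical (granted, consumed) readings in KB, not taken from a real host.
samples = [
    (20_971_520, 10_485_760),
    (20_971_520, 8_388_608),
]

rates = [overcommit_rate(g, c) for g, c in samples]
print(f"average overcommit rate: {mean(rates):.2f}")
```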
Two white papers from VMware and Kingston, here, and here, give some more detail on memory overcommit.
Oversubscribing Storage
Storage is another resource that can be oversubscribed. There are three main technologies that can accomplish storage oversubscription:
- Linked clones
  - This feature is available in VMware Lab Manager and VMware Workstation at the virtual disk level. When a linked clone is used, the new VM uses pointers to the original VM for all common data.
  - The additional advantage of linked clones is that whitespace is not stored. For example, if an empty data disk is part of a clone operation, the new disk will act as a "thin" disk and only consume the storage that it really requires for data.
  - Linked clones can also be accomplished at the datastore level using technologies such as NetApp FlexClone (useful when cloning many VMs at once).
  - Keep in mind: linked clones pay a performance penalty on write operations (using copy-on-write), and put added stress on the source disks on read operations.
- Thin Disks
  - Thin-provisioned disks are virtual disks that "appear" to the VM as one size, but only consume up to the amount of data that is required by that disk. So, a 10 GB drive that is 50% utilized will only store 5 GB on disk (a traditional "thick" virtual disk would consume the entire 10 GB on disk).
  - Thin disks are an option in VMware Workstation, and are the default disk type when using NFS storage in VMware ESX Server. However, VMs cloned from templates are always thick.
  - Storage vendors such as Hitachi and NetApp have LUN-level thin provisioning, but that would only apply to VMware if using RDMs.
- Deduplication
  - Deduplication is a technology similar to memory page sharing (above), where common data is stored only once. It is done "after the fact" (ex post), meaning a background process scans for deduplication opportunities.
  - Deduplication is primarily used for backups (e.g. Symantec PureDisk, EMC Avamar, or Quantum DXi-Series), but can also be used on the filesystem itself (today, using NetApp Deduplication, formerly A-SIS).
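The linked-clone and thin-disk behavior described in the list above can be illustrated with a small Python model. This is a conceptual sketch only, not how VMware or NetApp store data: the classes `BaseDisk` and `LinkedClone` and the block size are assumptions made for the example. A clone keeps only the blocks it has written (copy-on-write) and reads everything else from its parent, and blocks that were never written are not stored at all.

```python
BLOCK_SIZE = 4096
ZERO_BLOCK = bytes(BLOCK_SIZE)


class BaseDisk:
    """A thin disk: only blocks that were written are stored."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> data

    def read(self, n):
        return self.blocks.get(n, ZERO_BLOCK)  # unwritten blocks read as zeros

    def stored_blocks(self):
        return len(self.blocks)


class LinkedClone:
    """A clone that points at a parent disk and stores only its own changes."""

    def __init__(self, parent):
        self.parent = parent
        self.delta = {}  # blocks written since the clone was created

    def read(self, n):
        if n in self.delta:
            return self.delta[n]
        return self.parent.read(n)  # common data comes from the parent

    def write(self, n, data):
        self.delta[n] = data  # copy-on-write: the parent is never modified

    def stored_blocks(self):
        return len(self.delta)


base = BaseDisk({0: b"A" * BLOCK_SIZE, 1: b"B" * BLOCK_SIZE, 2: b"C" * BLOCK_SIZE})
clone_a = LinkedClone(base)
clone_b = LinkedClone(base)

clone_a.write(1, b"X" * BLOCK_SIZE)  # changes an existing block
clone_a.write(7, b"Y" * BLOCK_SIZE)  # writes a block the base never had

total = base.stored_blocks() + clone_a.stored_blocks() + clone_b.stored_blocks()
print(f"blocks stored for three disks: {total}")
```

Here three virtual disks are backed by five stored blocks in total, and `clone_b` still sees the original contents of block 1 because the write went into `clone_a`'s own delta.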
The following table summarizes some of the cost savings available with storage oversubscription. I have ignored tape backup savings due to deduplication, and have focused only on online disk storage (NFS, iSCSI, or Fibre Channel).
Summary
We've looked at two main oversubscription opportunities (memory and storage), and shown how the use of common technologies for sharing and/or thin provisioning of those resources can reduce the unit cost per VM.
Other resources, such as CPU, bandwidth, and people, can be oversubscribed as well. We don't have the same deduplication or thin provisioning options with those resources, but we can still use the airline-like approach of load management (in other words, make intelligent assumptions about how many applications will be busy at the same time).
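One simple way to put a number on "how many applications will be busy at the same time" is a binomial estimate. The sketch below is an illustration under a strong assumption that real workloads often violate: each VM is busy independently of the others, with the same probability. The function name and the input values are hypothetical.

```python
from math import comb


def prob_oversold(n_vms, p_busy, capacity):
    """Probability that more than `capacity` of `n_vms` are busy at once.

    Assumes each VM is busy independently with probability `p_busy`.
    """
    return sum(
        comb(n_vms, k) * p_busy**k * (1 - p_busy) ** (n_vms - k)
        for k in range(capacity + 1, n_vms + 1)
    )


# Hypothetical host: 20 VMs, each busy 20% of the time, room for 8 busy VMs.
risk = prob_oversold(n_vms=20, p_busy=0.2, capacity=8)
print(f"chance of being 'oversold' at any moment: {risk:.4%}")
```

If workloads peak together (month-end batch jobs, morning logins), the independence assumption breaks down and the real risk is higher than this estimate.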


Great analysis.
Good analysis. There's no arguing that oversubscription is extremely attractive from a cost savings standpoint.
Unfortunately there are potential consequences to oversubscription (real and perceived) that can and oftentimes do prevent this model from being implemented. This is especially true in organizations that don't place an emphasis on capacity management and planning.
It is my experience that successful adoption of an oversubscription model requires a mindset change and a commitment to the technologies from both the IT and Business leaders. These hurdles are often more difficult to overcome than any technical challenge.
Actually your post on Deduplication isn’t correct. Deduplication isn’t always a “background” process. In NetApp’s case it is a post-processing service.
Also, around thin provisioning, this isn’t exactly correct either. If thin provisioning is available on a storage device, it will affect volumes directly mounted to guests (as is the case with iSCSI volumes mounted within Windows). Also, just because you enable “Thin Disk” functionality within VMware, that doesn’t mean that the storage device will see it as “thin”. However, if you choose to implement it both within VMware and with the storage device, then there can be significant physical storage savings.
Regards,
Steven Schwartz
thesantechnologist.com
Steven,
I'm not sure what you're saying isn't correct about NetApp deduplication. It does occur after the data has been written, but the process that scans for duplicate blocks runs in the background. NetApp said as much: http://partners.netapp.com/go/techontap/top10_a-sis.html.
As for your comment about thin disks, we may be confusing virtual disks with physical disks. A virtual disk can be implemented as "thin", and it doesn't matter what the underlying storage device is. It's a file, and it sits on a filesystem (e.g. VMFS, or in the case of hosted products like Workstation - NTFS, Ext3fs, etc.)
If you are implementing thin provisioning at the LUN level (e.g. mounting iSCSI LUNs directly into the Windows guests, or using RDMs), then your comment holds true.
Upon reflection, the first part of your comment is definitely true. Deduplication isn't always a background process: for NetApp FlexClone, for example, the data is deduplicated as part of the cloning process instead of in the background. Backup deduplication typically happens before the data is sent to the backup server, so it is not a "background" process either.
Of course, the implementation doesn't matter so much when you're talking about the business benefits: oversubscription of resources can happen in many ways (in the case of disk, it's deduplication and thin provisioning), and oversubscription is an effective way of reducing capital cost.