What’s all the fuss about VDI Storage Sizing?

On our team at EMC, we spend a TON of time working with customers and partners to help them size their VDI environments.  It’s pretty well established that undersizing the storage for a VDI environment, and not factoring in performance (IOPS), is one of the most common reasons VDI projects fail to deliver the expected user experience.  My teammates have blogged EXTENSIVELY on this in the past…check out www.myvirtualcloud.net for some great examples.

Where I’m heading, though, is that for all of the effort we spend working with customers to arrive at a design, we typically ask the customer to provide the required level of performance on which to base our designs and configurations.  Typically, the question goes like this:

  “How many IOPS do your users need at a steady state and peak?” 

 Then we follow up with questions like…

 “How many users are logging in all at once, or are they spread out across the work day?”
 “What different types of users do you have, and what are the performance requirements for each of those user types?”
 “What agents are installed that would cause an increase in IO requirements, like AV or inventory software?”
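
To make it concrete, here is a rough sketch of how the answers to those questions might roll up into a steady-state and peak number.  Everything in it is an assumption for illustration: the `UserType` class, the function names, the per-desktop IOPS figures, the login concurrency, and the agent overhead are all made up, not measured values or an official sizing method.

```python
from dataclasses import dataclass


@dataclass
class UserType:
    """One class of user. All figures are hypothetical placeholders."""
    name: str
    count: int           # number of desktops of this type
    steady_iops: float   # IOPS per desktop during normal work
    peak_iops: float     # IOPS per desktop while logging in


def total_steady_iops(user_types, agent_overhead=0.0):
    """Sum steady-state IOPS, inflated by a fractional agent overhead (AV, inventory)."""
    total = sum(u.count * u.steady_iops for u in user_types)
    return total * (1 + agent_overhead)


def total_peak_iops(user_types, concurrent_login_fraction, agent_overhead=0.0):
    """Estimate peak IOPS when a fraction of every user type logs in at once.

    Desktops that are logging in are counted at peak_iops; the rest stay at
    steady_iops. The same agent overhead is applied to the whole total.
    """
    total = 0.0
    for u in user_types:
        logging_in = u.count * concurrent_login_fraction
        already_working = u.count - logging_in
        total += logging_in * u.peak_iops + already_working * u.steady_iops
    return total * (1 + agent_overhead)


if __name__ == "__main__":
    # Hypothetical environment: two user types, 25% logging in together,
    # and a 10% IO uplift from installed agents.
    users = [
        UserType("task worker", count=600, steady_iops=5, peak_iops=20),
        UserType("knowledge worker", count=400, steady_iops=10, peak_iops=40),
    ]
    print("steady:", round(total_steady_iops(users, agent_overhead=0.10)))
    print("peak:  ", round(total_peak_iops(users, 0.25, agent_overhead=0.10)))
```

The point of the sketch is that every input is something the customer has to tell us, and each one can move.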

These are all great questions, and hugely important…they help shape the initial design.  But then when you ask a question like, “What’s your intended adoption timeline?” and get an answer of something like 3 years…you have to take every piece of information used for the initial design and wrap a huge “fudge factor” around the whole thing.

By the time the customer gets around to implementing and migrating the users over to the new VDI infrastructure, will the performance requirements and attributes of those users be the same?  Application and OS updates, along with changes in how users work from day to day, mean the workload is constantly changing.  How can we expect an infrastructure to anticipate these changes (unless we completely OVERSIZE the solution)?
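
A quick back-of-the-napkin sketch shows why a number fixed on day one goes stale.  It assumes a linear adoption ramp and a compounding yearly increase in per-user IOPS; the function name, the ramp shape, and the growth rate are all hypothetical, and real workloads will not follow a tidy curve like this.

```python
def required_iops_by_year(target_users, years, iops_per_user, annual_growth):
    """Project total IOPS at the end of each year of the rollout.

    Assumptions (illustrative only):
      - users are migrated linearly, reaching target_users in the final year
      - per-user IOPS compounds by annual_growth each year as apps and OS change
    """
    projection = []
    for year in range(1, years + 1):
        adopted_users = target_users * year / years
        per_user = iops_per_user * (1 + annual_growth) ** year
        projection.append(adopted_users * per_user)
    return projection


if __name__ == "__main__":
    # Hypothetical: 900 users over 3 years, 10 IOPS per user today.
    flat = required_iops_by_year(900, 3, 10, annual_growth=0.0)
    growing = required_iops_by_year(900, 3, 10, annual_growth=0.15)
    for year, (a, b) in enumerate(zip(flat, growing), start=1):
        print(f"year {year}: day-one assumption {a:.0f} IOPS, with growth {b:.0f} IOPS")
```

The gap between the two columns in the final year is the “fudge factor” you would have had to guess up front.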

I feel we need to have a better dialog about a constant process of assessment and remediation.  This is the ITIL concept of continuous improvement.  The cycle of assess and remediate STARTS in the POC phase, continues into the Pilot phase, and never stops in the Production phase.  Customers who are working to create a design that is locked down for 3 years are potentially shooting themselves in the foot…even if they actually attain the adoption rates they are looking to achieve (which are often a bit aggressive), by the time they get there the performance requirements will most likely have changed.
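
One way to picture the assess step, whatever phase you are in, is a simple check of measured per-user IOPS against the number the design assumed.  This is only a sketch: the `assess` function, the 10% tolerance, and the action labels are hypothetical, and where the measured figure comes from (your monitoring tool of choice) is left out entirely.

```python
def assess(design_iops_per_user, measured_iops_per_user, tolerance=0.10):
    """Compare measured per-user IOPS with the design assumption.

    Returns (drift, action), where drift is the fractional difference from the
    design value and action is one of:
      "resize"  - workload has grown beyond tolerance, revisit the design
      "reclaim" - workload has dropped beyond tolerance, capacity may be freed
      "ok"      - still within tolerance, check again next cycle
    """
    drift = (measured_iops_per_user - design_iops_per_user) / design_iops_per_user
    if drift > tolerance:
        return drift, "resize"
    if drift < -tolerance:
        return drift, "reclaim"
    return drift, "ok"


if __name__ == "__main__":
    # Hypothetical readings taken during POC, Pilot, and Production.
    design = 10.0
    for phase, measured in [("POC", 10.5), ("Pilot", 12.0), ("Production", 14.0)]:
        drift, action = assess(design, measured)
        print(f"{phase}: drift {drift:+.0%} -> {action}")
```

Run on a schedule, a check like this turns the design into something that gets revisited rather than locked down.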

New optimizations are coming out as well, such as vShield Endpoint offloads, so that performance requirement could go down over time…but more than likely it will continue to increase as the workload evolves.

Let’s make sure our customers understand that VDI design is a journey (do we use that term too much?) and not just a snapshot in time at the beginning of the process.