The Value of NetApp Dedupe in a Microsoft Virtual World
We don’t normally have guest bloggers on our site, but when we saw this piece written by our friend over at Microsoft, Matt McSpirit, it was too good for us not to blatantly plagiarise!
We do a lot of work here with NetApp as a storage vendor and Microsoft as a solution vendor. Over the last 18 months or so, we have seen increasing integration and development of NetApp’s value to the Microsoft virtualisation stack. We have had some good success with our clients and have been involved in some excellent projects with them using these technologies.
However, it’s nice to see this backed by someone with Matt’s experience, and by someone who has no vested interest in NetApp storage above and beyond any other storage vendor. In fact, in his role, he has to be agnostic.
Anyway, enjoy the post… If you want to know more about Matt or Microsoft’s virtualisation solutions, click here to check out his excellent blog.
So here’s Matt’s blog post, word for word…
I’ve been a big fan of NetApp technologies for ages, and I’ve worked closely with people like Steve Winfield and Pete Mason to produce a number of videos showcasing some of the collaborative work that’s gone on between Microsoft and NetApp, resulting in products like SnapManager for Hyper-V, SnapDrive 6.2 and more. We’ve got some fantastic joint wins on the platform now too, at both small and large customers, so it’s all good from that perspective.
I’m currently building out my team’s internal demo infrastructure, which consists of 1 Dell T605 with Hyper-V R2 and a number of System Center technologies virtualised on top, along with a cluster of 2 Dell R710s hooked up to a NetApp FAS3050c. Now this FAS3050c isn’t the latest model, and it doesn’t have the most capacity in the world (my DS14 disk shelf gives me around 570GB of usable space), but then it was kindly donated to me by NetApp, who were replacing some of their older kit with newer kit for our Microsoft Technology Center in Reading, UK. The great thing for me is that I can still run the latest version of ONTAP, it’ll work with the latest and greatest versions of SnapDrive and SnapManager for Hyper-V, and it still gives me all the features I need, like snapshotting, thin provisioning and, best of all, deduplication. I’ll be honest with you right now. I love dedupe. I think it’s fantastically clever and streamlined, and because it works at the block level rather than the file level, it’ll even dedupe stuff that you think, on the surface, has no chance of being deduped. Crazy stuff. Let me explain more.
Firstly, for those of you not sure what deduplication with NetApp is, and how it works, there’s a great explanation over at the Dr DeDupe blog.
As I said, my cluster environment is 2 nodes, and to that cluster I’m presenting 4 LUNs of storage, which in my NetApp environment sit in 4 separate volumes. You don’t have to do it like this, and who knows, maybe I’ll change it in the future, but right now, this is how it is:

As you can see, I’ve got a dedicated LUN for my witness disk (I’m using Node and Disk Majority for my 2-node cluster), and 3 LUNs presented to the cluster, which have been selected to be Cluster Shared Volumes. They aren’t huge: 100GB each for two of them, and a 25GB CSV that will hold the swap files of my key VMs (each host only has 12GB RAM, so having 25GB for swap VHDs is fine!). You’ll see from the image above that I’m currently using around 51% of my CSV2. It holds a 40GB (ish) Fixed VHD with WS2008 R2 inside, along with a Dynamic VHD with Windows 7 x86 inside it, currently expanded to around 8GB. Total consumption of that CSV is 51GB:

So, that means I’ll lose 51GB on my SAN, right? Wrong! We’re actually using a grand total of 17.5GB!
If we go over to NetApp System Manager, and take a look at this particular volume, you can see for yourself:

Just think about this for a minute. Because this is block-level deduplication, we can look inside the contents of the VHD files, see where the blocks match, and deduplicate them. In this case, we’ve saved a grand total of 37.62GB, which amounts to 60%. Obviously Windows still thinks it’s using 51GB, even though, under the covers, the SAN hasn’t lost that space. This is where Thin Provisioning starts to help, as you can make Windows think it has more storage available to it.
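To make the block-level idea a bit more concrete, here is a minimal Python sketch. It is not how NetApp implements deduplication; it simply splits some pretend "VHD" contents into fixed-size blocks, fingerprints each block with SHA-256 (a stand-in chosen for illustration), and counts how many unique blocks would actually need storing. The 4096-byte block size, the `dedupe_stats` function and the sample data are all assumptions made up for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedupe_stats(blobs, block_size=BLOCK_SIZE):
    """Return (logical_bytes, physical_bytes) for a list of byte strings.

    logical_bytes  = what the host thinks it has written
    physical_bytes = what is left once identical blocks are stored once
    """
    seen = set()
    logical = 0
    physical = 0
    for blob in blobs:
        for offset in range(0, len(blob), block_size):
            block = blob[offset:offset + block_size]
            logical += len(block)
            digest = hashlib.sha256(block).digest()
            if digest not in seen:
                # First time this block content is seen: it costs real space.
                seen.add(digest)
                physical += len(block)
    return logical, physical


# Two hypothetical VHD files that share the same "OS" blocks
# and differ only in their final block.
os_blocks = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
vhd_a = os_blocks + b"A" * BLOCK_SIZE
vhd_b = os_blocks + b"B" * BLOCK_SIZE

logical, physical = dedupe_stats([vhd_a, vhd_b])
saved = logical - physical
print(f"logical={logical} physical={physical} saved={saved / logical:.0%}")
```

The two files are different as files, so file-level deduplication would save nothing here, yet most of their blocks are identical and only need to be stored once.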
This use of deduplication hasn’t just been limited to my CSVs. Oh no. I’ve used it on the witness disk, where, even though the whole volume is only 1GB and the consumption was 50MB for the quorum information, deduplication still managed to save me 10MB, which is 20%. What about my other savings? Well, on my SCVMM Library, where I’m storing a couple of VHDs but also some ISO files, I’ve saved a total of 15%, and on my actual backup store, being used by Data Protection Manager 2010 to protect Hyper-V and SQL so far, I’m saving just under 39GB, which equates to 58%. These savings are real, and are enabling me to get even greater levels of consolidation on my SAN than I would have normally. Brilliant stuff, NetApp.
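For anyone wanting to sanity-check figures like these, the arithmetic can be sketched in a few lines of Python. This assumes the percentage is simply the space saved divided by the space consumed before deduplication, which matches the witness disk numbers (10MB saved out of 50MB is 20%). The `savings_percent` helper is a hypothetical name for this example, not part of any NetApp tool.

```python
def savings_percent(logical, saved):
    """Percentage saved, assuming percent = saved / logical * 100.

    logical: space consumed before deduplication
    saved:   space reclaimed by deduplication (same unit as logical)
    """
    if logical <= 0:
        raise ValueError("logical size must be positive")
    return 100.0 * saved / logical


# Witness disk figures from the post: 50MB consumed, 10MB saved.
witness = savings_percent(50, 10)
print(f"witness disk savings: {witness:.0f}%")
```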
Now I just need to get ApplianceWatch PRO working…