Hey guys! I was sitting here today, minding my own business… when the following tweet showed up in one of my search columns! (Why yes I do search on NetApp, and every major vendor in the Industry that I know a real lot about, I like to stay topical! oh and RT job opportunities… I know peoples ;))
So I thought “Well Hey! I’d like to understand the challenges associated with NetApp’s Deduplication! Let’s get down to business!”
I click the little link which takes me to THIS PAGE where I fill out a form to receive my “Complimentary White Paper” ooh, yay! And let me tell you, other than the abusive form (Oh lovely… who makes people fill out FORMS for content.. yea I know, I know..) this thing looked pretty damn sweet! FYI: By sweet, I mean it looks so professional, so nice, like a solid Marketing Group got their hands on this and prettified it! I mean look at it!
Tell me that doesn’t look damn professional! Hell, I’d even at first pass with NO knowledge, take everything contained within that document at face value as the truth, I mean cmon let’s cover the facts here.
- This whitepaper looks SWEET! It’s all logo’d out and everything too!
- It’s only 8 pages; that speaks of SOLID content including not only text, but pictures and CITING evidence! Sweet right?!
- And you said it; right there on the first page it says “BUSINESS WHITE PAPER”. Tell me that does not spell PRO all over it.
So what I’m thinking is, clearly this has been vetted by a set of experts who have validated the data and ensured that it is correct, or at least up to date: the footer of this document claims it was published in January 2011. So this CLEARLY should be current.
Yea… No. Not Quite. Quite the opposite? I guess it may be time to explain though! But before I go there, Disclaimer time!
HP’s Disclaimer at the bottom of the document:
© Copyright 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
My Disclaimer for what you’re about to read:
I do not work for HP and I have nothing against HP. I do not work for NetApp and have nothing against NetApp. Yea I work for EMC – Wait, aren’t you the competition?! WHY ARE YOU RAGGING ON HP FOR THEIR POORLY WRITTEN PAPER?! Because when *I* publish something attacking NetApp’s deduplication, I do the homework and validate it (except for when I quote external third parties… Yea I don’t do that anymore because… you end up with a mess like this document that HP has released ;)) OMG Seriously?! Seriously HP!? You’ve spurred me to write this because you upset my competitive nature. With that said, let’s get down to brass tacks.
Secondary Disclaimer: I had forgotten that I originally read this paper back when the post “HP Launches an Unprovoked Attack on NetApp Deduplication” came out, and you know what? Between seeing it circulate AGAIN and having to fill out a form… yea Sean, following bad data with bad data is #fail either way.
Tertiary Disclaimer: a lot of the ‘concerns’ and ‘considerations’ addressed in the HP paper, which they claim the bee’s-knees StoreOnce can solve, are actually readily solved with Industry Best of Breed Avamar and Data Domain, let alone leveraging VNX Deduplication and Compression, but I won’t go there because that is outside of the boundaries of this particular post :)
The paper has been broken down into the following sections; “Challenge #, blah blah blah, maybe cited evidence, Takeaway” I plan to… give you the gist of the paper without quoting it verbatim (that’s like the paper itself!) but also not removing the context, and sprinkling commentary and sarcasm as needed ;)
Challenge #1: Primary deduplication: Understanding the tradeoffs
This section has a lot of blah blah blah in it, but I’ll quote two areas which have CITED references;
While some may find this surprising given the continuing interest in optimization technologies, primary deduplication can impose some potentially significant performance penalties on the network.1
Primary data is random in nature. Deduplicating data leads to various data blocks being written to multiple places. NetApp’s WAFL file system exasperates the problem by writing to the free space nearest to the disk head. Reading the data involves recompiling these blocks into a format presentable to the application. This data reassembly overhead mandates a performance impact, commonly 20–50 percent.2
I particularly love this section for two reasons. One, it’s VERY solid in its choice of words: “can impose”, not will impose, so it’s like “maybe?!?” This is not a game of “can I have a cookie” vs “may I have a cookie”; this is a white paper, right? Give me some facts to work off of, guys. Oh, I said two reasons, didn’t I? Well, here is Reason #2 – Here’s the citation! [1 End Users Hesitate on Primary Deduplication; TheInfoPro TIP Insight, October 21, 2010] I’ll chalk it up to the possibility that I am clearly an IDIOT, but I was unable to find the “Source” of this data. So… soft language… inability to validate a point, sweet!
But wait, let me discuss the second citation for a second, yea let me do that. I won’t go into WTF they’re saying here, as this is not an extensive and deep analysis of how WAFL and Data ONTAP operate, but I decided “Whoa, excellent backing data! Let me check out that citation, shall I?!” So I go to the source [2 Evaluator Group, August, 2010] and I find… I can pay $1999 to get this data! Excellent! First idea which came to mind: “I should write stupid papers and then sell the data at MASSIVELY high costs.. nah I’ll stick to factual blog posts”. Yea, so I’m 0 for 2 in being able to “Validate” whatever these sources happen to be sharing, and I’m sure you’ll be in the same boat too. Oh but the best part? Let’s take a moment and read the Takeaway, shall we?!
Takeaway – Deduplication is often the wrong technology for data reduction of primary storage.
OMG SERIOUSLY? THAT IS SERIOUSLY YOUR TAKEAWAY?! It’s like a cake made up of layers of soft language, filled with unverifiable sources. And it’s not like this is even very GOOD FUD, it’s just so… Ahh!!!!!! A number of us (non-NetAppians) got so pissed off when we read this, I mean SERIOUSLY?!?
Relax.. Relax, it can’t get any worse than that right?
Challenge #2: Fixed vs. variable chunking
Wow this reads like an advertisement for Avamar. But seriously, this for the most part only discusses the differences between Fixed and Variable chunking, more educational than anything. Not a whole lot for me to discuss other than noting the similarities in their message to the Industry Leading Avamar.
Takeaway – Using variable chunking allows HP StoreOnce D2D solutions to provide a more intelligent and effective approach for deduplication.
Wow Christopher, you’re getting tame.. you let them slide on that one!
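Since Challenge #2 is the one genuinely educational bit, here is a rough sketch of what fixed vs. variable chunking means in practice. To be clear, this is NOT how StoreOnce, Avamar, or anybody's shipping product does it; the function names, the 64-byte chunk size, the 16-byte window, and the CRC32 fingerprint are all assumptions I picked for a toy demo. The idea: fixed chunking cuts every N bytes, so one inserted byte shifts every chunk after it, while variable (content-defined) chunking cuts wherever a fingerprint of the last few bytes hits a pattern, so the cut points move with the data.

```python
import random
import zlib


def fixed_chunks(data: bytes, size: int = 64) -> list[bytes]:
    """Cut the data every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                    max_size: int = 512) -> list[bytes]:
    """Toy content-defined chunking (illustrative parameters only).

    A chunk ends wherever the CRC32 of the trailing `window` bytes has
    its low bits (selected by `mask`) all zero, or when the chunk
    reaches `max_size` bytes.
    """
    chunks = []
    start = 0
    for i in range(len(data)):
        at_boundary = False
        if i + 1 >= window:
            fingerprint = zlib.crc32(data[i + 1 - window:i + 1])
            at_boundary = (fingerprint & mask) == 0
        if at_boundary or (i - start + 1) >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def shared_fraction(old_chunks: list[bytes], new_chunks: list[bytes]) -> float:
    """Fraction of new chunks that already exist among the old chunks."""
    seen = set(old_chunks)
    return sum(1 for chunk in new_chunks if chunk in seen) / len(new_chunks)


if __name__ == "__main__":
    rng = random.Random(42)
    original = bytes(rng.getrandbits(8) for _ in range(20000))
    edited = b"X" + original  # a single byte inserted at the front

    print("fixed   :", shared_fraction(fixed_chunks(original),
                                       fixed_chunks(edited)))
    print("variable:", shared_fraction(variable_chunks(original),
                                       variable_chunks(edited)))
```

Run it and compare the two fractions: the share of chunks a deduplicating store would already have after that one-byte insert is the whole argument for variable chunking in a nutshell.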
Challenge #3: Performance issues and high deduplication ratios
NetApp suffers performance issues with high deduplication ratios; something NetApp engineers said on a post to the NetApp technical forum.3
NetApp is so concerned about the performance of their deduplication technology that Chris Cummings, senior director of data protection solutions for NetApp told CRN customers must acknowledge the “chance of performance degradation when implementing the technology” should they turn on the technology.4
Okay, sweet! Let’s rock this out! Not only do they have CITED sources of this data (You know I love it when I have data to refer to!) but they even provide embedded links so I can click to go directly to the data! (WOOHOO!) And like any good detective… I did visit those links. It was upon visiting those two links that two things came back to me. “Hmm, Chris Cummings quote from 2008. Hmm, Forum conversation from 2009…” … Yea I was still AT NetApp during those two periods, OMG SERIOUSLY HP YOU’RE QUOTING DATA FROM 2 AND 3 YEARS AGO?!?! How can you NOT expect me to put that in caps? Let’s take a little journey down almost ANY product or dev company for a moment… I’d like to visit VMware in this particular scenario.
“VMware is great for Virtualization applications, Oh, but not Mission Critical Applications, it’s not stable for that. Do not virtualize mission critical applications”. Yea, you can almost QUOTE me as having said that. When might I have said that? Maybe when VMware had GSX out (Pre-ESX days) and our computers were run with the power of Potatoes. Yea, if you have NO dev cycle and you do not invest in development [Oh no you didn’t make a slighted attack on the MSA/EVA! … No I didn’t ;)] But if you STOP development all things we’re discussing can absolutely be true! #WeirdAnecdoteOver
So, while I firmly agree that in 2008 and 2009 there WERE performance concerns, the likes of which were discussed in those forums. Very valid; Deduplication in general was maturing, and I’m sure every product out there had similar problems (Data Domain, which scales based upon CPU, probably couldn’t perform as well with 4 year old CPUs as it can today with our super Nehalems etc). You need to realize it is 2011, we’re in an entirely new decade. Please stop quoting “Where’s the beef” or making “Hanging Chad” references like Ted Mosby in How I Met Your Mother because, while true at the time, they are not so applicable today.
Takeaway – HP typically finds 95 percent duplicate data in backup and deduplicates the data without impacting performance on the primary array.
I almost forgot the takeaway! (Hey! I’m verbose… You should know that by now!) So… what I’m hearing you say is… Because HP doesn’t have a native Primary Storage Deduplication solution like NetApp or EMC… there is no performance impact on the primary array! Hooray! Yea… WTF SEAN? I mean, I guess if I wanted I could repurpose most of this paper to position Avamar which seems a LOT more versatile than HP StoreOnce but okay, let’s move past!
I’m going to lump Challenge #4, #5 and #6 together because they have little to no place in this paper.
Challenge #4: One size fits all
Takeaway – Backup solutions are optimized for sequential data patterns and are purpose built. HP Converged Infrastructure delivers proven solutions. NetApp’s one-size-fits-all approach is ineffective in the backup and deduplication market.
Challenge #5: Backup applications and presentation
Takeaway – NetApp does not provide enough flexibility for today’s complex backup environments.
Challenge #6: Snapshots vs. backup
Takeaway – Snapshots are part of a data protection solution, but are incomplete by themselves. Long-term storage requirements are not addressed effectively by snapshots alone. HP Converged Infrastructure provides industry-leading solutions, including StoreOnce for disk-based deduplication for a complete data protection strategy.
I’m sorry, this is no contest and these points have absolutely no place in a paper educating on the merits and challenges of Deduplication with NetApp. This definitely has its place in a whole series of OTHER competitive and FUD based documents, but not here, not today.
Sean… (Yes I know your name!) You wrote this paper for HP right? As a Technologist and Technology Evangelist for that matter, I would absolutely LOVE to learn about the merits, the values, the benefits of what the HP StoreOnce D2D solution brings to market and can do to solve customers’ challenges. But honestly man, this paper? I COMPETE with NetApp and you pissed me off with your FUD slinging. I know *I* can piss off the competition when I sling (FACTS) so just think about it. We’re a fairly small community, we all know each other for the most part. (If you’re at Interop in a few weeks, I’ll be at EMCWorld, feel free to txt me and we can meet up and I won’t attack you, I promise ;)) Educate, but please do not release this kind of trash into the community… Beautiful, beautiful trash, mind you. I mean everything I said about how amazingly this was presented, honestly BEST WHITE PAPER EVER. But that has got to be some of the worst, most invalid content I’ve encountered in my life. (As applicable to how I stated it :))
I guess I should add a little commercial so someone doesn’t go WTF – I mean what I said above not only about the technologies which were discussed. If you think StoreOnce is a great solution, then you’ll be floored by Avamar and Data Domain. They’re not best of breed in the industry without good reason.
Feel free to comment as appropriate, it’s possible this has been exhausted in the past but SERIOUSLY I don’t want to see this again. ;)