Leading with the Unified EMC VNX – Watch, Look and Listen [Contest Entry too]

Hi guys! You’ve seen me produce videos in the past I’m sure, although it is not that often that *I* am actually IN the videos where you can see me, if even for a moment.  (You let me know if you like that, or not ;))

Through a series of events, I ended up recording this video.  For your reference: This IS a first draft, part of a never-ending process of regularly updating my delivery style.  [This was done using a little camera and a tiny mount, with not a whole lot of space on a whiteboard… if I went Pro for the next one… you’ll be all OMFG :)]

Initially, I had drawn up some notes of how I’d present this.. so in the spirit of WTF and full disclosure (You know me all too well ;)), these are those preliminary whiteboard notes.

Unisphere, Silos, FAST-VP, FAST Cache Ease of Use, Maximum Efficiency, Consistent Performance

I did have some other pictures which showed bits and pieces.. but I’m sure you get the gist of it!  So what exactly am I saying with this video, with these whiteboards.. with this.. you know, whatever it is I’m doing here. :)

Three things actually.   One of the cool things about the VNX is that it provides you with:

  • Ease of Use – With simplicity of management using Unisphere, vCenter Integration, Single Pane for all arrays
  • Maximum Efficiency – With FAST-VP, You can go Thin, and use the right tier at the right time with auto-tiering
  • Consistent Performance – The magic of EFDs, QOS, FAST Cache, Wide-Striping, Consolidate and guarantee SLAs

I know what you’re saying “OMFG CHRISTOPHER, ARE YOU IN MARKETING?” No. And I usually stand by that, because I get things done ;).  Yes, I intentionally chose fitting points which happen to coincide with the acronym EMC, you know why? Because for one thing it’s easier to remember. :)  Not to mention, things should truly be easy to work with, and enjoyable; all the while providing an accurate picture of the story at hand!

Look forward to a future series where I actually go into the guts of the system and show you (from my perspective) the hows and the whys of these things [Time permitting of course ;)]

So, for now, I leave you with this video.  Take it with a grain of salt, it IS a first-draft cutting room floor version.  As soon as I get my new mbp (April 1st?!) You should see some far better *Perhaps Annoying* videos coming from me ;)

And I know that some of you love me in videos sometimes… so for those of you who want to see me unplugged in an off-the-cuff impromptu interview (No, I’m not drunk), here is the infamous “I’m CXI, I’m from the Internets!” video!

I hope you enjoyed that.   Now, I’d love to hear your feedback. Where you say “Wow, I never thought of it that way” Or, “OMG You’re a total tool!” Yea, I’m game for either of those – Whatever you have to offer up, constructive or otherwise!

Oh, and I also submitted this for an internal contest (Which, if I win.. will give me a sweet wrap-around terrace suite at EMC World [Yes, you can all come hang out in my room ;)]). So, support away in every way possible! :)

Watch this space for more stupidity, err I mean great content in the future!

EMC Unisphere in your pocket! Announcing the UBER VSA 3! (Now with low sodium!)

You heard it here! Time to cut your blood pressure in half! (I apologize for those of you who already have low blood pressure.. this may put you over the top!)

UBERTastic : Celerra UBER VSA v3 – Unisphere -  Be sure to click this link here to get to the download links to pull down the OVA or Workstation version!

EMC Unisphere, now with less sodium!

So, Roxanne… what is new in this version of the VSA? As it appears that I’m practically stealing Nick’s entire post (which I’m cool with… ;))

  • DART is now 6.0.36.4
  • Unisphere management console (rocks!)
  • The Celerra VSA is now 64 bit! This means you can throw RAM at it for bigger setups and it will use it. Over 8GB becomes less beneficial without code changes to simulation services. Future updates from the Celerra VSA engineering teams will fix this.
  • The biggest and most difficult change to construct is that the configuration is now adaptive depending on the virtual machine setup. This version is now intelligent in seeing how many resources you have given it.
  • The new Celerra UBER VSA uses this intelligence to now allow *Thin* mode. If you give the VSA under 2GB of RAM it will automatically size the memory limits, processes, and management interface settings to allow it to run with as little as 1024MB of RAM. You won’t do replication or host a ton of VMs but you can use this mode to host a few and fully demonstrate/test the new Unisphere interface on even a 2GB laptop.
  • The new VSA also uses this intelligence to automatically allow the configuration of single or dual Data Mover version based on the memory assigned. If you give the VSA more than 4GB of memory you will be given the option to enable an additional Data Mover for use as a standby or load balancing experimentation. This means this single appliance can be a small lightweight NFS unit at 1024MB of RAM or can be a 2 Data Mover powerhouse at 8GB of RAM. All automatically configured on first boot through the wizard.
  • Automatic VMDK/Storage additions have been adjusted for the new 64 bit OS. This means this still works. Shut off the VM, add VMDK(s), turn it on and you have more space. Automagic.
  • Since automagic is so cool, I have changed the Data Mover Ethernet binding to be automatic also. The VM starts with 1 interface for management and 1 interface for the Data Movers. If you want more for the DM(s), just shut off the VM, add NICs (up to 6 additional), and turn it back on. It will automatically bind the Data Mover (yes it works with the 2 DM mode also) to the new interfaces and virtual slots. Just go back into Unisphere and assign away. This allows scale up for the bigger 2 Data Mover 8GB of RAM versions easily.
  • Configuration is now Perl/Bash based instead of just Bash to keep things cleaner and slicker and allow for some coolness later on ;)
  • NTP from the configuration portion of the wizard works correctly. It sets both the Control Station and all Data Movers and enables NTP as a running service. Make sure your NTP server is valid.
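To make the adaptive sizing in the list above concrete, here is a minimal sketch of the decision it describes. The thresholds (1024MB floor, under 2GB for Thin mode, more than 4GB for the optional second Data Mover, over 8GB being less beneficial) come from the list; the function name and the returned keys are hypothetical and are not taken from the actual VSA first-boot wizard.

```python
def vsa_profile(ram_mb: int) -> dict:
    """Map assigned RAM to the modes described in the feature list (illustrative only)."""
    if ram_mb < 1024:
        # The post names 1024MB as the lowest amount Thin mode runs with.
        raise ValueError("the post describes 1024MB as the floor for Thin mode")
    return {
        "thin_mode": ram_mb < 2048,                   # under 2GB -> Thin mode
        "second_data_mover_offered": ram_mb > 4096,   # more than 4GB -> optional 2nd Data Mover
        "diminishing_returns": ram_mb > 8192,         # over 8GB is less beneficial for now
    }

for ram in (1024, 4096, 8192):
    print(ram, vsa_profile(ram))
```

So a 1024MB laptop VM lands in Thin mode with a single Data Mover, while an 8GB VM gets offered the second Data Mover.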

    So let’s summarize:

    1. New Unisphere
    2. 64 Bit
    3. Automatic sizing
    4. Thin Mode
    5. Optional 2 Data Mover mode
    6. Automatic Data Mover Ethernet adding (along with fixed Storage [VMDK] adding)
    7. NTP works now

    Wow! That’s a whole lot! Where do I sign up to download?!?  UBERTastic : Celerra UBER VSA v3 – Unisphere – No signup required, just go click and download!  Because Nick has so many other vital details about the differences of THIS Uber VSA compared to Uber VSA’s in the past, I am referring you to his page so you can read the ‘technical’ details and stuff!  So go download the UBER VSA TODAY! (I am downloading it right now, literally.. )

    OMFG IM DOWNLOADING IT TOO

    I look forward to your feedback… and enjoyment of this tool, I know I’ve been waiting for some time for this myself!

    Data Longevity, VMware deduplication change over time, NetApp ASIS deterioration and EMC Guarantee

    Hey guys, the other day I was having a conversation with a friend of mine that went something like this.

    How did this all start you might say?!? Well, contrary to popular belief, I am a STAUNCH NetApp FUD dispeller.  What that means is, if I hear something said about NetApp by a competitor, peer, partner or customer which I feel is incorrect or just sounds interesting; I take it upon myself to prove/disprove it because well frankly… People still hit me up with NetApp questions all the time :) (And I’d like to make sure I’m supplying them with the most accurate and reflective data! – yea that’s it, and it has nothing to do with how much of a geek I am.. :))

    Well, in the defense of the video it didn’t go EXACTLY like that.   Here is a little background on how we got to where that video is today :)   I recently overheard someone say the following:

    What I hear over and over is that dedupe rates when using VMware deteriorate over time

    And my first response was “nuh uh!”, Well, maybe not my FIRST response.. but quickly followed by; “Let me try and get some foundational data”  because you know me… I like to blog about things and as a result collect way too much data to try to validate and understand and effectively say whatever I say accurately :)

    The first thing I did was engage several former NetApp folks who are as agnostic and objective as I am to get their thoughts on the matter (we were on the same page!)

    Data collection time!

    For Data Collection… I talked to some good friends of mine regarding how their Dedupe savings have been over time because they were so excited when we first enabled it in the first place (And I was excited for them!)   This is where I learned some… frankly disturbing things (I did talk to numerous guys named Mike interestingly enough, and on the whole all of those who I talked with and their data they shared with me reflected similar findings)

    Disturbing things learned!

    Yea I’ve heard all the jibber jabber before usually touted as FUD that NetApp systems will deteriorate over time in general (whether it be Performance, whether it be Space Savings) etc etc. 

    Well, some of the disturbing things learned, actually coming from the field on real systems protecting real production data, were:

    • Space Savings are GREAT, and will be absolutely amazing in the beginning! 70-90% is common… in the beginning. (Call this the POC and the burn-in period)
    • As that data starts to ‘change’ ever so slightly as you would expect your data to change (not sit static and RO) you’ll see your savings start to decrease, as much as 45% over a year
    • This figure is not NetApp’s fault.  Virtual machines (mainly what we’re discussing here) are not designed to stay uniformly the same, aligned to 4k blocks, no matter what.  The very fact that they change is absolutely normal, so this loss isn’t a catastrophe; it’s a fact of the longevity of data.
    • Virtual Machine data which is optimal for deduplication typically amounts to 1-5% of the total storage in the datacenter.   In fact if we want to lie to ourselves or we have a specific use-case, we can pretend that it’s upwards of 10%, but not much more than that.  And this basically accounts for Operating System, Disk Image, blah blah blah – the normal type of data that you would dedupe in the first place.
    • I found that particularly disturbing because after reviewing the data from these numerous environments… I had the impression VMware data would account for much more!   I saw a 50TB SAN only have ~2TB of data residing in Data stores and of that only 23% of it was deduplicating (I was shocked!)
    • I was further shocked when reviewing the data that, over the course of a year on a 60TB SAN, this customer only found 12TB of data they could justify running the dedupe process against, and of that they were seeing less than 3TB of ‘duplicate data’, coming in around 18% space savings over that 12TB.    The interesting bit is that the other 48TB of data just continued on unaffected by dedupe.   (Yes, I asked why don’t they try to dedupe it… and they did in the lab and, well it never made it into production)

    At this point, I was even more concerned.   Concerned whether there was some truth to this whole idea that NetApp starts really high in the beginning (Performance/IO way up there, certain datasets will have amazing dedupe ratios to start) and then starts to drop off considerably over time, while the EMC equivalent system performs consistently the entire time.

    Warning! Warning Will Robinson!

    This is usually where klaxons and red lights would normally go off in my head.    If what my good friends (and customers) are telling me is accurate, it is that not only will my performance degrade just by merely using the system, but my space efficiency will deteriorate over time as well.    Sure we’ll get some deduplication, no doubt about that!  But the long term benefit isn’t any better than compression (as a friend of mine had commented on this whole ordeal)    With the many ways of trying to look at this and understand I discussed it with my friend Scott who had the following analogy and example to cite with this:

    The issue that I’ve seen is this:

    Since a VMDK is a container file, the nature of the data is a little different than a standard file like a word doc for example.

    Normally, if you take a standard windows C: – like on your laptop, every file is stored as 4K blocks.  However, unless the file is exactly divisible by 4K (which is rare), the last block has just a little bit of waste in it.  Doesn’t matter if this is a word doc, a PowerPoint, or a .dll in the \windows\system32 directory, they all have a little bit of waste at the end of that last block.

    When converted to a VMDK file, the files are all smashed together because inside the container file, we don’t have to keep that 4K boundary.  Kind of like sliding a bunch of books together on a book shelf eliminating the wasted space.  Now this is one of the cool things about VMware that makes the virtual disk more space efficient than a physical disk – so this is a good thing.

    So, when you have a VMDK and you clone it – let’s say create 100 copies and then do a block based dedupe – you’ll get a 99% dedupe rate across those virtual disks.  That’s great – initially.  NetApp tends to calculate this “savings” into their proposals and tell customers that require 10TB of storage, that they can just buy 5TB and dedupe and then they’ll have plenty of space.

    What happens is, that after buying ½ the storage they really needed the dedupe rate starts to break down. Here’s why:

    When you start running the VMs and adding things like service packs or patches for example – well that process doesn’t always add files to the end of the vmdk.  It often deletes files from the middle, beginning, end and then  replaces them with other files etc.  What happens then is that the bits shift a little to the left and the right – breaking the block boundaries. Imagine adding and removing books of different sizes from the shelf and making sure there’s no wasted space between them.

    If you did a file per file scan on the virtual disk (Say a windows C: drive), you might have exactly the same data within the vmdk, however since the blocks don’t line up, the block based dedupe which is fixed at 4K sees different data and therefore the dedupe rate breaks down.

    A sliding window technology (like what Avamar does) would solve this problem, but today ASIS is fixed at 4K.
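Scott's argument can be sketched in a few lines. This is a toy model, not ASIS itself: it assumes the data really does shift inside the container the way he describes, uses random bytes as a stand-in for a virtual disk, and hashes fixed 4K blocks with SHA-256. The names `block_hashes` and `shared_fraction` are made up for illustration.

```python
import hashlib
import os

BLOCK = 4096  # fixed 4K block size, as described above

def block_hashes(data: bytes) -> list:
    """Hash each fixed-size 4K block of the data."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def shared_fraction(a: bytes, b: bytes) -> float:
    """Fraction of b's blocks whose hash also appears among a's blocks."""
    seen = set(block_hashes(a))
    hashes_b = block_hashes(b)
    return sum(h in seen for h in hashes_b) / len(hashes_b)

original = os.urandom(100 * BLOCK)   # stand-in for a "golden" virtual disk
clone = original                     # a fresh clone is byte-identical
# Simulate a patch that inserts 100 bytes just after the first block,
# shifting everything behind it off the 4K boundaries.
patched = original[:BLOCK] + b"x" * 100 + original[BLOCK:]

print(shared_fraction(original, clone))    # every block matches
print(shared_fraction(original, patched))  # only the untouched first block matches
```

The bytes are almost all still there in the patched copy, but because nothing after the insert lines up on a 4K boundary any more, a fixed-block comparison sees nearly all of it as new data. A variable-length (sliding window) scheme would re-find the matching runs.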

    Thoughts?

    If you have particular thoughts about what Scott shared there, feel free to comment and I’ll make sure he reads this as well; but this raises some interesting questions.   

    We’ve covered numerous things in here, and I’ve done everything I can to avoid discussing the guarantees I feel like I’ve talked about to death (linked below) so addressing what we’ve discussed:

    • I’m seeing on average 20% of a customer’s data which merits deduping and of that I’m seeing anywhere from 10-20% space saved across that 20%
    • Translation: 100TB of data, 20TB is worth deduping, reclaiming up to about 4TB of space in total; thus, even at the top of that range, you’d get about 4% space saved!
    • Translation: When you have a 20TB data warehouse and you go to dedupe it (You won’t) you’ll see no space gained, with a 100% cost across it.
    • With the EMC Unified Storage Guarantee, that same 20TB data warehouse will be covered by the 20% more efficient guarantee (Well, EVERY data type is covered without caveat)   [It’s almost like it’s a shill, but it really bears repeating because frankly this is earth shattering and worth discussing with your TC or whoever]
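For anyone who wants to plug in their own numbers, here is the arithmetic behind that translation as a small sketch. The inputs (100TB total, 20% worth deduping, 10-20% saved on that subset) are the field figures quoted above; the function name is just for illustration.

```python
def overall_savings(total_tb: float, dedupe_fraction: float, savings_rate: float):
    """Return (TB reclaimed, fraction of total capacity reclaimed)."""
    candidate_tb = total_tb * dedupe_fraction   # data that merits deduping at all
    saved_tb = candidate_tb * savings_rate      # space reclaimed within that subset
    return saved_tb, saved_tb / total_tb

for rate in (0.10, 0.20):
    saved, pct = overall_savings(100, 0.20, rate)
    print(f"{rate:.0%} saved on the dedupe-worthy 20TB -> {saved:.0f}TB, {pct:.0%} of the total")
```

With these inputs the whole-array savings land between 2% and 4%, which is the point: a big headline ratio on a small slice of data is a small number across everything you own.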

    For more great information on EMC’s 20% Unified Storage Guarantee – check out these links (and other articles I’ve written on the subject as well!)

    EMC Unified Storage is 20% more efficient Guaranteed

    I won’t subject you to it, especially because it is over 7 minutes long, but here is a semi funny (my family does NOT find it funny!) video about EMC’s Unified Storage Guarantee and making a comparison to NetApp’s Guarantee.   Various comments included in the description of the video – Don’t worry if you never watch it… I won’t hold it against you ;)

    Be safe out there, the data jungle is a vicious one!   If you need any help driving truth out of your EMC or NetApp folks feel free to reach out and I’ll do what I can :)


    EMC didn’t invent Unified Storage; They Perfected it

    Hi Guys! Remember me! I’m apparently the one who upset some of you, enlightened others; and the rest of you.. well, you drove a lot of traffic here to get my blog to even beat out EMC’s main website as the primary source for information on "Unified Storage" (And for that, I appreciate it :))

    In case any of you forgot some of those "target" posts, here they are for your reference! But I’m not here to start a fight! I’m here to educate, and to focus not on what this previously OVERLY discussed Unified Storage Guarantee was or is, but instead to drill down into what Unified Storage will really bring to bear.   So, without further ado!

    What is Unified Storage?

    I’ve seen a lot of definitions of what it is, quite frankly a lot of stupid definitions too. (My GOD I hate stupid definitions!)  But what does it mean to you and me when you Unify?   I could go on and on about the various ‘definitions’ of what it really is (and I even started WRITING that portion of it!) but instead I’m going to scrap all of that so I do not end up on my own list of ‘stupid definitions’ and instead will define Unified Storage in its simplest terms.

    A unified storage system merges NAS and SAN. Optimized for performance and interoperability, the system simultaneously stores both file data and blocks of application data in virtually any operating environment

    You can put your own take and spin on it, but at its guts that is seemingly what the basics of a "Unified Storage" system are; nothing special about it, NAS and SAN (hey, lots of people do that right?!)  You bet they do!   And this is in no way the definitive definition of what “Unified Storage” is, and frankly that is not my concern either.   So taking things to the next level; now that we have a baseline of what it takes to ‘get the job done’, it’s time to evaluate the Cost of Living in a Unified Storage environment.

    Unified Storage Architecture Cost of Living

    I get it.  No really I do.   And I’m sure by now you’re tired of the conversation of ‘uniqueness’ focused on the following core areas:

      • Support for Mixed Clients
      • Support for multiple types (tiers) of disk
      • Simplified Provisioning
      • Thin Provisioning
      • Improving Utilization

    All of these items are simply a FACT and an expectation when it comes to a Unified Platform.  (Forget unified, a platform in general)   Lack of support for multiple tiers, locking down to a single client, complicated provisioning which can only be done fat, which makes you lose out on utilization and likely is a waste of time – That my friend is the cost of living.    You’re not going to introduce a wasteful, fat, obsolete system and frankly, I’m not sure of any (many) vendors who are actually delivering services which don’t meet several of these criteria; So the question I’m asking is… Why do we continue to discuss these points?   I do not go to a car dealership and say “You know, I’m expecting a transmission in this car, you have a transmission right?”  And feel free to replace transmission with tires and other things you just flat out EXPECT.    It’s time to take the conversation to the next level though; because if you’ve ever talked to me you know how I feel about storage. “There is no inherent value of storage in and of itself without context or application.”   Thus… You don’t want spinning rust just for the sake of having it spin, no you want it to store something for you, and it is with that you need to invest in Perfection.

    Unified Storage Perfection

    What exactly is the idea of Unified Storage Perfection?   It is an epic nirvana whereby we shift from traditional thinking, take NAS and SAN out of the business of merely rusty spindles, and enable and engage the business to earn its keep.

    Enterprise Flash Disks

    Still storage, yet sexy in its own right.  Why?  First of all, it’s FAST OMFG FLASH IS SO FAST! And second of all, it’s not spinning, so it’s not annoying like the latest and greatest SAS, ATA or FC disk!    But what makes this particular implementation of EFD far sexier than simple consumer grade SSDs is the fact that these things will guarantee you a consistent speed and latency through and through.   I mean, sure it’s nice that these things can take the sheer number of FC disks you’d need to run an aggressive SQL server configuration and optimize the system to perform, but it goes beyond that.

    Fully Automated Storage Tiering (FAST)

    Think back to that high performance SQL workload you had a moment ago; there might come a time in the life of the business where your performance needs change. Nirvana comes a-knocking, and the power of FAST enables you to dynamically, non-disruptively move from one tier of Storage (EFD, FC, SATA) to another, so you are guaranteed not only investment protection but scalability which grows and shrinks as your business does.    Gone are the days of ‘buy for what we might use one day’ and welcome are the days of Dynamic and Scalable business.

    FAST Cache

    Wow, is this the triple whammy or what?  Building upon the previous two points, this realm of Perfection is able to take the performance and speed of Enterprise Flash Disks and the concept of tiering your disks to let you use those same existing EFD disks to extend your READ and WRITE cache on your array!    FAST Cache accelerates performance to address unexpected workload spikes. FAST and FAST Cache are a powerful combination, unmatched in the industry, that provides optimal performance at the lowest possible cost.  (Yes I copied that from a marketing thingie, but it’s true and is soooooo cool!) 

    FAST + FAST Cache = Unified Storage Performance Nirvana

    So, let’s put some common sense on this then, because this is no joke, nor is it marketing BS.    You assign EFDs to a specific workload you want to guarantee a certain speed and a certain response time (Win).    You have variable workloads which may need to be fast sometimes, but may be slow other times on a quarterly or yearly basis, so you leverage FAST to move that data around, but that’s your friend when you can PREDICT what is going to happen.    What about when it is slow most of the time, but then on June 29th you make a major announcement that you were not expecting to hit as hard as it did, and BAM! Your system goes in the tank because data sitting on FC or SATA couldn’t handle the load.   Hello FAST Cache, how I love you so.     Don’t get me wrong, I absolutely LOVE EFDs and I wish all of my data could sit on them (At home a lot of it does ;)) and I have massive desire for FAST because I CAN move my workload around based upon predictable or planned patterns (Marry me!)  But FAST Cache is my superman, because he is there to save the day when I least expected it, he caches my reads when BOOM I didn’t know it was coming, but more importantly he holds my massive load of WRITES which come in JUST as unexpectedly.   So for you naysayers or just confused ones who wonder why you’d have one vs the other (vs) the other; Hopefully this example use-case is valuable.   Think about it in terms of your business, you could get away with one or the other, or all three… Either way, you’re a winner.
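If it helps to see that use-case as a decision rather than a paragraph, here is a toy sketch. It only encodes the pairing described above (guaranteed response time, predictable swings, surprise spikes); the function and parameter names are invented for illustration and say nothing about how the array itself decides anything.

```python
def suggest_feature(needs_guaranteed_latency: bool, demand_is_predictable: bool) -> str:
    """Toy mapping of a workload to the feature the paragraph above pairs it with."""
    if needs_guaranteed_latency:
        return "EFD"          # pin the workload to flash for consistent speed
    if demand_is_predictable:
        return "FAST"         # move data between tiers around planned patterns
    return "FAST Cache"       # absorb read/write spikes nobody saw coming

print(suggest_feature(True, True))    # EFD
print(suggest_feature(False, True))   # FAST
print(suggest_feature(False, False))  # FAST Cache
```

In practice the three are not mutually exclusive, which is the whole point of running them together.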

    Block Data Compression

    EMC is further advancing its storage efficiency innovation as the first storage provider to introduce block data compression, by allowing customers to compress inactive data and reclaiming valuable storage capacity— data footprints can be reduced by up to 50 percent. A common use case would be compressing inactive data once EMC FAST software has moved that data to the most cost-effective storage tier. Block data compression joins EMC’s existing capabilities, including thin provisioning and data deduplication, to automatically and transparently maximize storage utilization.

    Yea, I DID copy that verbatim from a Press Release – And do you know why? Because it’s right! Even addresses a pretty compelling use-case too!   So think about it a moment.  Does this apply to you?  I’d never compress ALL of my data (reminisces back to the days of DoubleSpace where let’s just say, for any of us who lived it… those were interesting times ;)) But think about the volume of data which you have sitting on Primary Storage which is inactive and otherwise wasting space when it continues sitting un-accessed and consuming maximum capacity!  But this is about more than just that data type; unlike some solutions, this is not an all or nothing.

    Think if you could choose to compress on demand! Compress say… your virtual machine right out of vCenter! But wait there’s more!   And there’s so much more to say on this, let alone the things which are coming.. I don’t want to reveal what is coming, so I’ll let Mark Twomey do it where he did it here:  Storage Services for Clariion Storage Pool LUNs

    What does all of this mean for me and Unified Storage?!

    Whoa, hey now! What do you mean what does all of this mean?! Are you cutting me short?  Yes.  Yes I am. :)   There are some cool things coming, which I cannot talk about yet… not to mention some of the new stuff coming in Q3 – But the things I was talking about… that’s stuff I can talk about –TODAY- there’s only even better things and cake coming tomorrow :)

    I can fill this with videos, decks, resources, references, Unisphere and everything under the sun (You let me know if you really want that.. I’ve done that in the past as well)  But ideally, I want you to make your own decision, come to your own conclusions..  What does this mean for you?   Stop asking “What is Unified Storage” and start asking “What value can my business derive from technologies in order to save money, save time, save waste!”    I’ll try to avoid writing yet another article on this subject unless you so demand it! I look forward to all of your comments and feedback! :)

    EMC 20% Unified Storage Guarantee: Final Reprise

    Hi! You might remember me from such blog posts as: EMC 20% Unified Storage Guarantee !EXPOSED! and the informational EMC Unified Storage Capacity Calculator – The Tutorial! – Well, here I’d like to bring to you the final word on this matter! (Well, my final word.. I’m sure well after I’m no longer discussing this… You will be, which is cool, I love you guys and your collaboration!)

    Disclaimer: I am in no way saying I am the voice of EMC, nor am I assuming that Mike Richardson is in fact the voice of NetApp, but I know we’re both loud, so our voices are heard regardless :)

    So on to the meat of the ‘argument’ so to speak (That’d be some kind of vegan meat substitute being that I’m vegan!)

    EMC Unified Storage Guarantee

    Unified Storage Guarantee - EMC Unified Storage is 20% more efficient. Guaranteed.

    I find it’d be useful if I quote the text of the EMC Guarantee, and then as appropriate drill down into each selected section in our comparable review on this subject.

    It’s easy to be efficient with EMC.

    EMC® unified storage brings efficiency to a whole new level. We’ve even created a capacity calculator so you can configure efficiency results for yourself. You’ll discover that EMC requires 20% less raw capacity to achieve your unified storage needs. This translates to superior storage efficiency when compared to other unified storage arrays—even those utilizing their own documented best practices.

    If we’re not more efficient, we’ll match the shortfall

    If for some unlikely reason the capacity calculator does not demonstrate that EMC is 20% more efficient, we’ll match the shortfall with additional storage. That’s how confident we are.

    The guarantee to end all guarantees

    Storage efficiency is one of EMC’s fundamental strengths. Even though our competitors try to match it by altering their systems, turning off options, changing defaults or tweaking configurations—no amount of adjustments can counter the EMC unified storage advantage.

    Here’s the nitty-gritty, for you nitty-gritty types
    • The 20% guarantee is for EMC unified storage (file and block—at least 20% of each)
    • It’s based on out-of-the-box best practices
    • There’s no need to compromise availability to achieve efficiency
    • There are no caveats on types of data you must use
    • There’s no need to auto-delete snapshots to get results

    This guarantee is based on standard out-of-the-box configurations. Let us show you how to configure your unified storage to get even more efficiency. Try our capacity calculator today.

    Okay, now that we have THAT part out of the way.. What does this mean? Why am I stating the obvious (so to speak)  Let’s drill this down to the discussions at hand.

    The 20% guarantee is for EMC unified storage (file and block—at least 20% of each)

    This is relatively straight-forward.  It simply says “Build a Unified Configuration – which is Unified” SAN is SAN, NAS is NAS, but when you combine them together you get a Unified Configuration! – Not much to read into that.  Just that you’re more likely to see the benefit of 20% or greater in a Unified scenario than you are in a comparable SAN or NAS only scenario.
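One way to read the "at least 20% of each" wording is as a simple mix check on the configuration. The sketch below is my interpretation only: it assumes the share is measured against combined file plus block capacity, which the guarantee text quoted here does not spell out, and the function name is hypothetical.

```python
def qualifies_as_unified(file_tb: float, block_tb: float, minimum_share: float = 0.20) -> bool:
    """True when both file and block make up at least `minimum_share` of the total."""
    total = file_tb + block_tb
    if total <= 0:
        return False
    return min(file_tb, block_tb) / total >= minimum_share

print(qualifies_as_unified(20, 80))   # 20% file, 80% block
print(qualifies_as_unified(10, 90))   # only 10% file
```

Under that reading a 20/80 split is in scope, while a 10/90 split is really a SAN configuration with a little NAS on the side.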

    It’s based on out-of-the-box best practices

    I cannot stress this enough.   Out-Of-Box Best practices.   What does that mean?    Universally, I can build a configuration which will say to this “20% efficiency guarantee”: “Muhahah! Look what I did! I made this configuration which CLEARLY is less than 20%! Even going into the negative percentile! I AM CHAMPION GIVE ME DISK NOW!”   Absolutely.  I’ve seen it, and heard it touted (Hey, even humor me as I discuss a specific use-case which Mike Richardson and I have recently discussed.)    But what is being proposed here is using your company’s subscribed best practices (and out of box configurations), not building a one-off configuration which makes your numbers appear ‘more right’.   If it weren’t for best practices we’d have R0 configurations spread across every workload, with every feature and function under the sun disabled to say ‘look what I can doo!’

    So, I feel it is important to put this matter to bed (because so many people have been losing their time and sleep over this debate and consideration)  I will take this liberty to quote from a recent blog post by Mike Richardson – Playing to Lose, Hoping to Win: EMC’s Latest Guarantee (Part 2)    In this article written by Mike he did some –great- analysis.  We’re talking champion.  He went through and used the calculator, built out use-cases and raid groups, really gave it a good and solid run through (which I appreciate!)   He was extremely honest, forthright and open and communicative about his experience, configuration and building this out with the customer in mind.   To tell you the truth, Mike truly inspired me to follow-up with this final reprise.

    Reading through Mike’s article I would like to quote (in context) the following from it:

    NetApp Usable Capacity in 20+2 breakdown

    The configuration I recommend is to the left.  With 450GB FC drives, the maximum drive count you can have in a 32bit aggr is 44.  This divides evenly into 2 raidgroups of 20+2.  I am usually comfortable recommending between 16 and 22 RG size, although NetApp supports FC raidgroup sizes up to 28 disks.  Starting with the same amount of total disks (168 – 3 un-needed spares), the remaining disks are split into 8 RAID DP raidgroups. After subtracting an additional 138GB for the root volumes, the total usable capacity for either NAS or SAN is just under 52TB.

    I love that Mike was able to share this image from the Internal NetApp calculator tool (It’s really useful to build out RG configurations) and it gives a great breakdown of disk usage.

    For the sake of argument for those who cannot make it out from the picture, what Mike has presented here is a 22 disk RAID-DP RG (20+2 disks – Made up of 168 FC450 disks with 7 spares) I’d also like to note that snapshot reserve has been changed from the default of 20% to 0% in the case of this example.
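For the numerically inclined, here is a quick back-of-the-napkin sketch of why RG size matters so much to a capacity number. It is my own illustration, not anything out of Mike's calculator: the function name is made up, and it assumes a double-parity (RAID-DP style) group of identical disks while ignoring spares, right-sizing, root volumes and every other real-world overhead.

```python
def parity_overhead(rg_size: int, parity_disks: int = 2) -> float:
    """Fraction of a RAID group's disks consumed by parity.

    Hypothetical helper for illustration only: assumes identical disks
    and ignores spares, right-sizing and filesystem overhead.
    """
    if rg_size <= parity_disks:
        raise ValueError("a RAID group needs at least one data disk")
    return parity_disks / rg_size


# Compare the RG sizes discussed in this post: 14+2, 20+2 and 26+2.
for rg_size in (16, 22, 28):
    data_disks = rg_size - 2
    print(f"{data_disks}+2 ({rg_size} disks): "
          f"{parity_overhead(rg_size):.1%} of disks go to parity")
```

Under those assumptions a 14+2 group gives up 12.5% of its disks to parity while a 20+2 group gives up roughly 9.1%, so simply stretching the RG buys back a few points of 'efficiency' before any other setting is touched.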

    Being that I do not have access to the calculator tool which Mike used, I used my own spreadsheet-run calculator which more or less confirms that what Mike’s tool is saying is absolutely true!   But this got me thinking!    (Oh no! Don’t start thinking on me now!)    And I was curious.   Hey, sure this deviates from best practices a bit, right? But BP’s change at times, right?

    So being that I rarely like to have opinions of my own, and instead like to base it on historical evidence founded factually and referenced in others… I sent the following txt message to various people I know (Some former NetAppians, some close friends who manage large scale enterprise NetApp accounts, etc. (etc. is for the protection of those I asked ;)))

    The TXT Message was: “Would you ever create a 20+2 FC RG with netapp?”

    That seems pretty straightforward.   Right? Here is a verbatim summation of the responses I received.

    • Sorry, I forgot about this email.  To be brief, NO.
    • “It depends, I know (customer removed) did 28, 16 is the biggest I would do”
    • I would never think to do that… unless it came as a suggestion from NetApp for some perfemance reasons… (I blame txting for typo’s ;))
    • Nope we never use more then 16
    • Well rebuild times would be huge.

    So, sure this is a small sampling (of the responses I received) but I notice a resonating pattern there.   The resounding response is a NO.   But wait, what does that have to do with a hole in the wall?   Like Mike said, NetApp can do RG sizes of up to 28 disks.   Also absolutely 100% accurate, and in a small number of use-cases I have found situations in which people have exceeded 16 disk RG’s.   So, I decided to do a little research and see what the community has said on this matter of RG sizes. (This happened out of trying to find a Raid6 RG Rebuild Guide – I failed)

    I found a few articles I’d like to reference here:

    • Raid Group size 8, 16, 28?

      • According to the resiliency guide Page 11:

        NetApp recommends using the default RAID group sizes when using RAID-DP.

      • Eugene makes some good points here –

        • All disks in an aggregate are supposed to participate in IO operations.  There is a performance penalty during reconstruction as well as risks; "smaller" RG sizes are meant to minimize both.

        • There is a maximum number of data disks that can contribute space to an aggregate for a 16TB aggregate composed entirely of a given disk size, so I’ve seen RG sizes deviate from the recommended based on that factor (You don’t want/need a RG of 2 data+2parity just to add 2 more data disks to an aggr….). Minimizing losses to parity is not a great solution to any capacity issue.

        • my $0.02.

      • An enterprise account I’m familiar with has been using NetApp storage since F300 days and they have tested all types of configurations and have found performance starts to flatline after 16 disks.  I think the most convincing proof that 16 is the sweet spot is the results on spec.org.  NetApp tests using 16 disk RAID groups.

    • Raid group size recommendation

      • Okay, maybe not the best reference considering I was fairly active in the response on the subject in July and August of 2008 in this particular thread.  Though read through it if you like, I guess the best takeaway I can get from it (which I happened to have said…) is:
        • I was looking at this from two aspects: Performance, and long-term capacity.
        • My sources for this were a calculator and capacity documents.
        • Hopefully this helped bring some insight into the operation and my decisions around it.
          • (Just goes to show… I don’t have opinions… only citeable evidence. Well, and real world customer experiences as well ;))
    • Raid group size with FAS3140 and DS4243
      • I found this in the DS4243 Disk Shelf Technical FAQ document
      • WHAT ARE THE BEST PRACTICES FOR CONFIGURING RAID GROUPS IN FULLY LOADED CONFIGURATIONS?
      • For one shelf: two RAID groups with maximum size 12. (It is possible in this case that customers will configure one big RAID group of size 23–21 data and 2 parity; however, NetApp recommends two RAID groups).
    • Managing performance degradation over time
    • Aggregate size and "overhead" and % free rules of thumb.
    • Why should we not reserve Snap space for SAN volumes?
      • All around good information, conversation and discussion around filling up Aggr’s – No need to drill down to a specific point.

    So, what does all of this mean other than the fact that I appear to have too much time on my hands? :)

    Well, to sum up what I’m seeing, and considering we are in the section titled ‘out of box best practices’:

    1. Best Practices and recommendations (as well as expert guidance and general use) seem to dictate a 14+2, 16 disk RG
      1. Can that number be higher?  Yes, but that would serve to be counter to out-of-box best practices, not to mention it seems your performance will not benefit as seen in the comments mentioned above (and the fact that spec.org tests are run in that model)
    2. By default the system will have a reserve, and not set to 0% – so if I were to strip out all of the reserve which is there for a reason – my usable capacity will go up in spades, but I’m not discussing a modified configuration; I’m comparing against a default, out-of-box best practices configuration, which by default calls for a 5% aggr snap reserve, 20% vol snap reserve for NAS and a SAN Fractional Reserve of 100%
      1. Default Snapshot reserve, and TR-3483 helps provide backing information and discussion around this subject. (Friendly modifications from Aaron Delp’s NetApp Setup Cheat Sheet)
    3. In order to maintain these ‘out of box best practices’ and enable for a true model of thin provisioning (albeit, not what I am challenging here, especially being that Mike completely whacked the reserve space for snapshots) – Nonetheless… in our guarantee side of the house we have the ‘caveat’ of “There’s no need to auto-delete snapshots to get results” – Which is simply saying, even if you were to have your default system out of box, in order to achieve, strive and take things to the next level you would need to enable “Volume Auto-Grow” on NetApp, or its sister function “Snap Auto Delete”.  The first of these is nice as it’s not disruptive to your backups, but you can’t grow when you’ve hit your peak! So your snapshots would then be at risk.   Don’t put your snapshots at risk!
    4. Blog posts are not evidence for updating of Best Practices, nor do they change your defaults out of box.   What am I talking about here?  (Hi Dimitris!)   Dimitris wrote this –great- blog post NetApp usable space – beyond the FUD whereby he goes into the depth and discussion of what we’ve been talking about these past weeks.  He makes a lot of good points, and even goes so far as to validate a lot of what I’ve said, which I greatly appreciate.    But taking things a little too far, he ‘recommends’ snap reserve 0, fractional reserve 0, snap autodelete on, etc.    As a former NetApp engineer I would strongly recommend a lot of ‘changes’ to the defaults and the best practices as the use-case fit, however I did not set a holistic “Let’s win this capacity battle at the cost of compromising my customers’ data”.   And by blindly doing exactly what he suggested here, you are indeed putting your data integrity and recovery at risk.
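To put some rough numbers behind the defaults-versus-zeroed-reserves point, here is a minimal sketch of my own. The function name and the 100 TB starting figure are purely hypothetical, and the model is a simplification: it takes the 5% aggregate snap reserve off the top and then the 20% volume snap reserve off what is left, as you would see on the NAS side. It deliberately leaves the SAN fractional reserve out, since how that plays out depends on LUN and snapshot usage.

```python
def usable_after_reserves(capacity_tb: float,
                          aggr_snap_reserve: float = 0.05,
                          vol_snap_reserve: float = 0.20) -> float:
    """Capacity left for data after the two snapshot reserves.

    Hypothetical helper for illustration only. Assumes the volume snap
    reserve is taken from whatever the aggregate snap reserve leaves
    behind, and ignores fractional reserve and all other overheads.
    """
    for reserve in (aggr_snap_reserve, vol_snap_reserve):
        if not 0.0 <= reserve < 1.0:
            raise ValueError("reserves must be fractions in [0, 1)")
    after_aggr = capacity_tb * (1.0 - aggr_snap_reserve)
    return after_aggr * (1.0 - vol_snap_reserve)


starting_tb = 100.0  # made-up figure, not taken from any calculator

defaults = usable_after_reserves(starting_tb)
zeroed = usable_after_reserves(starting_tb, 0.0, 0.0)
print(f"out-of-box defaults: {defaults:.1f} TB")
print(f"reserves set to 0:   {zeroed:.1f} TB")
```

In this simplified model the out-of-box defaults leave about 76% of the starting capacity for data, while zeroing both reserves hands all of it back, which is exactly why the choice of defaults versus a tuned configuration moves the comparison so much.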

    I’ve noticed that.. I actually covered all of the other bullet points in this article without needing to actually drill into them separately.  :) So, allow me to do some summing up on this coverage.

    If we compare an EMC RAID6 Configuration to a NetApp RAID-DP Configuration, with file and block (at least 20% of each) using out of box default best practices, you will be able to achieve no compromise availability, no compromise efficiency regardless of data type, with no need to auto-delete your snapshots to gain results.   So that’s a guarantee you can write home about, 20% guaranteed in ‘caveats’ you can fit into a single paragraph (and not a 96 page document ;))

    Now, I’m sure, no.. Let me give a 100% guarantee… that someone is going to call ‘foul’ on this whole thing, and this will be the hot-bed post of the week, I completely get it.   But what you, the reader, are really wondering is “Yeah, 20% Guarantee.. Guarantee of what? How am I supposed to learn about Unified?”

    Welcome to the EMC Unified Storage – Next Generation Efficiency message!

    Welcome to the EMC Unisphere – Next Generation Storage Management Simplicity

    I mean, obviously once you’re over the whole debate of ‘storage, capacity, performance’ you want to actually be able to pay to play (or, $0 PO to play, right? ;))

    But I say.. Why wait?  We’re all intelligent and savvy individuals.  What if I said you could in the comfort of your own home (or lab) start playing with this technology today with little effort on your behalf.     I say, don’t wait.   Go download now and start playing.

    For those of you who are familiar with the infamous Celerra VSA as published in Chad’s blog numerous times (New Celerra VSA (5.6.48.701) and Updated “SRM4 in a box” guide), things have recently gone to a whole new level with the introduction of Nicholas Weaver’s UBER VSA!  Besser UBER : Celerra VSA UBER v2 – Which takes the ‘work’ out of set up.  In fact, all set up requires is an ESX Server, VMware Workstation, or VMware Fusion (or in my particular case, I do testing on VMware Viewer to prove you can do it) and BAM! You’re ready to go and you have a Unified array at your disposal!

    Celerra VSA UBER Version 2 – Workstation
    Celerra VSA UBER Version 2 – OVA (ESX)

    Though I wouldn’t stop there, if you’re already talking Unified and playing with File data at all, run don’t walk to download (and play with) the latest FMA Virtual Appliance! Get yer EMC FMA Virtual Appliance here!

    Benefits of Automated File Tiering/Active Archiving

    But don’t let silly little PowerPoint slides tell you anything about it, listen to talking heads on YouTube instead :)

    I won’t include all of the videos here, but I adore the way the presenter in this video says ‘series’ :) – But, deep dive and walk through in FMA in Minutes!

      Okay! Fine! I’ve downloaded the Unified VSA, I’ve checked out FMA and seen how it might help.. but how does this help my storage efficiency message? What are you trying to tell me?  If I leave you with anything at this point, let’s break it down into a few key points.

      • Following best practices will garner you a 20% greater efficiency before you even start to get efficient with technologies like Thin Provisioning, FAST, FAST Cache, FMA, etc.
      • With the power of a little bandwidth, you’re able to download fully functional Virtual Appliances to allow you to play with and learn the Unified Storage line today.
      • The power of managing your File Tiering architecture and Archiving policy is at your finger tips with the FMA Virtual Appliance.
      • I apparently have too much time on my hands.  (I actually don’t… but it can certainly look that way :))
      • Talk to your TC, Rep, Partner (whoever) about Unified.   Feel free to reference this blog post if you want. If there is nothing else to learn from this, I want you – the end user – to be educated :)
      • I appreciate all of your comments, feedback, positive and negative commentary on the subject.  I encourage you to question everything, me, the competition, the FUD and even the facts.   I research first, ask questions, ask questions later and THEN shoot.    The proof is in the pudding.  Or in my case, a unique form of Vegan pudding.

      Good luck out there, I await the maelstrom, the fun, the joy.   Go download some VSA’s, watch some videos, and calculate, calculate, calculate!   Take care! – Christopher :)