Why SSD-based arrays and storage appliances can be a good idea (Part I)

Part 1 of a good post from Greg Schulz (thank you)

Robin Harris (aka @storagemojo) recently asked a question in a blog post, arguing that solid-state devices (SSDs) using SAS or SATA interfaces in traditional hard disk drive (HDD) form factors are a bad idea in storage arrays (e.g. storage systems or appliances). My opinion is that, as with many things about storing, processing or moving binary digital data (e.g. 1s and 0s), the answer is not always clear. There may not be a right or wrong answer; instead it depends on the situation, use or perhaps abuse scenario. For some applications or vendors, adding SSDs packaged in HDD form factors to existing storage systems, arrays and appliances makes perfect sense; for others it does not, thus it depends (more on that in a bit). While we are talking about SSD, Ed Haletky (aka @texiwill) recently asked a related question, Fix the App or Add Hardware, which could easily be morphed into a discussion of Fix the SSD, or Add Hardware. Hmmm, maybe a future post idea exists there.

Let’s take a step back for a moment and look at the bigger picture of what prompts the question of which type of SSD to use where and when, as well as why various vendors want you to look at things a particular way. There are many options for using SSD packaged in various ways to meet diverse needs, including here and here (see figure 1).

Figure 1: Various packaging and deployment options for SSD

The growing number of startup and established vendors with SSD-enabled storage solutions vying to win your hearts, minds and budget is looking like the annual NCAA basketball tournament (aka March Madness and march metrics here and here). Some vendors have added, or are adding, SSDs with SAS or SATA interfaces that plug into existing enclosures (drive slots). These SSDs have the same form factor as 2.5-inch small form factor (SFF) or 3.5-inch HDDs, with a SAS or SATA interface for physical and connectivity interoperability. Other vendors have added PCIe-based SSD cards to their storage systems or appliances as a cache (read, or read and write) or as a target device, similar to how these cards are installed in servers.

Simply adding SSD, either in a drive form factor or as a PCIe card, to a storage system or appliance is only part of a solution. Sure, the hardware should be faster than a traditional spinning-HDD-based solution. However, what differentiates the various approaches and solutions is what is done with the storage system’s or appliance’s software (aka operating system, storage applications, management, firmware or microcode).

So are SSD based storage systems, arrays and appliances a bad idea?

If you are a startup or an established vendor able to start from scratch with a clean-sheet design, not having to worry about interoperability and customer investment protection (technology, people skills, software tools, etc.), then you would want to do something different. For example, leverage off-the-shelf components such as a PCIe flash SSD card in an industry-standard server, combined with your software, for a solution. You could also use extra DRAM memory in those servers combined with PCIe flash SSD cards, perhaps even with embedded HDDs as a backing or preservation medium.
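
As a rough illustration of that read path (DRAM in front of a PCIe flash tier, with HDDs as the backing or preservation medium), here is a minimal sketch in Python; the class, tier sizes and block names are mine for illustration, not any particular vendor's design:

from collections import OrderedDict

class TieredReadPath:
    """Toy read path: DRAM cache -> flash tier -> HDD backing store."""

    def __init__(self, dram_slots, flash_slots, hdd):
        self.dram = OrderedDict()      # small, fastest tier (LRU ordering)
        self.flash = OrderedDict()     # larger persistent cache (LRU ordering)
        self.dram_slots = dram_slots
        self.flash_slots = flash_slots
        self.hdd = hdd                 # dict-like backing store (slowest tier)

    def read(self, block):
        for tier in (self.dram, self.flash):
            if block in tier:
                tier.move_to_end(block)            # refresh LRU position on a hit
                return tier[block]
        data = self.hdd[block]                     # miss: fetch from the backing store
        self._promote(self.flash, self.flash_slots, block, data)
        self._promote(self.dram, self.dram_slots, block, data)
        return data

    @staticmethod
    def _promote(tier, slots, block, data):
        tier[block] = data
        if len(tier) > slots:
            tier.popitem(last=False)               # evict the least recently used block

# Example: path = TieredReadPath(dram_slots=32, flash_slots=1024, hdd=block_store)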

Read on here

 


15 Comments

  1. No worries Roger, thanks for sharing with your visitors, hope all is well.

    Cheers
    gs

  2. Well. I don’t really see any points here for engineers who architect enterprise systems. Yeah, you can use SSDs in different ways.
    It would be interesting to hear from people who made a tough choice and how it worked out for them. For example, you have a SAN with a bunch of SSD disks running Oracle.
    What would be better:
    Solution 1: Allocate the SSDs as a fast cache.
    Solution 2: Allocate the SSDs for automatically tiered storage and let the SAN handle spikes.
    Solution 3: The old hardcore DBA way – measure I/O requests and put the Oracle tables with the most I/O pressure on the SSDs.
    Can anyone comment on which way you selected? We selected a mixed approach.
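
    (To make solution 3 concrete: the manual approach is basically to rank objects by I/O pressure per GB and pack the hottest ones into whatever SSD capacity you have. A rough sketch in Python, with invented object names and numbers:)

def pick_ssd_candidates(objects, ssd_capacity_gb):
    """Greedy 'old DBA way': place the objects with the highest IO density on SSD first.

    objects: list of (name, size_gb, io_per_sec) tuples, e.g. gathered from
    whatever instrumentation, AWR reports or OS tools you trust.
    """
    ranked = sorted(objects, key=lambda o: o[2] / o[1], reverse=True)  # IO per GB
    placed, used = [], 0.0
    for name, size_gb, _iops in ranked:
        if used + size_gb <= ssd_capacity_gb:
            placed.append(name)
            used += size_gb
    return placed

# Illustrative numbers only
hot = [("REDO_LOGS", 20, 4500), ("ORDERS_IDX", 60, 3200),
       ("ORDERS", 900, 5100), ("TEMP", 80, 2600), ("HISTORY", 2000, 300)]
print(pick_ssd_candidates(hot, ssd_capacity_gb=200))  # ['REDO_LOGS', 'ORDERS_IDX', 'TEMP']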

    • Hi

      In my opinion it really depends. As an old DBA (Oracle 7 & 8) I know where and who my “usual suspects” are. For that, solution 3 makes sense (but you need to know what you are doing). Personally I like solution 1 – get the SSD as close as you can to the CPU (PCIe cards). Solution 2 can make sense for some workloads, but it’s rather static (how fast will it react to spikes in load?); mixing the solutions makes sense!

      -Roger

      • Now I see that experience is talking. Yes, we did a mix of 1 and 3. It is a very interesting subject. Some vendors are saying: we have such smart software to manage data, forget your old-time DBA, throughput and I/O view; just make a pool with automatic tiering and enjoy.
        Well… sorry… the younger guys would probably jump on it easily and not even think about how it works under the hood. Guys like you and I would not trust the technology. I think it is a religious war. In my 25 years in technology I have seen many failures and heavy mistakes. I guess you cannot teach an old dog new tricks. Not that they are hard, we just don’t like them…

      • The same here… I trust technology only to a certain point. There is also intuition, experience, etc. 26 years in IT don’t leave you unmarked :-)

  3. “Well. ”

    Ok, good points and discussion; however, also realize that this was the first in a series of posts, linking to and referencing other content that DBAs or IO or storage engineers who do not know the technology can use to learn what to use, where, when and why.

    However, to address your points: from my experience, which is limited to 30+ years as an applications programmer/developer, business systems analyst, systems programmer, systems admin, performance and capacity planning analyst, infrastructure architect, and IO and storage performance systems engineer working with, using and selling SSD across different databases and apps (sorry, while I have helped many DBAs, I never aspired to the title ;), as Roger pointed out, “It depends”…

    For example, a couple of decades ago on a non-mainframe system, as a launch customer for a new SSD solution, we (a team of applications people, DBAs, sys progs and system admins) profiled the applications and their IO characteristics (reads, writes, size, response time, etc.), including adding instrumentation to code to collect what was not otherwise available via OS, third-party, DB or other tools. This information was used to identify hot spots and to help determine how to configure storage (SSD and disk), what files or applications to put on SSD vs. HDD, and so forth. We found that the easiest thing to do was to put the entire application onto SSD; however, it was also the most expensive. In that situation the solution was to place application as well as database logs, journals, queue or message files, scratch and temporary space, plus some indices and a few small yet very frequently accessed tables onto SSD. In other words, we stretched the SSD IO capabilities further, while putting larger yet less frequently accessed tables on larger-capacity, lower-cost HDDs, with semi-active and reasonably sized tables and other data on fast HDDs.
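
    (The instrumentation piece today might look something like the sketch below; a hypothetical helper for illustration, not the tooling we used back then:)

import time
from collections import defaultdict
from functools import wraps

io_stats = defaultdict(list)   # label -> list of observed latencies in seconds

def timed_io(label):
    """Wrap an IO-heavy call so its latency gets recorded under a label."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                io_stats[label].append(time.perf_counter() - start)
        return wrapper
    return decorate

@timed_io("orders_lookup")
def read_order(path, offset, length):
    with open(path, "rb") as f:    # stand-in for the real data access call
        f.seek(offset)
        return f.read(length)

def report():
    for label, samples in io_stats.items():
        print(label, len(samples), "calls, avg", sum(samples) / len(samples), "seconds")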

    Moving on from being a customer: as a vendor who both manufactured and OEMed SSD, I helped many sys admins, storage architects, DBAs and others address various issues that again took different configuration paths and approaches depending on the specific needs. This involved various databases from Oracle to DB2/UDB to Postgres to Sybase to MUMPS (ok, some will not call that a DB, nor would they consider Pick in that category ;) among others…

    Some approaches used SSD in arrays, some used SSD as a device or as a caching appliance, some used RAM caches and other approaches; however, the common theme was it depends…. I have also encountered situations where the customer thought SSD was needed, only to take a step back, use various tools (if you are an experienced DBA you know which ones to use ;) and find a problem query that, once fixed, eliminated or delayed the need for SSD. In some of those cases the solution was actually to use SSD as a means of buying time while other application, query or DB changes or enhancements were made that, in the end, made even better use of the hardware while enabling the software to be more productive as well.

    Thus, as Roger points out “it all depends…”

    I could go on, but I have to get back to some other projects right now; however, if you have specific questions, I would be happy to address them. After all, you can teach an old dog new tricks, that is if the old dog wants to keep learning; likewise you can teach a new dog old tricks, that is if they want to…

    Btw, if you are planning on being in the Netherlands the week of May 7th (e.g. in a little over a week), I’m going to be doing a series of seminars in different venues that among other things will discuss SSD including what to use, when, where, why and how to address different issues including databases. Learn more here: http://storageioblog.com/?p=2874

    Cheers gs

    • Another page of blah blah blah from 30+ years of experience and self-promotion.
      Let me narrow down this “it depends”. My question was: should we trust auto-tiering, or manually identify hot spots and stick them on SSDs? Does anyone have an opinion on that? I agree with Roger: “I trust technology to a certain point.” I would argue that auto-tiering logic is still fresh.
      Nice blah blah blah I heard from Confluence recently: “Our software is so smart and fast that you don’t need fast cache.” Who are they kidding??? Just more marketing fluff from a theorist…

  4. @mrmichaelpetrov call me a glutton for punishment, so let us try this again.

    I too am not in marketing or sales; by education, training, and experience I am an engineer. Today I work with marketing, sales, engineers, architects and others on practice and implementation as well as theory (hope that is not more blah blah blah).

    I am not sure what your concern or focus on traffic is about; I too like facts and numbers and digging in, as well as sharing that information (where possible), which is what I do with my posts. However, looking at facts and figures and using metrics and measurements helps me figure out where to take some time and interact in different venues such as this one ;).

    Speaking of my posts, I am not a paid blogger; you might have me confused with someone else. I do posts on my own time as a means of sharing information, stimulating discussion, listening, learning, exchanging ideas and so forth. Again, hope that is not more blah blah blah; ok, enough with that, let’s try to move on.

    Good to hear your experiences with the various technologies. When you tout IOPS, out of curiosity what size are they, are they reads or writes, random or sequential, and where/how were they measured?

    Note that these questions are out of curiosity, to learn more about what you are doing and how you are doing it, and to make comparisons to other situations, e.g. the things techies or engineers would want to know and dig into. For example, there is a storage solution in the market (not EMC) that touts high performance and many IOPS, yet has lousy latency; there is another solution that touts SSD as making it fast, yet its latency with SSD is about the same as some other vendor’s controllers with HDDs. Hence my curiosity about what the workload/activity looked like (IO size, r/w, latency, etc.) when numbers are tossed around.

    Again, hopefully this is not more blah blah blah, as it would be more fun having an exchange, discussion, interaction, learning, sharing etc.

    Ok, nuff said, over and out…

    Cheers
    Gs

    • OK, I was judging the NJIT high-tech project team competition yesterday. Spent the whole day with students. Interesting.
      So, back to the storage. The SAN does three things: a VMware cluster, an Oracle cluster and a SQL Server cluster. SQL Server predictably does up to 64K on average, VMware does somewhere around 16K, and Oracle is also well known – DB_BLOCK_SIZE x DB_FILE_MULTIBLOCK_READ_COUNT (8192 x 16 = 128K).

      As for latency: you can see latency on servers from network shortcomings or misconfiguration. I don’t really pay too much attention to latency; I usually check queues. That gives a faster view of what is going on. Technically, if the queue is empty, you don’t care about anything else.
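
      (On Linux, that queue check can be scripted by pulling the queue-length column out of iostat -x; the column name differs between sysstat versions, so this sketch looks it up from the header instead of hard-coding a position:)

import subprocess

def busy_devices(queue_threshold=1.0):
    """Flag block devices whose average queue length exceeds the threshold.

    Assumes a sysstat iostat that prints an 'avgqu-sz' (older) or 'aqu-sz' (newer) column.
    """
    out = subprocess.run(["iostat", "-x"], capture_output=True, text=True).stdout.splitlines()
    header_idx = next(i for i, line in enumerate(out) if line.startswith("Device"))
    cols = out[header_idx].split()
    qcol = next(i for i, c in enumerate(cols) if c in ("avgqu-sz", "aqu-sz"))
    busy = {}
    for line in out[header_idx + 1:]:
        fields = line.split()
        if len(fields) > qcol:
            try:
                depth = float(fields[qcol])
            except ValueError:
                continue
            if depth > queue_threshold:
                busy[fields[0]] = depth
    return busy

# Example: print(busy_devices(queue_threshold=2.0))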

      I hope I answered your questions. Just a reminder: I was hoping to get an opinion about manual SSD allocation vs. auto-tiering.

      Thanks.


  5. gregschulz / May 2, 2012

    @mrmichaelpetrov

    No worries, I have been tied up on some projects, including SSD performance items, the past few days.

    Thanks for the info.

    How many servers are accessing the SAN, and is the SAN just one storage device or multiple arrays? 64K for SQL – is that a 64K IO size, or 64K IOPS? If it is a 64K IO size, it sounds like good grouping, read/write or prefetch occurring vs. small IOs. If that is the case, it should be pretty cache friendly, either using cache cards or cache in an array (assuming the array has good cache algorithms). Is the 64K across the entire SQL instance, or for tables, indices, or journals?

    A trend I have been watching for some time now is how the average IO size for DBs is moving up into the neighborhood you mention, while most people still toss around 4K as an average DB IO size; hmmm, wasn’t that a decade or so ago ;)…

    Let me guess, the VMware 16K IOs (if that is the IO size) are also very random and fragmented (e.g. split IOs)?

    Interesting note about queues and latency. Most SSD discussions tunnel-vision in on IOPS or bandwidth; however, there is the other aspect, which is how fast those transfers or IOPS are being processed. If you are looking at queues, that is an alternative to looking at latency given how they are inter-related, which is much better than just focusing on IOPS or transfers, particularly when you correlate queues with the activity arrival rate.
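
    (For reference, that inter-relationship is just Little’s Law: average queue length = arrival rate x average time in the system, so given any two you can back out the third. Illustrative numbers below:)

def latency_from_queue(avg_queue_len, arrival_rate_iops):
    """Little's Law: N = X * R, so response time R = N / X (seconds) given the rate in IOPS."""
    return avg_queue_len / arrival_rate_iops

# e.g. an average of 8 outstanding IOs at 4,000 IOPS works out to 2 ms per IO
print(latency_from_queue(8, 4000) * 1000, "ms")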

    Per manual vs. auto tiering of or for SSD, I’m generally skeptical until I see how the auto mode works and behaves. That’s where the value of tools that can run and give a recommendation that you can then approve or override comes into play, as does gaining confidence in the movement algorithms. Now to take a slightly different angle, let’s separate the automated decision making of what to move and when from the actual movement, if that makes sense.

    In other words, the decades-old challenge with SSD is that you can use different tools to find the volume with queues (e.g. a candidate for a deeper dive) and the long-running queries or complaints, then use DB tools or other tools to find the offending file, object, index or whatever. However, IMHO the real pain was taking the disruption to move to the SSD. That is where tools that can do the movement transparently have value, either under your control or approval, or, if you have confidence, by letting them do their thing.
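
    (A sketch of that split between deciding what to move and actually moving it, with an approval step in between; all of the names, thresholds and stats here are hypothetical:)

from dataclasses import dataclass

@dataclass
class MoveRecommendation:
    volume: str
    obj: str          # file, index, LUN extent, etc.
    target_tier: str  # e.g. "ssd" or "hdd"
    reason: str

def recommend_moves(stats, queue_threshold=2.0):
    """Decision step: suggest promotions for hot objects on volumes with deep queues."""
    recs = []
    for vol, info in stats.items():
        if info["avg_queue"] > queue_threshold:
            for obj in info["hot_objects"]:
                recs.append(MoveRecommendation(vol, obj, "ssd",
                            f"avg queue {info['avg_queue']:.1f} on {vol}"))
    return recs

def apply_moves(recs, approve):
    """Movement step: only act on recommendations a human (or a trusted policy) approves."""
    for rec in recs:
        if approve(rec):
            print(f"moving {rec.obj} from {rec.volume} to {rec.target_tier} ({rec.reason})")
            # the actual data movement (datafile move, LUN migration, etc.) would go here

# Example: apply_moves(recommend_moves(stats), approve=lambda rec: True)  # or prompt a human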

    Hope that helps, be happy to elaborate more or touch on other topics.

    Btw, if you are really into analyzing performance metrics and have some Windows systems, either virtual or physical, check out HioMON (http://www.hyperio.com/). I use it now and then for testing, experiments and trying things out. For example, I did some tests before and after going from an HDD to an HHDD, plus some SSDs and other things, including profiling boots, shutdowns and normal activity at the file system as well as below the file system.

    Cheers
    Gs

    • Man, 64K is the IO size (not IOPS). You configure the OS/SQL to write 64K… and you need to format the FS with 64K chunks.
      Oracle is even better. Sometimes it jumps to 256K.

      • gregschulz / May 2, 2012

        64K IOPS, yup, it would not be a surprise in a rather “busy” environment, hence the question.

        Good to hear you are using a large IO size of 64K to group the writes and help with prefetch, as well as using 64K chunks for the FS. Have you switched, or done any experimenting with, 4K block formatting of storage (e.g. below the file system) yet? It’s still relatively new, with the shift underway from the traditional 512-byte page/block to 4K.

        Out of curiosity, with your larger IO size for the DBs, what are you using for log switch/consistency points? Do you go with longer or shorter intervals, or does that depend on the type of system, e.g. a longer interval if you are doing more reporting/analytics/queries/reading, or a shorter interval if transactional?

        Speaking of the above configuration, did you also set the underlying storage with larger RAID chunks or use the defaults?

        Btw, what type of activity are you seeing on the journals/transient objects and files vs. tables or other files? The reason I ask ties back to Roger’s comment above: I concur that the best IO is no IO, and the second-best IO has locality of reference as close to the app as possible, e.g. using cache in the server (DRAM), then a persistent flash cache card (or target), then persistent (or protected) cache in a storage system, then the target (SSD, HDD, etc.).

        gs
