Issues like wear leveling and your SAS or RAID controller will determine whether SSDs succeed in your storage networking environment.
Choosing the right solid state drive (SSD) for your enterprise storage networking environment is a critical decision, and it is equally critical to have the right controller to get the most out of these pricey drives.
The first two parts of this series covered solid state drive application and software issues and reliability and performance concerns.
In this final part, we'll cover the internal design of SSDs and the use of SAS and RAID controllers, both important choices for making the most of your solid state drives. If your application requires SSD-like performance, you need to make sure you're doing everything right to maximize your investment.
Your decisions in each of these areas will determine whether SSDs are an unqualified success in your data storage environment or just a costly addition with marginal performance improvement. Given the cost of these drives, the last thing you want is a marginal performance improvement.
Internal SSD Design
Without question, this is the most important decision that you will make: You need to look for a solid state drive that will match the performance you must deliver and the reliability you need.
The first and most important part of the decision process is understanding how much data you write and whether the writes arrive in bursts or at a sustained rate. This matters because, for a number of SSDs, bursty writes hurt performance because of wear leveling optimization: the internal bandwidth within the SSD may not be great enough to handle the incoming writes from the channel and still meet the wear leveling requirements of the firmware.
Wear Leveling: The SSD vendor's wear leveling firmware design is critical to write performance. It is important to determine whether the algorithm is designed to pre-wear level, so that extra blocks are available in advance of your writes, or whether, given the internal bandwidth, wear leveling that runs alongside your writes will not impact the SSD's performance.
Internal Bandwidth: The issue of internal bandwidth is likely the least discussed, but it is perhaps the most important consideration when choosing an SSD. The industry is moving quickly from 3 Gb/sec SAS to 6 Gb/sec SAS for SSDs, and multiple SSD vendors are claiming over 300 MB/sec write performance. With 3 Gb/sec SAS, the raw line rate is about 384 MB/sec (roughly 300 MB/sec of payload once 8b/10b encoding is accounted for), and with 6 Gb/sec SAS, about 768 MB/sec (roughly 600 MB/sec of payload). If writes are coming in at, say, 300 MB/sec, and the internal bandwidth of the SSD is only 300 MB/sec, then the wear leveling algorithm will likely have priority during writes, and write performance could be significantly degraded. In many cases, the volume of wear leveling writes is greater than the incoming write stream, and in some poor implementations I have seen a ratio of almost 4 to 1. That means that write performance might drop to 60 MB/sec while wear leveling takes 240 MB/sec. Clearly, 60 MB/sec is slower than most disk drives, especially enterprise drives, for streaming write performance. You would be surprised how many drives I have seen with this type of poor implementation, even when vendors call their drives enterprise quality.
The internal bandwidth of the SSD must take into account the external bandwidth (the channel), the maximum full duplex performance of the SSD for reading and, more importantly, writing, and the maximum bandwidth that wear leveling will use based on the algorithm and number of writes. SSD internal bandwidth is, in my opinion, the most overlooked area of SSD planning and design, and it is the most critical design consideration for write-intensive workloads.
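The 4-to-1 arithmetic above can be reproduced with a short calculation. This is only a minimal sketch under a simplifying assumption that is not from any vendor's firmware: host writes and wear leveling writes share one fixed pool of internal bandwidth, and every MB of host data triggers a fixed number of MB of wear leveling writes. The function name `host_write_rate` and its parameters are made up for illustration.

```python
def host_write_rate(internal_mb_s: float, wear_ratio: float) -> float:
    """Return the host write rate (MB/sec) an SSD can sustain.

    Assumption (illustrative only): each MB written by the host causes
    `wear_ratio` MB of wear leveling writes, and both kinds of write
    compete for the same `internal_mb_s` of internal bandwidth.
    """
    if internal_mb_s < 0 or wear_ratio < 0:
        raise ValueError("bandwidth and ratio must be non-negative")
    # host + wear_ratio * host = internal  =>  host = internal / (1 + ratio)
    return internal_mb_s / (1.0 + wear_ratio)


if __name__ == "__main__":
    internal = 300.0  # MB/sec of internal bandwidth, as in the example above
    for ratio in (0.0, 1.0, 4.0):
        host = host_write_rate(internal, ratio)
        wear = internal - host
        print(f"{ratio:.0f} to 1: host writes {host:.0f} MB/sec, "
              f"wear leveling {wear:.0f} MB/sec")
```

With 300 MB/sec of internal bandwidth and a 4-to-1 ratio, the sketch gives 60 MB/sec of host writes and 240 MB/sec of wear leveling, matching the figures in the text. Real firmware behaves less uniformly, so treat the result as a planning estimate rather than a prediction.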
Using SSDs in RAID Controllers
Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs. I am not talking about a disk tray; I am talking about the whole RAID controller. If you want the full performance of expensive SSDs, you need to take your $50,000 or $100,000 RAID controller and not overpopulate it with too many drives. In fact, most vendors today have between 16 and 60 drives in a disk tray, and you cannot even populate a whole tray. Add to this that some RAID vendors' disk trays are designed only for the performance of disk drives, and you might find that you need a disk tray per SSD at a huge cost. You might think this is an open and shut case that says don't use SSDs in RAIDs, but it's not that simple. SSDs in RAIDs have some significant management advantages in the areas of provisioning, RAID levels, sparing and failure management.
Provisioning: Many sites are not going to use a whole SSD for a single file system or application, so you need to be able to divide up the space. RAID controllers excel at this. They support LUN creation, LUN masking and a variety of other provisioning options and features and do this all within a consistent framework that you are used to for all of your other file systems.
RAID Level: Very few of you are going to use RAID-5 or RAID-6 for SSDs, as many RAID controllers cannot sustain the performance of three high-end SSDs, much less a RAID-5 4+1 or RAID-6 4+2 with five or six drives. Therefore, you are going to use RAID-1 for the most part. Most RAID controllers excel at this, and once again you are working in a framework that is consistent with your other file systems. One important note: If you are going to use a RAID level other than RAID-1, the RAID controller must have a high-speed processor, and hardware parity generation and parity checking are also needed to get as much as possible out of SSD resources.
Sparing: If a drive fails in a RAID-1 LUN, the RAID controller is well equipped to address this failure. Hot spares have been a feature of RAID controllers since the beginning. Again, having a consistent way to address the management of spares is a labor-saving advantage of using RAID.
Failure management: This is by far the biggest issue. As I wrote nearly two years ago, SMART statistics for SSDs are a big issue. If an SSD is in a RAID array, you know that the RAID vendor has worked with the SSD vendor to understand the non-standard SMART statistics that the SSD is providing, as there is no standard SMART SSD framework. Knowing that the SSD is being managed for potential failure will give you peace of mind when you consider the critical importance placed on SSDs in the storage hierarchy.
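To make the RAID Level point about parity generation and parity checking concrete, here is a minimal sketch of single-parity (RAID-5 style) XOR over one 4+1 stripe. It is written in software purely for illustration; the function names are hypothetical, and real controllers do this work in dedicated hardware and rotate parity across drives, which this sketch ignores.

```python
from functools import reduce
from operator import xor


def xor_parity(chunks: list[bytes]) -> bytes:
    """Generate the parity chunk as the byte-wise XOR of equal-sized chunks."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if any(len(c) != len(chunks[0]) for c in chunks):
        raise ValueError("all chunks in a stripe must be the same size")
    # zip(*chunks) walks the stripe column by column, one byte per drive.
    return bytes(reduce(xor, column) for column in zip(*chunks))


def parity_ok(data: list[bytes], parity: bytes) -> bool:
    """Parity check on read: recompute parity and compare with what is stored."""
    return xor_parity(data) == parity


def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Reconstruct the one missing data chunk from the survivors plus parity."""
    return xor_parity(surviving + [parity])


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x04\x08", b"\x10\x20", b"\x40\x80"]  # 4 data chunks
    parity = xor_parity(stripe)
    print(parity.hex())
    print(parity_ok(stripe, parity))
    print(rebuild(stripe[1:], parity) == stripe[0])
```

Every write has to touch every byte of the stripe to produce parity, and a parity check on read repeats that work. At SSD speeds, that per-byte cost is why the text calls for a fast processor and hardware parity support.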
Using SSDs with SAS Controller Cards
Given the pitfalls of putting SSDs in RAID controllers, the other option is to connect SSDs to your system using SAS controller cards installed in a PCIe slot. The latest crop of SAS controller cards can support around the same number of IOPS as RAID controllers at a much lower cost, but with every good technology comes some bad, as nothing is perfect. These new SAS cards might be able to do the same number of IOPS, but they lack most of the features that you get in a high-end RAID controller. Going through the same list as above:
- Provisioning: SAS controller cards can do this, but the software is not as extensive and flexible.
- RAID Level: SAS cards can do RAID-1, but using RAID-5 or RAID-6 with SSDs likely exceeds the onboard processor's capability of generating parity. I am aware of no SAS controller cards that validate parity on read.
- Sparing: The SAS card generally can manage this, but again the software is not as robust.
- Failure management: This is the biggest and most critical issue. RAID vendors spend months validating any drive, SSD or disk, that is sold as part of their system. As part of this validation, the whole issue of SMART data collection, proactive sparing of drives that are expected to fail, and standardized thresholding for returning of failed drives is all taken care of for you.
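Whichever controller you choose, the sizing question raised above comes down to an IOPS budget. The sketch below shows that arithmetic only; the IOPS figures are hypothetical placeholders, not measurements of any product, and the function names are invented for illustration. It also assumes the controller's IOPS ceiling is shared evenly across drives, which real controllers may not do.

```python
def ssds_at_full_speed(controller_iops: int, ssd_iops: int) -> int:
    """How many SSDs the controller can drive before it becomes the bottleneck."""
    if controller_iops < 0 or ssd_iops <= 0:
        raise ValueError("controller_iops must be >= 0 and ssd_iops > 0")
    return controller_iops // ssd_iops


def per_drive_iops(controller_iops: int, ssd_iops: int, drives: int) -> float:
    """IOPS each SSD actually delivers when `drives` share one controller."""
    if drives <= 0:
        raise ValueError("drives must be > 0")
    # A drive can never exceed its own rating, nor its share of the controller.
    return min(float(ssd_iops), controller_iops / drives)


if __name__ == "__main__":
    CONTROLLER_IOPS = 100_000  # hypothetical controller ceiling
    SSD_IOPS = 40_000          # hypothetical rating of one fast SSD
    print(ssds_at_full_speed(CONTROLLER_IOPS, SSD_IOPS))
    for drives in (2, 3, 16):
        print(drives, per_drive_iops(CONTROLLER_IOPS, SSD_IOPS, drives))
```

With these placeholder numbers, a third SSD already pushes each drive below its rated IOPS, and a full 16-drive tray leaves each SSD with a small fraction of what you paid for. Substitute measured figures for your own controller and drives before drawing conclusions.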
Many organizations will be using SSDs by the end of this year. The problem is that the storage infrastructure from the file system to the device is not designed for SSDs, which means that you are going to have to make some hard choices about what is important when implementing an SSD strategy. The goal of this three-part series was to provide you an understanding of some of the issues that need to be considered. There are no easy or simple choices; just make the best ones for your environment.
Henry Newman, CTO of Instrumental Inc. and a regular Enterprise Storage Forum contributor, is an industry consultant with 28 years of experience in high-performance computing and storage.