Recently, I have been thinking about the design of past, current and future storage platforms. It is pretty clear to me and lots of other people that our industry is quickly moving away from SAN storage and local file systems to storage-based appliances.
Thinking about this phenomenon, I have asked a few of my friends, "Are we having to dumb down storage architectures because we do not have the storage talent to manage the complexity we have, or are we moving to the appliance model because it is the natural progression?"
In my mind, this is a classic "which came first, the chicken or the egg" sort of question.
A Look Back at SAN Dominance
Oh, how the world has changed from seven years ago when SAN dominated the storage environment.
The SAN revolution started with the introduction of fibre channel around 1997. It continued with little competition until about 2007, when 10 Gbit Ethernet hit the market and took a big bite out of the SAN market.
During this time, there were major changes in the market. Linux became dominant in many areas, yet there was little progress in file system development. Why was progress so limited? Was it because Linux was free, file systems are very difficult to develop, and no one wanted to pay for a file system?
NAS and other storage appliances allowed for simplified management of large storage environments. The file system, storage and management were all combined into a single framework.
In a number of large organizations, SAN management and file system management were handled by different groups. I frequently saw lots of infighting between these groups, but the big issue was integration. The SAN management people often did not tell the file system group about the underlying storage architecture, so the file system group often created a file system that was not optimized for the storage.
Were the LUNs 7+1 RAID-5 (often used with some enterprise RAIDs), which, with seven data drives, of course does not line up with power-of-two application allocations? Or were they 4+1 RAID-5? What should the stripe sizes be set to? What should the file system allocation be to match the stripe sizes and the underlying storage architecture? All of this was very confusing for most sites and significantly impacted performance.
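To make the alignment question concrete, here is a minimal sketch of the arithmetic, not a tuning tool. The 64 KiB per-drive chunk size and the 1 MiB file system allocation are assumptions chosen for illustration, and the function names are hypothetical. It checks whether one allocation is a whole number of full RAID-5 stripes, counting data drives only.

```python
KIB = 1024

def full_stripe_bytes(data_disks: int, chunk_bytes: int) -> int:
    """Bytes of data in one full RAID-5 stripe (parity drive excluded)."""
    return data_disks * chunk_bytes

def is_aligned(allocation_bytes: int, stripe_bytes: int) -> bool:
    """True when an allocation covers a whole number of full stripes."""
    return allocation_bytes % stripe_bytes == 0

CHUNK = 64 * KIB          # assumed per-drive chunk size
ALLOCATION = 1024 * KIB   # assumed file system allocation (1 MiB)

if __name__ == "__main__":
    for label, data_disks in (("4+1", 4), ("7+1", 7)):
        stripe = full_stripe_bytes(data_disks, CHUNK)
        print(f"{label}: full stripe = {stripe // KIB} KiB, "
              f"1 MiB allocation aligned: {is_aligned(ALLOCATION, stripe)}")
```

Under these assumptions the 4+1 layout has a 256 KiB full stripe, which divides a 1 MiB allocation evenly, while the 7+1 layout has a 448 KiB full stripe, which does not. A misaligned allocation like that typically ends in a partial-stripe write.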
Even if the groups worked together or everyone was in a single group, there was a big learning curve for tuning each RAID device, file system, network and architecture. Since these were often different parts from different vendors, staff had lots of different training classes to go to. That, of course, cost money and time.
In addition, organizations had limited performance tools that could tell them what the problems were and which mistakes were made. It was very difficult to get everything optimally configured without spending a pile of money.
SAN vendors making good margins might put people on-site. Or organizations paid consultants to fix performance problems (I know, as we did a good amount of work doing this). In the end, the costs of SANs were far more than just the cost of the hardware.
At the same time, NAS appliances were getting faster and easier to configure and use. And many studies touted their lower cost of ownership.
Before 10 Gb Ethernet, there was only 1 Gb Ethernet, which was pretty slow compared to SAN channels. Then came 10 Gb Ethernet, which was faster than the fastest SAN channel of the time, 8 Gbit fibre channel. It remained the fastest option until mid-2012, when 16 Gb fibre channel was released at about the same time as PCIe 3.0 servers hit the market. Of course, 10 Gb Ethernet with NAS protocols does have more overhead than SAN (SCSI), but it is fast enough and scales well enough compared to SANs.
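The ordering of these link speeds can be checked with a little arithmetic. The sketch below is illustrative only: the line rates and encodings are the commonly quoted figures for each standard as I recall them and should be treated as assumptions to verify against the specifications. It computes the payload rate left after line encoding and ignores all protocol overhead (TCP/IP, NFS, SCSI framing).

```python
# (name, line rate in Gbaud, payload bits, total bits per encoded block)
# Figures are commonly quoted values, listed here as assumptions.
LINKS = [
    ("1 Gb Ethernet", 1.25, 8, 10),
    ("8 Gbit fibre channel", 8.5, 8, 10),
    ("10 Gb Ethernet", 10.3125, 64, 66),
    ("16 Gb fibre channel", 14.025, 64, 66),
]

def payload_gbit_per_s(line_rate_gbaud: float, payload_bits: int, total_bits: int) -> float:
    """Data rate after removing line-encoding overhead."""
    return line_rate_gbaud * payload_bits / total_bits

def rank_links(links):
    """Return link names ordered from slowest to fastest payload rate."""
    return [name for name, *rest in sorted(links, key=lambda l: payload_gbit_per_s(*l[1:]))]

if __name__ == "__main__":
    for name, rate, payload, total in LINKS:
        print(f"{name}: {payload_gbit_per_s(rate, payload, total):.1f} Gbit/s payload")
```

With these figures, 10 Gb Ethernet lands above 8 Gbit fibre channel and below 16 Gb fibre channel, which matches the sequence described above.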
2012: A Year of Change
As I look back on 2012, I see it as a year in which lots of things started to change in storage. There was clearly an emerging trend toward application-specific appliances. Hadoop appliances, correlation appliances, other big data analysis appliances and large parallel file system appliances all showed significant market growth and attracted a significant number of new vendors.
Most of these appliances with built-in applications have limited tuning parameters, as they are already optimized for the underlying storage infrastructure and application design. This is not to say that there are no knobs that could be turned to improve performance, but the range of knobs is limited. The appliances get good performance out of the box, and the integration is already done for you. This means that people do not need as much storage knowledge to operate these new storage appliances.