Dumb and Dumber Switches?
In a recent trade press article, Jacob Farmer of Cambridge Computer Systems defiantly declared his opposition to the industry trend towards intelligence in fabric switches. In taking this contrarian position, Jacob has braced himself against a backlash of opposing views, but so far there has been little public response by switch vendors or customers to his valid points.
Jacob’s essential argument is that fabric switches should simply switch, efficiently moving data from A to B, without the encumbrances of additional storage processing. Like LAN and WAN switching technology, the role of fabric switches in his view is to provide high performance, low latency transport. Adding storage management or virtualization functionality to fabric switches increases both the complexity and cost of products that, according to Jacob, should be as simple to manage and support as possible.
The desire for simple, efficient, and interoperable SAN switches is understandable. Ethernet switches, for example, may offer basic quality of service (QoS), VLAN, or flow control options to optimize data transport, but are generally transparent to the end devices they serve. By contrast, much of the complexity attributed to SANs stems from switch-related issues: interoperability problems with HBAs and disk arrays, and a legacy of interoperability problems between different vendors’ switch products.
As Jacob rightly states, transparency, simplicity, and interoperability are assumed in the LAN world. No one expects a horde of field service engineers to descend if a customer plugs a Foundry BigIron into a Cisco Catalyst. Customers do, however, have to exercise caution when connecting fabric switches together, even if the switches are from the same vendor. Microcode levels must be compatible, interoperability modes must be set, and enhanced features must be sacrificed when crossing vendor lines.
Why Such Complexity?
This complexity, though, is not the result of simple sloth on the part of vendors. Even the notorious fabric interoperability issues of the past were not entirely due to underhanded competitive maneuverings (although a few switch vendors crafted subversion of interoperability into something of an art form).
Despite Jacob’s longing for simpler times, the painful fact is that fabric switches are necessarily orders of magnitude more complex than their LAN cousins. By the nature of the devices it connects, storage networking demands more intelligence in the SAN.
In a LAN environment, intelligence resides in the end systems or hosts, which are typically computer platforms such as servers or workstations. Through end-to-end protocols such as TCP, the hosts are responsible for establishing and maintaining sessions with their peers on the far side of the network. The network itself, composed of Ethernet switches, IP routers, or switched optical infrastructure, is primarily responsible for expeditiously moving data from source to destination. A workstation, for example, is not required to log on to an Ethernet switch to register its presence.
In a SAN environment, by contrast, an end device such as a JBOD may be relatively dumb. Storage targets in general are passive participants in the SAN, waiting for active intervention by an initiator (server) to establish communications across the SAN. Fabric switches must accommodate the disparity of intelligence on the end systems by providing logon services, simple name server (SNS) registration, and registered state change notification (RSCN) broadcasts to signal changes in the SAN. In the early days of Fibre Channel fabrics, these basic intelligence services were the focus of interoperability testing and debugging. The goal was simply to ensure that the attached servers and disk assets obeyed common standards of behavior and could communicate properly with the fabric.
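To make the division of labor concrete, here is a minimal Python sketch of the three switch-resident services named above: logon, name server registration, and state change notification. It is a toy model under loud assumptions, not an implementation of any Fibre Channel standard or vendor product. The class `ToyFabric`, its method names, and the example port names are all hypothetical, and addresses are simply handed out in order within a single domain.

```python
class ToyFabric:
    """Hypothetical single-switch fabric; every name here is illustrative."""

    def __init__(self, domain_id=1):
        self.domain_id = domain_id
        self.next_port = 0
        self.name_server = {}  # address -> {"wwpn": ..., "role": ...}
        self.inboxes = {}      # address -> notifications for subscribed ports

    def login(self, wwpn):
        """Logon service: hand the device a 24-bit fabric address."""
        address = (self.domain_id << 16) | self.next_port
        self.next_port += 1
        return address

    def register(self, address, wwpn, role):
        """Name server registration, followed by a state change notice."""
        self.name_server[address] = {"wwpn": wwpn, "role": role}
        self._notify(address, "online")

    def subscribe(self, address):
        """Ask the fabric to report state changes to this port."""
        self.inboxes.setdefault(address, [])

    def query(self, role):
        """Name server lookup, e.g. an initiator asking for targets."""
        return sorted(a for a, e in self.name_server.items() if e["role"] == role)

    def logout(self, address):
        """Remove a device and tell the subscribed ports about it."""
        if self.name_server.pop(address, None) is not None:
            self._notify(address, "offline")

    def _notify(self, changed, event):
        # Deliver the event to every subscriber except the port that changed.
        for address, inbox in self.inboxes.items():
            if address != changed:
                inbox.append((changed, event))


fabric = ToyFabric(domain_id=1)

# The server (initiator) logs on, registers, and subscribes to changes.
server = fabric.login("10:00:00:00:c9:00:00:01")
fabric.register(server, "10:00:00:00:c9:00:00:01", "initiator")
fabric.subscribe(server)

# The disk (target) only logs on and registers; it never goes looking.
disk = fabric.login("21:00:00:20:37:00:00:02")
fabric.register(disk, "21:00:00:20:37:00:00:02", "target")

targets = fabric.query("target")  # the server discovers storage via the fabric
fabric.logout(disk)               # the server is told the disk went away
```

Note that the disk does nothing beyond announcing itself. Discovery and change tracking happen in the fabric on behalf of the initiator, which is exactly the kind of logic an Ethernet switch never has to carry.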
Even in a single-vendor, single-switch environment, fabric switches therefore require more sophisticated logic than LAN switches. Additional layers of complexity appear when fabric switches are connected to build multi-switch SANs.
The original architects of Fibre Channel decided that fabrics should be self-configuring. Consequently, in addition to fabric-based intelligence in the form of logon services, SNS registration, and RSCNs, fabric switches require intelligence to perform fabric building and to exchange SNS data, zoning information, and routing tables.
Fabric building, for example, involves an intensive exchange between multiple switches to determine which will be the principal switch responsible for allocating unique 64K address blocks to the other switches. If the principal switch leaves the fabric, or if an operational switch is inadvertently inserted into an established fabric, switch-to-switch protocols trigger a fabric reconfiguration process.
This may require the suspension of ongoing storage transactions, logging off servers and arrays, and/or reallocating addresses to the end devices. Successfully monitoring this procedure is no trivial task, especially when tens of switches are involved. Likewise, melding zoning information and SNS entries between multiple switches requires processing overhead to ensure that the proper devices are mapped across inter-switch links.
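The principal switch selection and address allocation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual switch-to-switch protocol. It assumes that the switch with the lowest priority value wins, with the lowest worldwide name (WWN) breaking ties, and that domain IDs are handed out sequentially starting at 1. Real fabrics negotiate these details through protocol exchanges, and the switch names, priorities, and WWNs below are made up. The one piece of firm arithmetic is the block size: a 24-bit address with an 8-bit domain leaves 16 bits, or 65,536 addresses (64K), per switch.

```python
def select_principal(switches):
    """Assumed rule: lowest priority value wins, lowest WWN breaks ties."""
    return min(switches, key=lambda s: (s["priority"], s["wwn"]))


def address_block(domain_id):
    """First and last 24-bit address in a domain's 64K block."""
    first = domain_id << 16
    return first, first | 0xFFFF


def assign_blocks(switches):
    """Principal takes domain 1; the rest follow in WWN order (an assumption)."""
    principal = select_principal(switches)
    others = sorted((s for s in switches if s is not principal),
                    key=lambda s: s["wwn"])
    ordered = [principal] + others
    return {s["name"]: address_block(domain)
            for domain, s in enumerate(ordered, start=1)}


# Hypothetical three-switch fabric.
switches = [
    {"name": "sw-a", "priority": 128, "wwn": "10:00:00:60:69:00:00:0a"},
    {"name": "sw-b", "priority": 2,   "wwn": "10:00:00:60:69:00:00:0b"},
    {"name": "sw-c", "priority": 128, "wwn": "10:00:00:60:69:00:00:03"},
]

principal = select_principal(switches)
blocks = assign_blocks(switches)

for name, (first, last) in blocks.items():
    print(f"{name}: {first:06x}-{last:06x} ({last - first + 1} addresses)")
```

Remove the principal switch from the list and rerun the assignment, and every remaining switch can end up with a different block. That is the toy version of why a fabric reconfiguration may force end devices to log off and pick up new addresses.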
Simplicity (of Operation) and Complexity Coexisting in Harmony?
The inherent complexity of fabric switches does not automatically call for even more intelligence in the form of storage management or virtualization. Fabric-based intelligence is, however, a fact of life that cannot be wished away in hopes of making SANs as simple as LANs. The processing power required for fabric services and inter-switch communications has already been expanded to include additional storage features such as zoning, LUN masking, and third-party copy.
The assumption of additional value-added services by the fabric such as storage pooling, heterogeneous data replication, and snapshots is just a matter of engineering investment and time. While this adds significant complexity to the design and management of fabric switches, the real problem is not the application of intelligence to switches, but the intelligent application of advanced features to switches. More intelligence is needed in the design, implementation, and automation of storage processes in the fabric so that smart switches truly off-load tedious tasks from the user.
The valid core of Jacob’s argument is that fabric switches should be simple to operate. As in the evolution of all computer technology, however, simplifying operations does not mean that complexity goes away. It just means that the complexity behind the scenes is hidden from the end user, thanks to heaping helpings of intelligence that have been applied to the underlying architecture and product design.
A graphical interface, by analogy, greatly simplifies otherwise laborious command-line instructions. Implementing a GUI adds complexity to the operating system and applications, but that underlying intelligence is transparent to the operator doing the clicking and dragging.
The challenge for fabric switch vendors who embed more and more intelligence in their products is to simplify and streamline the user interface, automate management tasks, and provide proactive monitoring and diagnostics to automatically correct impending problems. Smart design will drive truly useful smart switches into the market, and preempt the Luddite backlash that will inevitably occur if insufficient intelligence is applied.
Tom Clark
Director, Solutions and Technologies, McDATA Corporation
Author: Designing Storage Area Networks, Second Edition (2003); IP SANs (2002)