Key-value is a fundamental data representation that is invading the storage world beyond just archive or other cold storage. With new technology based on key-value, it's quickly coming to higher performing storage. Is key-value storage in your future?
In June I went to the 30th International Conference on Massive Storage Systems and Technology (MSST 2014) and over the 4 days I was there I saw a theme rise to the top that I thought was quite interesting. Two years ago I barely heard the subject raised at the MSST conference, but at MSST 2014 I saw it mentioned several times a day. The subject is key-value storage.
Key-value is a fundamental data representation or data structure. It is very simple, consisting of (key, value) pairs in which each key is unique within the data set. If you want to retrieve the value associated with a key, all you need is the key itself. The number of operations used with key-value pairs is fairly small:
- Add: Add a pair to the collection of data
- Remove: Remove a pair from the collection of data
- Reassign: Change or reassign the value associated with a key
- Get: Get a value associated with a key
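The four operations above can be sketched in a few lines of Python. This is only an in-memory illustration built on a dictionary; the class name `KeyValueStore` and its method names are made up for this example and do not correspond to any particular product's API.

```python
class KeyValueStore:
    """In-memory illustration of the four basic key-value operations."""

    def __init__(self):
        self._pairs = {}

    def add(self, key, value):
        # Keys are unique, so adding an existing key is treated as an error.
        if key in self._pairs:
            raise KeyError(f"key already exists: {key!r}")
        self._pairs[key] = value

    def remove(self, key):
        # Raises KeyError if the key is not present.
        del self._pairs[key]

    def reassign(self, key, value):
        # Only an existing key can have its value changed.
        if key not in self._pairs:
            raise KeyError(f"no such key: {key!r}")
        self._pairs[key] = value

    def get(self, key):
        # Raises KeyError if the key is not present.
        return self._pairs[key]


store = KeyValueStore()
store.add("report.txt", b"first draft")
store.reassign("report.txt", b"final draft")
print(store.get("report.txt"))  # prints b'final draft'
store.remove("report.txt")
```

Many real systems fold Add and Reassign into a single "put" operation; they are kept separate here only to mirror the list.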
If you've ever examined object-based storage (or file systems), then these operations should look very familiar to you. In fact, object storage systems such as Amazon's S3, OpenStack's Swift, Caringo, and others use these same basic operations. These approaches to file systems have been around for a few years, so they're not exactly brand new.
So why was key-value storage such a hot topic at MSST 2014? The short answer is that it is simple, and some new technologies make key-value storage even easier to use and perhaps applicable to faster storage than it has been used for previously.
From my observation, key-value storage is becoming the back-end not just for object-based storage but for a great many file systems, including ones that traditionally may have used block storage. The reason is that key-value storage is very simple. But being simple doesn't always push adoption of a new technology. Seagate has advanced this technology with the development of key-value based, Ethernet-attached hard drives. They are an example of where key-value storage is headed.
Seagate Kinetic Drive
I'm not sure if everyone has seen the new Seagate Kinetic drive technology, but many people are talking about it and demonstrating what can be done with it. To create the Kinetic drive, Seagate took the basic hard drive, removed the SAS or SATA interface, and replaced it with two Ethernet interfaces and a simple processor. There are some power connection changes as well, but the form factor of the Kinetic drive is exactly the same as that of regular SAS drives.
The benefit of replacing the SAS interface with an Ethernet interface is that it strips away all of the intermediate layers between the application and the drive itself. These intermediate layers include the POSIX function calls, the file system, the volume manager, drivers, and the storage server, which can have RAID controllers, caches, the SAS controller, and on and on.
With the Kinetic drive, the application talks to a library layer that a developer creates and that takes the place of the file system. The library then talks directly to the Kinetic drive using TCP/IP. This greatly reduces the IO latency between the application and the actual storage. But how do you use the drive now that it only has an Ethernet interface?
Seagate has turned the Kinetic drive into a key-value pair storage device with an interface that you can access via several programming languages including Java, C++, and Python. These interfaces have just a few simple client APIs: PUT, GET, and DELETE.
Do these functions look familiar? They should. The Kinetic drive has some other commands to help out the developer as well.
The drives also have an administrator API so that they can be managed and monitored. This includes setup features for the drive, security, and the ability to get logs from the drive. To use the Kinetic drive you have to issue commands to it in terms of PUT, GET, and DELETE, basic key-value functions.
While Seagate won't talk too much about what exactly is inside the drive, it is basically a simple key-value database and presumably some simple OS. Under the covers the drive does the space mapping for you, including any garbage collection that needs to be done. Keys can range from 1 byte to 4 KiB in size, and values from 0 bytes to 1 MiB. Each drive can have multiple masters from an authentication and authorization standpoint.
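A library that sits in front of such a drive would probably want to check these size limits before sending anything over the network. The sketch below assumes only the limits quoted above; the function name `check_pair` and the constants are hypothetical and are not part of any Seagate library.

```python
MAX_KEY_BYTES = 4 * 1024        # 4 KiB, the upper key limit quoted above
MAX_VALUE_BYTES = 1024 * 1024   # 1 MiB, the upper value limit quoted above


def check_pair(key: bytes, value: bytes) -> None:
    """Raise ValueError if the pair falls outside the quoted size limits."""
    if not 1 <= len(key) <= MAX_KEY_BYTES:
        raise ValueError(
            f"key must be 1 to {MAX_KEY_BYTES} bytes, got {len(key)}")
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(
            f"value must be at most {MAX_VALUE_BYTES} bytes, got {len(value)}")


check_pair(b"logs/2014-06-02", b"")         # an empty value is allowed
check_pair(b"k", b"x" * MAX_VALUE_BYTES)    # exactly 1 MiB is allowed
```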
File systems can be written or adapted to use these drives as the storage back-end via open source programming libraries to interface with the drives. Alternatively, a fairly simple IO library can be written so that applications can do IO to/from the drives. The design of the application interface is up to the developer(s).
As a starting point, the file system will need to break files larger than 1 MiB into 1 MiB chunks. All of the chunks can be sent to the same drive, spread across multiple drives, or spread around with copies of each chunk. You can do almost anything you want, including replication, snapshots, striping, RAID, or just about anything else you can imagine. You can write the IO interface using the Kinetic libraries and the simple key-value commands.
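As a rough illustration of the chunking and placement choices described above, here is a minimal sketch that splits data into 1 MiB chunks and stripes them round-robin across several drives, optionally keeping more than one copy of each chunk. Plain Python dictionaries stand in for the drives; the function names and the `name/index` key layout are assumptions made for this example, not anything defined by the Kinetic libraries.

```python
CHUNK_BYTES = 1024 * 1024  # 1 MiB, the largest value a drive accepts per the text


def split_into_chunks(data: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Return the data as a list of pieces no larger than chunk_bytes."""
    return [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]


def write_file(drives, name: str, data: bytes, copies: int = 1) -> int:
    """Stripe chunks round-robin across drives, keeping `copies` of each chunk.

    `drives` is a list of dicts standing in for key-value drives.
    Returns the number of chunks written.
    """
    if not 1 <= copies <= len(drives):
        raise ValueError("copies must be between 1 and the number of drives")
    chunks = split_into_chunks(data)
    for index, chunk in enumerate(chunks):
        key = f"{name}/{index:08d}".encode()
        for replica in range(copies):
            # Consecutive drives hold the replicas of one chunk.
            drives[(index + replica) % len(drives)][key] = chunk
    return len(chunks)


def read_file(drives, name: str, chunk_count: int) -> bytes:
    """Reassemble a file, taking each chunk from the first drive that has it."""
    parts = []
    for index in range(chunk_count):
        key = f"{name}/{index:08d}".encode()
        holder = next(d for d in drives if key in d)
        parts.append(holder[key])
    return b"".join(parts)


drives = [{}, {}, {}]                    # three hypothetical drives
payload = bytes(range(256)) * 10240      # 2.5 MiB of sample data
count = write_file(drives, "video.bin", payload, copies=2)
assert read_file(drives, "video.bin", count) == payload
```

With two copies of each chunk placed on neighboring drives, the data in this toy layout can still be read back if any single drive is lost. A real design would also have to record the chunk count somewhere, which is a metadata problem.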
The current Kinetic drive has two Gigabit Ethernet interfaces. The performance numbers that Seagate presented at SNIA in April showed sequential read and write performance of about 50 MB/s. The random write speed is also about 50 MB/s, but the random read rate is 1.2x slower than traditional drives. So the performance is comparable to what you might see from a regular SAS drive.
Some file systems have already been demonstrated using Kinetic drives as a back-end. For example, SwiftStack demonstrated a Kinetic-backed version of Swift at the OpenStack conference in May of 2014. It is targeted at slower, cheaper storage that might be used for archiving data. The Fred Hutchinson Cancer Institute has deployed this Swift-Kinetic solution in production with great success.
Key-value storage limitations
Key-value storage seems pretty good, doesn't it? With a key you can easily look up the corresponding value, delete the key-value pair, or replace the value in the key-value pair. The Seagate Kinetic drive does exactly this. The developer writes a library or layer that retrieves the requested data (read), writes the requested data (write), or erases the data (delete).
But storing data also requires metadata. There are many ways to do metadata operations, including storing the mapping from the file to the keys as part of the data in the Kinetic drives or perhaps in a separate database. The developer(s) write all of the functions and operations into an IO library.
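One way to picture the first option, storing the file-to-keys mapping alongside the data, is a small manifest kept under its own key. The sketch below is only an illustration: a dictionary stands in for the drive (or a separate database), and the `meta/` and `data/` key prefixes, the JSON layout, and the function names are all assumptions made for this example.

```python
import json


def save_manifest(store: dict, filename: str, chunk_keys: list, size: int) -> None:
    """Store the file-to-keys mapping as a JSON value under a metadata key."""
    manifest = {"size": size, "chunks": chunk_keys}
    store[f"meta/{filename}"] = json.dumps(manifest).encode()


def load_manifest(store: dict, filename: str) -> dict:
    """Fetch and decode the manifest for a file; raises KeyError if missing."""
    return json.loads(store[f"meta/{filename}"].decode())


store = {}  # stands in for a key-value drive or a separate metadata database
store["data/notes.txt/0"] = b"hello "
store["data/notes.txt/1"] = b"world"
save_manifest(store, "notes.txt",
              ["data/notes.txt/0", "data/notes.txt/1"], size=11)

manifest = load_manifest(store, "notes.txt")
contents = b"".join(store[key] for key in manifest["chunks"])
print(contents)  # prints b'hello world'
```

Reading a file then takes one lookup for the manifest followed by one lookup per chunk, which is part of the cost of doing metadata in the library rather than in a file system.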
This sounds very easy, but the IO portions of the application will have to be rewritten to use the specific IO library that talks to the Kinetic drives. Key-value storage is not POSIX compliant, which means that either applications will have to adopt a new storage interface or the library layer will have to be written to accommodate POSIX IO function calls.
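To give a feel for the second option, here is a minimal sketch of a library layer that presents a familiar file-like interface (read, write, seek, close) on top of a key-value store. It is far from POSIX compliant: it loads the whole value into memory on open, writes it back on close, and ignores locking, permissions, and values larger than a single drive would accept. The class name `KeyValueFile` is hypothetical, and a dictionary stands in for the drive.

```python
import io


class KeyValueFile(io.BytesIO):
    """File-like object that loads a value on open and writes it back on close."""

    def __init__(self, store: dict, key: str):
        # Start from the existing value, or from an empty file if the key is new.
        super().__init__(store.get(key, b""))
        self._store = store
        self._key = key

    def close(self):
        # Persist the buffer to the store once, just before closing.
        if not self.closed:
            self._store[self._key] = self.getvalue()
        super().close()


store = {}  # stands in for a key-value drive
with KeyValueFile(store, "notes.txt") as f:
    f.write(b"hello world")

with KeyValueFile(store, "notes.txt") as f:
    f.seek(6)
    print(f.read())  # prints b'world'
```

An application that already works with file objects could use something like this with few changes, while an application that calls the operating system's POSIX functions directly would still need a deeper adaptation layer.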