The Promise of LTFS

Howard Marks

April 5, 2012

A bit more than a year ago, the folks behind the LTO tape format standards, primarily IBM, with some contributions from HP and Quantum, added the Linear Tape File System (LTFS) to LTO's feature list. While some niche markets, primarily the media and entertainment business, have adopted LTFS, it won't live up to its promise without support from archiving and eDiscovery vendors.

LTFS divides a tape into two partitions: one that holds the file system metadata and another that holds the file data. With a little software on a server, an LTFS tape looks like a disk, and any application can write files to the tape just as it would write to a disk. LTFS isn't the first attempt to make tape look like disk; I remember the Backup Exec group at Seagate Software showing me a tape file system in the '90s.
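
As a rough illustration of what "write files to a tape just like a disk" means in practice, here is a minimal Python sketch. It assumes the tape has already been formatted for LTFS and mounted by an LTFS driver; the mount point and file paths are hypothetical placeholders, not part of any particular vendor's product.

```python
import os
import shutil

# Hypothetical mount point where the LTFS driver exposes the tape
# as an ordinary file system.
LTFS_MOUNT = "/mnt/ltfs"

# Copy an archive file onto the tape exactly as you would copy it to disk.
src = "/data/archive/claims-2011.tar"              # placeholder source file
dst = os.path.join(LTFS_MOUNT, "claims-2011.tar")
shutil.copy2(src, dst)

# Ordinary directory and metadata calls work, too -- to the application,
# the tape is just another volume.
for name in os.listdir(LTFS_MOUNT):
    info = os.stat(os.path.join(LTFS_MOUNT, name))
    print(f"{name}\t{info.st_size} bytes")
```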

The difference is that LTFS is a standard, making LTFS tapes a standardized data storage and exchange medium. Now, you and I have switched from mailing floppy disks and DVD-Rs to using Dropbox, SugarSync and YouSendIt, but when you need to move many gigabytes of data from one place to another, it's hard to beat the effective bandwidth of a box of tapes.

A box of 20 LTO-5 tapes holding 24TB of data will take roughly 12 hours to get from New York to San Francisco via overnight courier. That works out to an effective transfer rate of 2TB/hr, or about 4.4Gbps. If we allow another 12 hours to spool the data to tape, which is about how long it would take to move it from a disk stage to tape using a six-drive tape library, the effective bandwidth is still 2.2Gbps.
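
For readers who want to check the arithmetic, here is a quick back-of-the-envelope sketch in Python using the figures above (and treating a terabyte as 10^12 bytes):

```python
# Sneakernet math: 20 LTO-5 tapes holding 24TB, shipped overnight.
TB = 10**12                      # bytes per terabyte (decimal)

payload_bytes = 24 * TB          # data on the box of tapes
transit_hours = 12               # overnight courier, New York to San Francisco
spool_hours = 12                 # time to write the data to tape first

def gbps(nbytes: float, hours: float) -> float:
    """Effective throughput in gigabits per second."""
    return nbytes * 8 / (hours * 3600) / 1e9

print(f"Courier only:   {gbps(payload_bytes, transit_hours):.1f} Gbps")                # ~4.4 Gbps
print(f"Spool and ship: {gbps(payload_bytes, transit_hours + spool_hours):.1f} Gbps")  # ~2.2 Gbps
```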

Even if you were getting 20:1 data reduction through deduplication and compression, you'd need a link of roughly 100Mbps just to match the bandwidth of that small box of tapes when replicating that amount of data across a network. Twenty-to-one data reduction may be achievable for backup data, but archives don't have nearly as much duplicate data as backup repositories, since an archive keeps just one copy of each data object. Archives of rich media, be they check images, photos from insurance claims adjusters' digital cameras, or medical images, don't reduce much at all, making that FedEx box even more attractive.
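
Here's the same sort of sketch for the network comparison. The 2.2Gbps figure is the spool-and-ship number from above, the 20:1 ratio is the one quoted for deduplicated backups, and the other ratios are purely illustrative assumptions for less reducible archive data:

```python
# What WAN link would match the box of tapes, at various data reduction ratios?
effective_tape_gbps = 2.2        # spool-and-ship figure from the calculation above

for reduction in (20, 5, 1):     # 20:1 for backups; archives and rich media reduce far less
    required_mbps = effective_tape_gbps / reduction * 1000
    print(f"{reduction:>2}:1 reduction -> need roughly {required_mbps:.0f} Mbps")
```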

Without LTFS, you'd have to be running the same backup application at both sites to send data via tape, because each backup or archiving application writes data to tape in its own proprietary format.

In addition to providing a standard interchange format, LTFS promises the big advantage of keeping data in a standard format over a long period of time. If your archiving program stores each object it archives as a native file in an LTFS file system, you're not dependent on a single vendor for the data mover, indexer, search engine and litigation hold functions. If your archiving vendor discontinues your current product, as EMC did with DiskXtender a few years ago, you can switch to another product and have it index the existing data without having to regurgitate it to disk and ingest it into a new archive. If you have trouble locating data, you could point a Google Search Appliance at the LTFS repository and use Google search to find the relevant data.

We as customers should start pressuring our archiving vendors to support native LTFS as a repository option. Some vendors will respond that they already support LTFS, since they support any NAS storage, but most archiving solutions store their data on disk in proprietary container files. While compressed and single-instanced containers may have made sense on disk, tape's lower cost per gigabyte makes the flexibility of a standard storage format worth the extra space it takes up.

About the Author

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at DeepStorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M., concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at http://www.deepstorage.net/NEW/GBoS
