Caringo Casts DICE Into Foreclosure Archive

CAS vendor's customer says improvements enable WAN replication and low-cost clustering

May 28, 2008

NetworkComputing

Caringo says a new version of its CAStor content-addressable storage platform is helping at least one customer predict the future of the housing markets.

The Data-Intensive Cyber Environments (DICE) research group at the University of California, San Diego, is using Caringo's latest update, CAStor 2.2, to create a voluminous digital archive of historical cultural content called the Humanities, Arts, and Social Sciences (HASS) Grid, which will span 10 campuses.

A first project, implemented over the last six months, involves storing data on home foreclosures that were part of the infamous California redlining of the 1930s. The data is not only preserved as an archive but is also readily accessible to economists conducting research.

“We have scanned and digitized, while creating Web 2.0 interfaces, surveys of the devastation of foreclosures in California cities that were done by a federal agency in the 1930s,” said Richard Marciano, lead scientist in DICE and director of the Sustainable Archives and Library Technologies lab, in an interview. The project is overseen by the National Archives and Records Administration and is funded by a library agency called the Institute of Museum and Library Services.

Realizing the importance of preserving this data and the impact it could have on economic forecasting, Marciano and colleagues at nine other sites have established three nodes in San Diego to make the archives available to professors throughout the University of California system. Using the latest iteration of CAStor, they're creating a series of highly scalable storage environments.

Caringo says its platform creates a scalable, flat address space for use in storing digital content and file-based data while virtualizing storage across standard servers. The supplier says CAStor creates a low-cost cluster via USB drives plugged into the servers. CAStor clusters scale from 1 Tbyte to multiple Pbytes in a single tier of storage, according to the vendor.

For DICE's Marciano, additions beyond the clustering have clinched his use of CAStor. “I’m especially interested in a new feature called Wide Area Replication in CAStor 2.2,” he said. “Essentially, we can grow our own cluster much like we are now, and we can replicate from one cluster to another.” Marciano's team is already replicating data across its three-node cluster in San Diego, but it will use the CAStor feature to extend the HASS Grid across 10 campuses in the statewide university system.

“WAN replication fits right into a university environment where shared content is paramount,” Marciano noted. “Basically, it gives us the opportunity to grow locally while still building distributed capacity across the entire network.”

This kind of flexibility will be critical to the project's future: With well over a petabyte of data, the archiving project is growing exponentially. And although CAS storage is just one part of the group’s entire storage strategy, Caringo’s CAStor 2.2 has provided a number of advantages that the group may want to pursue in other projects.

“CAStor 2.2 allows us to go beyond virtualizing storage across a cluster to scaling across clusters and replicating content in a wide area network, choosing what collections are replicated and how, based on sets of rules a user can set up,” Marciano said. “These features really interest us in moving forward.”

Caringo also improved a CAStor feature called Fast Volume Recovery, which has proved to be quite important to DICE. According to the company, a failing node within a CAStor cluster will be rebuilt much more quickly than in the past, which should improve data availability and reduce recovery time.

The latest iteration of CAStor is available now, priced at about $1,500 per Tbyte.

Caringo insists DICE's story points to how it's succeeding in going beyond its niche. Like its competitors -- Bycast, EMC, ExaGrid, HP, HDS, IBM, Nexsan, Permabit, and Sun -- Caringo resists attempts at classification by terminology. Even the term "CAS" meets resistance from these suppliers.

Instead, Caringo, which claims to have over 30 customers and has previously demonstrated an affinity for spin-doctoring, prefers to say it's involved in "more than archiving."

"We have been put into that niche area of being archive storage, but we see customers using CAStor for clustered storage for online data access as well as for archiving," says Derek Gascon, Caringo's VP of marketing. "We can build an online cloud of storage that serves as the infrastructure for customers who are actively serving rich media content."

The message appears to be getting through at UCSD. "Caringo was the only vendor we found that could provide us with the tools to create... the HASS Grid," Marciano said in a statement.

  • Bycast Inc.

  • Caringo

  • EMC Corp. (NYSE: EMC)

  • ExaGrid Systems Inc.

  • Hitachi Data Systems (HDS)

  • Hewlett-Packard Co. (NYSE: HPQ)

  • IBM Corp. (NYSE: IBM)

  • Nexsan Technologies Inc.

  • Permabit Technology Corp.

  • Sun Microsystems Inc.
