The Web Data Dilemma

You've sorted out your back-end storage, but what about your Web data?

December 19, 2006

3:20 PM -- When I started my first job in 1994, people were still scratching their heads and working out exactly what the Internet meant for businesses. Back then, in the days of the "Information Superhighway" -- and the still-ubiquitous fax machine -- we knew we were on the verge of something big.

Now, more than a decade later, many businesses are faced with something big indeed -- their corporate Websites. And they have scarier problems than restricting employee access to X-rated sites or worrying about hackers or that weird guy from accounting who never emerges from his cubicle.

The sheer volume of data on the corporate Intranet is apt to be top of mind for CIOs. Quite simply, many businesses don't know what they've got, yet they're under pressure to classify this data and index it for compliance, regulatory, and business purposes.

Enter Google. The Mountain View, Calif.-based search giant has already made a song and dance about Web classification, unveiling a high-end search appliance called the OneBox aimed at enterprises and a low-end Mini device for smaller firms. (See Google Unveils New Mini.)

Google touts its search devices as a way for users to target data on specific business intelligence applications. The OneBox, for instance, trawls the Web with workforce automation, customer relationship management (CRM), enterprise resource planning (ERP), and business intelligence packages. (See Google One-Ups Intranet Search, Google Intros OneBox, Google Bangs Application Drum, and SAS, Google Join for Search.)

Other vendors are also jostling for position in the search space. Earlier this month IBM and Yahoo, for example, introduced free software that they claim will let IT managers quickly find Web-based data. OmniFind Yahoo! uses the open source Lucene indexing library and represents both vendors' first step into a growing market.

There are other suppliers on this bandwagon. Cleveland-based Thunderstone, which recently upgraded the features on its own search appliances, is also preaching a data classification gospel. Thunderstone is pushing Google on pricing and has put its $10,000 APP250 device up against Google's $30,000 GB-1001 model OneBox. (See Thunderstone App Gets New Features.)

According to Thunderstone, About.com, the Associated Press, Corbis, eBay, GE, and QVC are using the vendor to classify Intranet data.

Although Google, which can handle 500,000 documents on its GB-1001, has a capacity edge over the APP250, which handles around 250,000 docs, it is good to see competition. Long-term, this can only spell good news for users.

Firms should also expect to see these devices merge with other storage offerings during the coming months amidst growing synergies between search engines and data classifiers. (See Classifiers Grab Search Partners.) Kazeon, Mathon, and StoredIQ, for example, have all paid their $10,000 to join Google's program for integrating its Intranet search appliances with other vendors' wares. (See Content Classifiers Glom Onto Google and Silicon Joins With Google.)

I think we'll see plenty more activity in this space during 2007.

Users, apparently, are open to suggestions. In a recent Byte & Switch poll, some 45 percent of respondents said that they are already using third-party data classification and search tools. (See Search Is On.) And most firms said they are drowning in a sea of unstructured data.

Bottom line? It is relatively early days for the search appliance market, but one thing is for sure -- firms' Web-based data will not be shrinking anytime soon.

Watch this space.

James Rogers, Senior Editor, Byte and Switch

  • eBay Inc. (Nasdaq: EBAY)

  • Google (Nasdaq: GOOG)

  • Kazeon Inc.

  • Mathon Systems Inc.

  • StoredIQ Corp.

  • Thunderstone Software LLC
