The Facebook Data Center Delusion

There's a lot of talk about how enterprises should run their data centers like Facebook and Google for increased efficiency, but emulating web-scale companies won't pan out for the average enterprise.

Howard Marks

October 8, 2014


Some of my fellow analysts have decreed that, since web-scale organizations like Facebook and Google can run their complex and mission-critical applications at a much lower cost than enterprise data centers can crunch equivalent bunches of numbers, we should all run our data centers the way web-scale guys do. I don't think so.

Web-scale organizations like Facebook and Google run a relatively small number of applications, each of which supports a very large number of users. Yes, Google has tens or hundreds of applications, from the search engine itself to YouTube and Google Mars, but Google applications have to be embraced by hundreds of thousands of users, or, like Google Reader, they'll get the ax.

By contrast, the corporate data center is filled with hundreds or thousands of applications, and many of them may have just 50 or 100 users. Applications such as MRO (maintenance, repair, and overhaul) or HR's benefits calculators abound.

When you're building web-scale applications, every hour of developer time writing the application will be amortized across the thousands of servers and millions of users the application is designed to attract and support. As a result, it's more cost effective to make resiliency an application function, rather than relying on the infrastructure to provide resilience.

If it takes 2,000 programmer hours to write code that stores each uploaded photo on three independent servers and simply tries server No. 2 when a read from server No. 1 fails, that's a lot less than the cost of a SAN that provides the same resiliency.
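Here's a minimal sketch of what that application-level resiliency looks like; the three local directories stand in for three independent photo servers, and none of this is Facebook's actual code:

```python
import os

# Three local directories standing in for three independent photo servers.
# In the web-scale model, this replication logic IS the resiliency; there
# is no SAN underneath to provide it.
REPLICAS = ["/srv/photos-a", "/srv/photos-b", "/srv/photos-c"]

def store_photo(photo_id: str, data: bytes) -> None:
    """Write every uploaded photo to all three replicas."""
    for root in REPLICAS:
        os.makedirs(root, exist_ok=True)
        with open(os.path.join(root, photo_id), "wb") as f:
            f.write(data)

def fetch_photo(photo_id: str) -> bytes:
    """Read from server No. 1; when that fails, simply try No. 2, then No. 3."""
    last_error = None
    for root in REPLICAS:
        try:
            with open(os.path.join(root, photo_id), "rb") as f:
                return f.read()
        except OSError as err:  # dead "server," missing copy, bad disk
            last_error = err    # fall through to the next replica
    raise last_error

store_photo("cat001.jpg", b"...jpeg bytes...")
print(len(fetch_photo("cat001.jpg")))
```

The point is that those few dozen lines, amortized over millions of users, come out cheaper than the storage array they replace.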

In the corporate world, we buy, rather than write, many of the applications running in our data centers. Those applications, and application platforms like SQL Server or Oracle, are designed to run on a resilient infrastructure. If SQL Server can't read its databases or write its transaction logs, it doesn't try to access the second copy -- it fails. Hopefully, it fails over to an AlwaysOn cluster partner.
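The contrast shows up in the code: a packaged application's view of the world is a single connection string, and when the primary dies, the cluster -- not the application -- moves the database. A hedged sketch using pyodbc, where the listener name, database, and table are all hypothetical:

```python
import pyodbc

# One ordinary connection; failover is the infrastructure's job.
# "PayrollAG" is a hypothetical AlwaysOn availability group listener
# that the cluster repoints to the surviving node after a failure.
conn = pyodbc.connect(
    "Driver={SQL Server Native Client 11.0};"
    "Server=tcp:PayrollAG,1433;"
    "Database=Payroll;"
    "MultiSubnetFailover=Yes;"   # reconnect promptly after a cluster failover
    "Trusted_Connection=Yes;"
)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM dbo.Employees")  # hypothetical table
print(cursor.fetchone()[0])
```

There is no retry-the-other-replica logic here because the application was never designed to contain any.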

Even for the applications an organization develops itself, the extra complexity, and the additional testing that complexity requires, make resiliency at the application layer pay off only if the application is sharded across a large number of servers to serve a very large number of users. After all, the 20 TB or 50 TB of resilient disk space a new HR application needs costs a lot less than those 2,000 man-hours of coding and testing -- at, say, $100 per loaded hour, that's $200,000 of engineering.
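For scale, "sharded across a large number of servers" means something like the routing function below -- a hedged sketch with a hypothetical 64-server fleet, not any particular product's scheme:

```python
import hashlib

# A hypothetical fleet of 64 database shards; each user's data lives on
# exactly one of them, so one team's code serves the whole population.
SHARDS = ["db%02d.example.com" % n for n in range(64)]

def shard_for(user_id: str) -> str:
    """Pick a shard with a stable hash, so a user always routes the same way."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("jsmith"))  # the same user always lands on the same shard
```

An HR application with 100 users never fills even one shard, which is the heart of the cost argument above.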

It's also important to note that, though web-scale applications are designed very differently from corporate applications -- even huge corporate applications like an airline ticketing and reservation system -- the web-scale organizations use much more conventional architectures to manage their own businesses. My sources have confirmed that Facebook runs its financial, payroll, and HR applications using database servers that run SQL and store data on dedicated storage arrays.

In short, as the old saying goes, it's "horses for courses." I just don't believe the web-scale application model works for many of the applications in the corporate data center.

Now, should we run our data centers more like cloud computing providers like Rackspace or HP Helion? That's a more interesting story.


About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M., and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems, and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino, and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop, and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
