Thursday, March 13, 2003

Moving Off the Grid

What is it with the hype over grid computing? It's a perfectly interesting technology, but somehow the boosters seem to be ignoring some fundamental facts about distributed computing.

1. Not all problems are parallelizable. Getting an algorithm to run across parallel processors is not easy. Many problems are inherently sequential, so having a grid computing (or utility, or whatever metaphor you prefer) resource available won't help. Problems that are well suited to parallelization:
  • Problems where the dataset can be easily partitioned, and there are few relationships between the partitions (e.g. SETI@home).
  • Problems where there's one core calculation, and you want to plug in many possible values as inputs (e.g. many types of simulations, esp. evolutionary computation or hardware design).
  • Problems whose structure is very well understood and for which parallel algorithms have already been developed (e.g. FFT).
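
To put a rough number on this first point, here is a minimal sketch using Amdahl's Law, which is my addition and not something the grid vendors or this post rely on. It bounds the speedup you can get when only part of a job parallelizes. The function name and the sample fractions are purely illustrative.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Upper bound on speedup when only part of a job can run in parallel.

    parallel_fraction: share of the single-machine run time that can be
    spread across nodes (0.0 = fully sequential, 1.0 = fully parallel).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes just as long as before; only the parallel
    # part shrinks as nodes are added.
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # Illustrative fractions only, not measurements of any real workload.
    for fraction in (0.50, 0.90, 0.99):
        for nodes in (8, 128, 1024):
            speedup = amdahl_speedup(fraction, nodes)
            print(f"{fraction:.0%} parallel on {nodes:4d} nodes: {speedup:6.1f}x")
```

Under this model, a job that is 90% parallel never gets even ten times faster, no matter how many nodes the grid offers.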


2. Even when problems are parallelizable, the messaging profile might defeat the grid. Many parallel algorithms run well with 64 or 128 nodes merrily chugging away, but those nodes must constantly exchange the current state of the calculation. This is great for massive 64-way servers (or, at the extreme, those thousand-way boxes the Department of Energy periodically orders from IBM), but not so much if you're running over Ethernet or, worse, the Internet.
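
A back-of-the-envelope sketch of why the interconnect matters: assume the compute divides evenly across nodes, and every state exchange costs one round of network latency. The model, the function name, and every number below are my own guesses for illustration, not measurements of any real system.

```python
def run_time(total_compute_s: float, nodes: int,
             sync_rounds: int, latency_s: float) -> float:
    """Toy model: compute splits evenly; each sync round costs one latency."""
    return total_compute_s / nodes + sync_rounds * latency_s


# Hypothetical workload: one hour of single-processor compute that needs
# 100,000 state exchanges along the way, spread over 64 nodes.
TOTAL_COMPUTE_S = 3600.0
SYNC_ROUNDS = 100_000
NODES = 64

# Assumed per-exchange latencies in seconds (order-of-magnitude guesses).
LATENCIES = {
    "64-way server backplane": 1e-6,
    "local Ethernet": 5e-4,
    "Internet": 5e-2,
}

if __name__ == "__main__":
    for name, latency in LATENCIES.items():
        seconds = run_time(TOTAL_COMPUTE_S, NODES, SYNC_ROUNDS, latency)
        print(f"{name:24s} {seconds:10.2f} s")
```

With these assumed numbers, the Internet case spends 5,000 seconds just waiting on messages, which is longer than running the whole job on one machine.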

A slightly different version of this problem occurs when the data source is the bottleneck. Could you write a thousand-way parallel version of a database query? Well, probably, but you'd also have to replicate the database a thousand times to get the benefit of it; otherwise a thousand nodes all end up waiting on the same database.

3. Ever hear of something called Moore's Law? Look, every 18 months the bottom half of the grid market drops out, as processors become fast enough to solve problems previously only the grid could handle.
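
The arithmetic behind that claim, as a rough sketch: if single-processor speed really does double every 18 months (the popular reading of Moore's Law, and an assumption rather than a guarantee), then a grid that delivers an N-fold speedup today is matched by one processor after 18 × log2(N) months. The function name is hypothetical.

```python
import math

DOUBLING_PERIOD_MONTHS = 18  # the figure used in the text above


def months_until_single_cpu_matches(grid_speedup: float) -> float:
    """Months until one processor equals a grid's speedup, assuming
    processor speed doubles every DOUBLING_PERIOD_MONTHS."""
    if grid_speedup < 1:
        raise ValueError("grid_speedup must be at least 1")
    return DOUBLING_PERIOD_MONTHS * math.log2(grid_speedup)


if __name__ == "__main__":
    for speedup in (2, 8, 64, 1024):
        months = months_until_single_cpu_matches(speedup)
        print(f"{speedup:5d}x grid advantage lasts about {months:5.0f} months")
```

Any problem that needs only a twofold boost is the "bottom half" here: wait 18 months and a single new machine handles it.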

I don't mean to dismiss the technology entirely. It certainly has important, if niche, applications. But people who expect to just plug a computer into a mythical grid and, without changing anything else, suddenly have supercomputer resources at their disposal are in for some disappointment.