Why Clusters?
High performance at modest cost: what’s not to like? This document briefly introduces compute clusters and explains how they can deliver considerable computing power at low cost and, thanks to professional open source cluster software, with ease of administration and use.
Problem Statement
Do any of the following descriptions sound familiar?
- Your current computers are overloaded by the amount of processing required to analyze available data, verify more sophisticated designs, or meet user demands for improved response time. Your staff are clamoring for more computing power, but you don’t think you can afford it.
- You operate an expensive mainframe computer. Your computer salesman insists this system is the only way to get your work done in a timely manner, but you feel there must be a cheaper way.
- You hear about new analysis and modeling methods that you suspect could make a major difference to your business. If only, you think, you had a supercomputer.
If so, then compute clusters may have something to offer you.
Solution Statement
A cluster is a set of commodity processors and an interprocessor communication network, integrated in a manner that allows for a modest machine room footprint and convenient administration. Because today’s microprocessors are as powerful as the supercomputers of an earlier generation, clusters can pack a big punch in a small package. And because clusters are constructed from commodity parts, their cost can be far lower than that of other high-performance computing solutions.
Some of the most powerful computers in the world today are clusters: in November 2007, 81% of the computers in the “Top 500” listing of the world’s most powerful supercomputers were clusters. But it would be a mistake to assume that clusters are only for big research centers. Indeed, a 2007 IDC report states that 25% of all spending on clusters used for highly computational or data-intensive processing (out of $10B total spent on such clusters) is for workgroup clusters valued at $50,000 or less. That’s more than 50,000 such clusters sold in 2007 alone. (Another 36% is for departmental clusters, valued at $50,000 to $250,000. Larger divisional and corporate systems account for the final 39%.)
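The “more than 50,000” figure follows from simple arithmetic on the IDC numbers quoted above. The short sketch below reproduces it; the constant and function names are illustrative only and do not come from the IDC report. It assumes every workgroup cluster sold at the $50,000 ceiling, which yields the smallest possible unit count.

```python
# Figures quoted in the text from the 2007 IDC report.
TOTAL_SPEND_USD = 10_000_000_000   # $10B total spending on such clusters
WORKGROUP_SHARE = 0.25             # share spent on workgroup clusters
DEPARTMENTAL_SHARE = 0.36          # share spent on departmental clusters
LARGER_SHARE = 0.39                # divisional and corporate systems
MAX_WORKGROUP_PRICE_USD = 50_000   # upper bound on a workgroup cluster's value


def min_cluster_count(total_spend, share, max_unit_price):
    """Lower bound on units sold: segment spending divided by the highest unit price."""
    segment_spend = total_spend * share
    return segment_spend / max_unit_price


workgroup_spend = TOTAL_SPEND_USD * WORKGROUP_SHARE
lower_bound = min_cluster_count(
    TOTAL_SPEND_USD, WORKGROUP_SHARE, MAX_WORKGROUP_PRICE_USD
)

print(f"Workgroup spending: ${workgroup_spend:,.0f}")
print(f"Workgroup clusters sold: at least {lower_bound:,.0f}")
```

Because most workgroup clusters cost less than the ceiling, the real count is higher than this lower bound, which is why the text says “more than.”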
The first clusters were piles of PCs with the monitors removed, running software put together by enthusiasts. They were cheap, but a nightmare to construct and maintain. Today, a variety of vendors will supply you with high-quality systems based on rack-mounted units or blade hardware, while professional open source software such as UniCluster Express offers low-cost, commercially supported cluster management solutions.
Professional open source cluster software (see “Anatomy of a Cluster”) provides an integrated file system, resource management, administration, and remote access solution. This software makes it straightforward for a systems administrator to install and operate a cluster. It also makes it easy for users to run their workloads, which may comprise large numbers of single-processor jobs, smaller numbers of parallel jobs that run on many processors simultaneously, or any mix of the two types. Resource management interfaces allow administrators to control the priority and response time of different job types, so that (for example) interactive jobs achieve rapid response.
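To make the idea of priority-based resource management concrete, here is a minimal sketch of a toy scheduler in which interactive jobs are dispatched ahead of parallel and batch jobs. It is an illustration only and is not the interface of UniCluster Express or any other cluster product; the class names, job classes, and priority values are all hypothetical. It uses strict priority order with no backfilling, which real resource managers typically refine.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical job classes; a lower number means the job is dispatched first.
PRIORITY = {"interactive": 0, "parallel": 1, "batch": 2}


@dataclass(order=True)
class Job:
    priority: int                       # compared first
    seq: int                            # submission order breaks ties (FIFO)
    name: str = field(compare=False)
    cpus: int = field(compare=False)


class ToyScheduler:
    """Strict-priority dispatcher over a fixed pool of CPUs (illustrative only)."""

    def __init__(self, total_cpus):
        self.free_cpus = total_cpus
        self._queue = []
        self._counter = itertools.count()

    def submit(self, name, job_class, cpus=1):
        job = Job(PRIORITY[job_class], next(self._counter), name, cpus)
        heapq.heappush(self._queue, job)

    def dispatch(self):
        """Start queued jobs in priority order while the head job fits."""
        started = []
        while self._queue and self._queue[0].cpus <= self.free_cpus:
            job = heapq.heappop(self._queue)
            self.free_cpus -= job.cpus
            started.append(job.name)
        return started


sched = ToyScheduler(total_cpus=8)
sched.submit("nightly-report", "batch", cpus=1)
sched.submit("cfd-run", "parallel", cpus=4)
sched.submit("user-shell", "interactive", cpus=1)
started = sched.dispatch()
print(started)
```

Although the interactive job was submitted last, it is started first, which is the behavior the administrator would configure to give interactive users rapid response.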
Reasons given for using clusters in the IDC report mentioned above include improved price-performance; better system throughput; reduced total cost of ownership; ability to run larger problems; new, more, or better results; capacity management; and improved competitiveness. Application domains in which clusters are widely used include electronic design, engineering, research, financial services, oil and gas, and biosciences. An increasing number of ISV software packages come cluster-enabled.
An individual cluster can be a powerful resource for a workgroup, department, or company. Further benefits can accrue if that resource is grid-enabled. Because compute clusters are more cost-effective than mainframes, they have been deployed in large numbers across the enterprise. Clusters were once specialized resources, but high-quality system integration and professional open source software mean that they can now be used in almost any company to deliver the required computing power at a low total cost of ownership.




