Anatomy of a Cluster Software Stack

This document provides a brief introduction to the software stack that turns a rack of servers into an integrated cluster and grid platform.

Problem Statement

A 1991 New York Times article predicted that the rapid improvement of commodity processors and storage would spell doom for special purpose supercomputers. Indeed, in the intervening years this “attack of the killer micros” has come to pass, and scale-out platforms based on commodity Intel and AMD chip sets are now ubiquitous.

There is, however, a distinct difference between a rack full of blade servers and an integrated cluster platform that can deliver performance and value to end applications. The key is to integrate the individual blades, or nodes, in order to aggregate resources from a computing and data perspective.

Fortunately, the missing piece is not more hardware, but rather deployment of the appropriate software stack which enables the individual blades in the rack (or multiple racks) to be managed and used as a scalable compute or data platform.

Cluster Software Stack Elements

The figure below illustrates the basic elements of a typical software stack for a cluster. This software stack is structured in three layers. We will describe each layer, starting from the bottom.

Operating System

At the bottom of the stack is the operating system. Each node (or blade) in the cluster runs its own copy of the operating system. Linux is a common choice for the node operating system, but clusters can also be built on other operating systems, such as Windows. Running the same version of the operating system on every node simplifies cluster management, but it is not required: a cluster can be built from nodes running different operating systems.

Software Components

The next layer up in the picture contains the software components that integrate the individual nodes into a cluster platform. These components are:

  • A job scheduler. The role of a job scheduler, such as Grid Engine, is to take requests in the form of “run application A on N computing nodes” and cause them to execute on an appropriate subset of the nodes available on the cluster. The job scheduler figures out when to run each submitted job based on the resource requirements of the request, number of nodes in the cluster, requests already in the system, cluster policy and priorities, etc.
  • A file system. Cluster nodes are typically configured with some amount of local disk, but applications often require access to data that cannot be stored on the node. To address this, cluster nodes are usually given access to a remote file system via software such as the Network File System (NFS). The performance requirements of a scalable application often exceed the capabilities of NFS. In these situations, so-called parallel or cluster file systems, such as the open source Lustre file system, can be used.
  • Security. Security requirements apply to cluster nodes just as they do to individual workstations. Frequently, the same elements that are used for enterprise security, such as LDAP, NIS, or basic user account security, are used for the cluster nodes as well. However, because clusters are often used as a shared resource across workgroups, some form of security overlay, such as the Grid Security Infrastructure provided by the Globus Toolkit, is used to enable remote access and enhance the utility of the cluster.
  • Monitoring. A typical cluster can contain anywhere from a few nodes up to thousands of nodes. In a very small cluster, it may be possible to log on to each node to see what jobs are running, how the node is performing, etc. However, this quickly becomes impractical as the number of nodes grows. For this reason, an integrated monitoring framework that collects and reports operating statistics on the cluster as a whole is essential. A monitoring system such as the open source Ganglia software provides this capability, allowing the cluster administrator and cluster users to see at a glance what is going on within the cluster, and aiding in problem detection and troubleshooting.
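
To make the job scheduler's role more concrete, here is a minimal sketch in Python of the "run application A on N computing nodes" idea described above. It is not how Grid Engine or any production scheduler is implemented. The names ToyScheduler and Job, the convention that a lower priority number runs first, and the strict head-of-queue policy (no backfilling, no resource limits other than node count) are all assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    # Hypothetical job record; only priority and seq take part in ordering.
    priority: int                    # lower value runs first (assumed policy)
    seq: int                         # submission order, breaks priority ties
    name: str = field(compare=False)
    nodes_needed: int = field(compare=False)


class ToyScheduler:
    """Illustrative scheduler: priority queue of jobs plus a pool of free nodes."""

    def __init__(self, node_names):
        self.free_nodes = list(node_names)
        self.queue = []      # heap of waiting jobs
        self.running = {}    # job name -> list of allocated node names
        self._seq = 0

    def submit(self, name, nodes_needed, priority=0):
        """Queue a request of the form 'run `name` on `nodes_needed` nodes'."""
        heapq.heappush(self.queue, Job(priority, self._seq, name, nodes_needed))
        self._seq += 1

    def dispatch(self):
        """Start queued jobs in order for as long as the head job fits."""
        started = []
        while self.queue and self.queue[0].nodes_needed <= len(self.free_nodes):
            job = heapq.heappop(self.queue)
            allocated = [self.free_nodes.pop() for _ in range(job.nodes_needed)]
            self.running[job.name] = allocated
            started.append(job.name)
        return started

    def finish(self, name):
        """Return a finished job's nodes to the free pool."""
        self.free_nodes.extend(self.running.pop(name))


sched = ToyScheduler(["node1", "node2", "node3", "node4"])
sched.submit("A", nodes_needed=3)
sched.submit("B", nodes_needed=2)
sched.submit("C", nodes_needed=1, priority=-1)
print(sched.dispatch())  # C and A fit on the four nodes; B has to wait
sched.finish("A")
print(sched.dispatch())  # B starts once A's nodes are free again
```

Because this sketch only ever considers the job at the head of the queue, a request for more nodes than the cluster contains would block everything behind it. Handling cases like that is part of what the cluster policy and priority mechanisms mentioned above are for.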

Remote Access

A cluster can represent a significant capital expenditure, so it often needs to be shared across workgroups. Tools such as SSH and remote desktops let remote users submit jobs to a cluster, but they are often awkward to use, especially when a task requires moving data that is located in the remote user’s environment, returning output data from the remote cluster, or being integrated into a workflow or script that runs partly in the remote user’s environment.

These problems are addressed by the top layer in the cluster stack, which provides standard network interfaces for submitting and managing a task on a remote cluster. Remote access services, such as the Globus Toolkit’s GRAM service, provide data staging, job submission, and job management through a standard web services interface and through command line tools built on that interface.
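
The sequence a remote access service supports (stage data in, submit the job, track its state, stage results out) can be sketched in a few lines of Python. The example below is purely illustrative: FakeRemoteCluster is a hypothetical in-memory stand-in, not the GRAM interface, and its method names, job states, and the "application" (upper-casing the input) are assumptions chosen so that the example runs without a real cluster.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"


@dataclass
class FakeRemoteCluster:
    """Hypothetical in-memory stand-in for a remote access service."""
    files: dict = field(default_factory=dict)   # remote path -> bytes
    jobs: dict = field(default_factory=dict)    # job id -> job record

    def stage_in(self, remote_path, data):
        """Copy input data from the user's environment to the cluster."""
        self.files[remote_path] = data

    def submit(self, input_path, output_path):
        """Register a job and return its identifier."""
        job_id = len(self.jobs) + 1
        self.jobs[job_id] = {
            "state": JobState.PENDING,
            "input": input_path,
            "output": output_path,
        }
        return job_id

    def poll(self, job_id):
        """Advance the simulated job one step and report its state."""
        job = self.jobs[job_id]
        if job["state"] is JobState.PENDING:
            job["state"] = JobState.ACTIVE
        elif job["state"] is JobState.ACTIVE:
            # The simulated application: upper-case the input file.
            self.files[job["output"]] = self.files[job["input"]].upper()
            job["state"] = JobState.DONE
        return job["state"]

    def stage_out(self, remote_path):
        """Copy output data from the cluster back to the user."""
        return self.files[remote_path]


def run_remote(cluster, data):
    """Client-side workflow: stage in, submit, wait, stage out."""
    cluster.stage_in("input.dat", data)
    job_id = cluster.submit("input.dat", "output.dat")
    while cluster.poll(job_id) is not JobState.DONE:
        pass  # a real client would sleep between status checks
    return cluster.stage_out("output.dat")


print(run_remote(FakeRemoteCluster(), b"hello cluster"))
```

The point of the sketch is that the whole round trip is a single scriptable function, which is what makes it easy to embed a remote cluster job in a larger workflow.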

Summary

As we highlighted in the above discussion, all of the required elements of the cluster software stack can be provided by proven open source solutions. When deployed and integrated onto a networked collection of commodity servers, they can create a high-performance, cost-effective, scalable compute and data platform that is capable of solving problems of a scale and complexity that were out of reach just a few years ago.


