The tests were performed on three different architectures: two Linux-based clusters and one IRIX-based system. The IA32 cluster, Platinum[1], is a distributed memory Linux cluster. Each node in the 484-node cluster is a dual-processor IBM eServer x330 thin server built with Intel Pentium III 1 GHz processors, each of which has a 256 KB full-speed L2 cache. The peak performance is therefore 1 Gflops per processor. The nodes are connected by a high-speed Myrinet 2000 switched network to support the MPI programming model. The operating system installed on the nodes is RedHat Linux version 7.2 (2.4.9 kernel). The compiler used in the tests is the Intel C compiler; the Intel Fortran77/90/95 and C++ compilers are also available.
The IA64 cluster used in these tests, Titan[2], is also a distributed memory Linux cluster. It consists of 128 IBM IntelliStation Z Pro 6894 workstations, each of which contains two Intel Itanium-I 800 MHz processors with a 4 MB full-speed L3 cache. Each node has 2 GB of memory, and each processor has a peak performance of 3.2 Gflops. Like Platinum, Titan is connected via a switched Myrinet 2000 network, which serves as the backbone for the MPI programming model on these systems. Titan runs RedHat Linux 7.1 (2.4.16 kernel) and has the Intel Fortran77/90/95, C, and C++ compilers installed.
To put the results from the Linux clusters in perspective, the tests were also run on an SGI Origin 2000[3]. These shared memory machines provide up to 128 MIPS R10000 250 MHz processors per machine, with each processor giving a peak performance of 500 Mflops. The programming model used in these tests is MPI, but other parallel programming models are available on the Origins because of the shared memory architecture. The Origins run SGI IRIX 6.5 and use the SGI Fortran77/90/95, C, and C++ compilers.
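The per-processor peak figures quoted for the three systems follow from clock rate multiplied by floating-point operations per cycle. The short Python sketch below reproduces that arithmetic. The `flops_per_cycle` values (1, 4, and 2) are assumptions inferred from the quoted clock rates and peaks, not figures stated in the text, and the `System` class and its field names are hypothetical. The aggregate numbers are theoretical upper bounds obtained by multiplying by the processor counts given above; they are not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Hypothetical record describing one test platform."""
    name: str
    clock_mhz: float
    flops_per_cycle: int  # ASSUMPTION: inferred from the quoted peak / clock rate
    processors: int       # total processors (nodes x processors per node)

    @property
    def peak_gflops_per_processor(self) -> float:
        # MHz * flops/cycle gives Mflops; divide by 1000 for Gflops.
        return self.clock_mhz * self.flops_per_cycle / 1000.0

    @property
    def peak_gflops_total(self) -> float:
        # Theoretical aggregate peak over all processors.
        return self.peak_gflops_per_processor * self.processors


SYSTEMS = [
    System("Platinum (IA32)", clock_mhz=1000, flops_per_cycle=1, processors=484 * 2),
    System("Titan (IA64)", clock_mhz=800, flops_per_cycle=4, processors=128 * 2),
    # "up to 128" processors per Origin machine; the maximum is used here.
    System("Origin 2000", clock_mhz=250, flops_per_cycle=2, processors=128),
]

if __name__ == "__main__":
    for s in SYSTEMS:
        print(f"{s.name}: {s.peak_gflops_per_processor:.1f} Gflops/processor, "
              f"{s.peak_gflops_total:.1f} Gflops aggregate")
```

Under these assumptions, a Titan processor has 3.2 times the theoretical peak of a Platinum processor despite its lower clock rate, which is worth keeping in mind when comparing per-processor timings across the three machines.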