Programming Environment on NCSA IBM pSeries 690

 
  1. Compilers
    1. Introduction
    2. Compiler Commands
    3. Memory limits with 32- and 64-bit addressing
    4. Useful Compiler/Linker Options
    5. Sample Compiler Commands
    6. Using cpp in Fortran Programs
    7. Calling Fortran routines from C/C++
  2. Libraries and Application Software
    1. Engineering Scientific Subroutine Library (ESSL)
    2. Mathematical Acceleration SubSystem Library (MASS)
    3. Locally Installed Software
    4. Hierarchical Data Format (HDF) Library
  3. Notes
  4. References

1. Compilers

1.1 Introduction

IBM compilers are available for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce general-purpose and architecture-specific optimizations to improve performance. These include loop-level optimizations, inter-procedural analysis and cache optimizations. The compilers support automatic and user-directed parallelization of Fortran, C, and C++ applications for multiprocessing execution as well as both 32- and 64-bit memory addressing modes.

1.2 Compiler Commands

IBM uses different compiler names to perform tasks which are handled by compiler options on many other systems.


Language     1 proc       OpenMP (1) or   32-bit MPI   64-bit MPI (2) or
                          POSIX threads                MPI/OpenMP Hybrid
Fortran 77   xlf          xlf_r           mpxlf        mpxlf_r
Fortran 90   xlf90, f90   xlf90_r         mpxlf90      mpxlf90_r
Fortran 95   xlf95, f95   xlf95_r         mpxlf95      mpxlf95_r
ANSI C       xlc          xlc_r           mpcc         mpcc_r
Extended C   cc           cc_r            mpcc         mpcc_r
C++          xlC          xlC_r           mpCC         mpCC_r

(1) OpenMP also requires the option: -qsmp=omp
(2) 64-bit MPI also requires the options: -q64 -qwarn64

1.3 Memory limits with 32- and 64-bit addressing

By default, programs are compiled in 32-bit mode and can access up to 256 Mbytes of memory. If you use option -bmaxdata:0x80000000 (linker option) to create the executable, the program can access up to 2 Gbytes of memory. To access more than 2 Gbytes of memory, you will need to use 64-bit addressing. The easiest way to use 64-bit addressing is to set the value of environment variable $OBJECT_MODE to 64:

In tcsh: setenv OBJECT_MODE 64

In ksh: export OBJECT_MODE=64

Then recompile and relink your program. Alternatively, you can add the following options to your compile line to use 64-bit addressing:

   -q64 -qwarn64

Use the following options when using 64-bit objects in other utilities:

Command(s)           Option
ar, ranlib, lorder   -X 64
ld                   -b64

Other than the memory limits inherent in 32- and 64-bit addressing, there are also memory limits for interactive processes and batch jobs. For more information, see Running Jobs.
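
As a quick check of which addressing mode a build actually uses, the following minimal C sketch reports the pointer width and converts a -bmaxdata value such as the one quoted above into Mbytes. The function names addressing_bits and bytes_to_mbytes are hypothetical and chosen only for illustration.

```c
#include <limits.h>

/* Width of a data pointer in bits: expected to be 32 in the default
 * mode and 64 when built with -q64 or with OBJECT_MODE=64. */
int addressing_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Convert a byte count, such as the value given to -bmaxdata,
 * to Mbytes (1 Mbyte = 2^20 bytes). */
unsigned long long bytes_to_mbytes(unsigned long long bytes)
{
    return bytes >> 20;
}
```

For example, bytes_to_mbytes(0x80000000ULL) is 2048, which is the 2 Gbytes limit mentioned above, and addressing_bits() should return 64 once the program has been recompiled for 64-bit addressing.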

1.4 Useful Compiler/Linker Options

Compatibility Options
-qautodbl=dbl4
Promotes single-precision floating-point objects (4 bytes by default) to 8 bytes. Note that REAL*4 is also promoted to 8 bytes. (Fortran)
-qrealsize=<bytes>
Sets the size of the default REAL, DOUBLE PRECISION, COMPLEX and DOUBLE COMPLEX values. The allowed values for <bytes> are 4 and 8. (Fortran)
-qintsize=<bytes>
Sets the size of default INTEGER and LOGICAL values. (default is 4 bytes) (Fortran)
-qsave (-qnosave)
Sets the default storage class for local variables to STATIC (AUTOMATIC). (Fortran)
Note: -qnosave is the default except for the xlf compiler.
-qcpluscmt
Permit "//" to introduce a comment that lasts until the end of the current source line, as in C++.
Detecting Programming Errors
-g
Includes debugger information in the object files. -g works with all levels of optimization!
-C or -qcheck
array bounds checking (will slow execution--use only for debugging)
-qflttrap=enable:zerodivide:invalid:overflow -qsigtrap -g
detect floating point exceptions (will slow execution--use only for debugging)
-qsigtrap
Allows you to get a traceback of your program when an error occurs. Use this option with -g, -C, and -qflttrap. By default, you just get a core dump when an error occurs.

Available exception handlers are:

-qsigtrap=xl__ieee
Produces a traceback and an explanation of the signal and continues execution by supplying the default IEEE result for the failed computation.
-qsigtrap=xl__trce
Produces a traceback and stops the program.
-qsigtrap=xl__trcedump
Produces a traceback and a core file and stops the program.
-qinitauto
initializes automatic (local) variables to zero
-qextchk
Checks for mismatched procedure interfaces and common blocks. This option cannot be used with MPI because MPI relies on weak type checking and mismatched procedure interfaces.
-qheapdebug
Enables debug versions of memory management functions like malloc() and free() in C or C++ code (will slow execution--use only for debugging).
-u
equivalent to having IMPLICIT NONE throughout the code (Fortran)
Larger Memory
-bmaxdata:0x80000000
enlarges memory limit of 32-bit programs from 256 MBytes to 2 GBytes. Use this option only for 32-bit programs. (linker option)
-q64 -qwarn64
64-bit executable (if program needs to address more than 2 Gbytes of memory)
Optimization
-qnoopt
performs no optimization (or -O0 for Fortran only)
-O or -O2
basic optimization
-O3
performs the -O level optimizations and performs additional optimizations that are memory or compile time intensive.
-O4
aggressively optimizes the source program, trading off additional compile time for potential improvements in the generated code. This option implies the following options: -qarch=auto -qtune=auto -qcache=auto -qhot -qipa
-O5
same as -O4, but also implies the -qipa=level=2 option.
-qstrict
ensures that optimizations done by the -O3 and higher optimization options do not alter the semantics of a program.
-qarch=auto -qtune=auto -qcache=auto
optimize for local machine (only needed with -O3 and below)
-qhot
determines whether or not to perform high-order transformations on loops during optimization (implied with -O4 and -O5).
-qipa
inter-procedural analysis to perform optimizations across procedures (implied with -O4 and -O5). See the compiler man pages for more details on ipa options and syntax.
-Q
compiler decides what functions to inline
-Q+func1:func2
only inline specified functions
OpenMP and Automatic Parallelization
-qsmp=omp
OpenMP program (similar to -mp on Origin2000). You must also use the thread-safe compiler command (ending with _r) when compiling OpenMP programs.
-qsmp
automatic parallelization (similar to -apo on Origin2000)
-qsmp -qreport=smplist
automatic parallelization with a detailed listing (similar to -apo list on Origin2000)
Other Options
-qsuffix=f=f90
compile source files with .f90 suffix. (not needed if f90 command is used)
-qipa=partition=large
use this if you get messages about partition sizes with -O4 or -qipa
-V
display verbose information of each compiler component that is executed in shell-executable format.

1.5 Sample Compiler Commands

Program Type                                 Example Compile Command
Fortran 77 serial program                    xlf -o testprog testprog.f
Fortran 90 with automatic parallelization    xlf90_r -qsmp -o testprog testprog.f
Fortran 90 OpenMP                            xlf90_r -qsmp=omp -o testprog testprog.f
Fortran 95 32-bit MPI                        mpxlf95 -o testprog testprog.f
Fortran 77 64-bit MPI                        mpxlf_r -q64 -qwarn64 -o testprog testprog.f
C serial program                             xlc -o testprog testprog.c
C++ with OpenMP                              xlC_r -qsmp=omp -o testprog testprog.C
C with 32-bit MPI                            mpcc -o testprog testprog.c
C with 64-bit MPI                            mpcc_r -q64 -qwarn64 -o testprog testprog.c
C++ with MPI/OpenMP Hybrid                   mpCC_r -qsmp=omp -o testprog testprog.C
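
To make the OpenMP case concrete, here is a minimal C sketch of a loop that could be built with the thread-safe command and OpenMP option from section 1.2, for example xlc_r -qsmp=omp -o testprog testprog.c. The function name parallel_sum is hypothetical, and the _OPENMP guard is only there so the same source also compiles without OpenMP.

```c
/* Sum n values. With OpenMP enabled, the loop iterations are divided
 * among threads and the per-thread partial sums are combined by the
 * reduction clause. Without OpenMP the pragma is skipped and the loop
 * runs serially. */
double parallel_sum(const double *x, int n)
{
    double sum = 0.0;
    int i;

#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < n; i++) {
        sum += x[i];
    }
    return sum;
}
```

The reduction clause gives each thread its own copy of sum, which avoids a data race on the shared accumulator.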

1.6 Using cpp in Fortran Programs

If the source file ends with .F, it is automatically preprocessed by cpp before it is compiled. If you want to use the cpp processor with source files that do not end with .F, use the following compiler option to specify the filename suffix:

-qsuffix=cpp=<suffix>
where <suffix> is the suffix of the source files that should be preprocessed.

For example, to preprocess source files that end with .F90:

    xlf95 -qsuffix=cpp=F90 prog.F90

To pass options to cpp, use -WF,<option>. For example, to define name AIXV4:

    xlf90 -WF,-DAIXV4 conditional.F

1.7 Calling Fortran routines from C/C++

When calling your own Fortran routines from C/C++, you should not add an underscore (_) after the name.

The following common Fortran routines are available for use from C/C++, but you must add an underscore (_) after the name:

alarm_
clock_
ctime_
dtime_
etime_
fdate_
flush_
gmtime_
idate_
itime_
ltime_
sleep_
time_
usleep_

A complete list of routines is in the XL Fortran Language Reference Manual.

If modifying the source code is too difficult, you can use the linker option -brename to rename the routines:

-brename:Symbol,NewName
Renames the external symbol Symbol to NewName. In effect, it is as if all definitions and references to Symbol in all object files were renamed to NewName before the files were processed.

Examples:

  • If your code calls dtime and flush, you can add the following to rename those calls to the AIX equivalent:

       -brename:.dtime,.dtime_ -brename:.flush,.flush_
  • For your own Fortran routine myfortran that has been called in your code with the underscore after the name:
       -brename:.myfortran_,.myfortran 
Note: The period "." before the symbol name is required.

You can also write wrapper routines as an alternative to modifying your code. For example:

       subroutine flush(unit)
       integer(4) unit
       call flush_(unit)
       end
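
On the C side, the naming difference can also be hidden behind a macro. The following is a minimal sketch, not the documented NCSA method: NO_TRAILING_UNDERSCORE and FORTRAN_NAME are hypothetical names, myfortran is given a stand-in definition in C so that the example is self-contained, and it assumes the usual convention that Fortran passes arguments by reference.

```c
/* NO_TRAILING_UNDERSCORE is a hypothetical macro for this sketch. Define
 * it (for example with -DNO_TRAILING_UNDERSCORE) on systems where your own
 * Fortran routines are called without a trailing underscore. */
#ifdef NO_TRAILING_UNDERSCORE
#define FORTRAN_NAME(name) name
#else
#define FORTRAN_NAME(name) name##_
#endif

/* Stand-in, written in C, for a Fortran routine
 *     subroutine myfortran(n, x)
 * that doubles the n elements of x. In a real program this definition
 * would be replaced by the compiled Fortran object. */
void FORTRAN_NAME(myfortran)(int *n, double *x)
{
    int i;
    for (i = 0; i < *n; i++) {
        x[i] *= 2.0;
    }
}

/* C-side caller: the scalar count is passed as a pointer because the
 * Fortran routine is assumed to take its arguments by reference. */
void double_in_place(double *x, int n)
{
    FORTRAN_NAME(myfortran)(&n, x);
}
```

With this approach the C source stays the same on every system and only the compile-time definition changes, which serves the same purpose as -brename without touching the link line.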

2. Libraries and Application Software

2.1 Engineering Scientific Subroutine Library (ESSL)

IBM's Engineering Scientific Subroutine Library (ESSL) is a collection of subroutines providing a wide range of mathematical functions for many different scientific and engineering applications. It includes Basic Linear Algebra Subprograms (BLAS) level 1, 2, and 3, and a subset of the LAPACK library.

The mathematical subroutines are in nine computational areas:

  • Linear Algebra Subprograms
  • Matrix Operations
  • Linear Algebraic Equations
  • Eigensystem Analysis
  • Fourier Transforms, Convolutions and Correlations, and Related Computations
  • Sorting and Searching
  • Interpolation
  • Numerical Quadrature
  • Random Number Generation

The following table lists the linker options to use for various program types:

Program Type link with...
serial program -lessl
OpenMP or POSIX Threads -lesslsmp
32-bit MPI -lpessl -lessl -lblacs
64-bit MPI -lpesslsmp -lesslsmp -lblacssmp

Since ESSL does not include all the routines in LAPACK, you may have to link to the complete version of LAPACK locally installed by NCSA to get any missing routines. Please add the following to the end of your link options.

For 32-bit programs: -L/usr/local/apps/math/lapack/LAPACK -llapack-SP4_32

For 64-bit programs: -L/usr/local/apps/math/lapack/LAPACK -llapack-SP4_64

2.2 Mathematical Acceleration SubSystem Library (MASS)

IBM's Mathematical Acceleration SubSystem Library (MASS) is a set of subroutines for computation of mathematical functions that may provide improved performance for certain FORTRAN and C intrinsic functions over those in the conventional IBM libraries.

To link with MASS from a Fortran program, use linker option:

   -lmass

To link with MASS from a C or C++ program, use linker option:

   -lmass -lm

If parts of your code can be vectorized and you are willing to make some changes to the program, please read the section of the MASS documentation on the more efficient vector version of the MASS library: libmassv

2.3 Locally Installed Software

NCSA has a variety of applications software available in different areas of concentration, including chemistry, computational fluid dynamics, mathematics, solid mechanics, and visualization. An online list is maintained in the Software Repository. The list is organized by system and by area of science. If you have any questions about available software, contact the appropriate software coordinator or the staff in Consulting Services. The /usr/apps directory contains third party applications with most located in subdirectories according to the applications areas.

2.4 Hierarchical Data Format (HDF) Library

HDF is a library and platform-independent data format for the storage and exchange of scientific data. It includes Fortran and C calling interfaces, and utilities for analyzing and converting HDF data files. There are two HDF formats, HDF (4.x and previous releases) and HDF5. These formats are completely different and NOT compatible.

HDF/HDF5 is available on NCSA HPC systems via the Coordinated TeraGrid Software and Services (CTSS). Use SoftEnv for information on accessing the software. Information on support is available at the HDF Support Issues page.

The HDF Home Page has detailed information on HDF and HDF5, including documentation, tutorials, examples, and FAQs.

3. Notes

  1. MPI C++ bindings

    To use the MPI C++ bindings, you must:

    • define _MPI_CPP_BINDINGS
    • compile with mpCC_r, since some types used in the C++ bindings for MPI are only defined in thread-safe mode.

  2. Endian Conversion

    The p690 and the Origin2000 machines both use big endian, but the Intel clusters use little endian. So if you are moving unformatted data files between the p690 and the clusters, you'll need to convert them. Write a small program on the clusters using the instructions in the cluster documentation to do the conversion.

  3. Transitioning from the NCSA Origin2000

    If you are porting your code from the NCSA Origin2000 system, the document Transition from the Origin2000 may be helpful.
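
For the endian conversion described in note 2 above, the core of such a conversion program is a byte swap. The following minimal C sketch shows the idea for 4-byte items. The function names swap32 and swap32_buffer are hypothetical, the cluster documentation remains the reference for the actual procedure, and the layout of the file (such as any record markers) is not handled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 4-byte value
 * (big endian to little endian or the other way round). */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Convert a buffer of count 4-byte items in place. */
void swap32_buffer(uint32_t *buf, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        buf[i] = swap32(buf[i]);
    }
}
```

Applying swap32 twice returns the original value, so the same routine works in both directions. 8-byte items would need an analogous 64-bit swap.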

4. References

The latest compiler documentation for AIX is at the IBM AIX compiler information center.

IBM documentation is provided locally at NCSA at http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/IBM/.