- Compilers
- Introduction
- Compiler Commands
- Memory limits with 32- and 64-bit addressing
- Useful Compiler/Linker Options
- Sample Compile Commands
- Using cpp in Fortran Programs
- Calling Fortran routines from C/C++
- Libraries and Application Software
- Engineering Scientific Subroutine Library (ESSL)
- Mathematical Acceleration SubSystem Library (MASS)
- Locally Installed Software
- Hierarchical Data Format (HDF) Library
- Notes
- References
1. Compilers
1.1 Introduction
IBM compilers are available for Fortran 77, Fortran 90, Fortran 95, C,
and C++. The compilers can produce general-purpose and architecture-specific
optimizations to improve performance. These include loop-level optimizations,
inter-procedural analysis and cache optimizations. The compilers support
automatic and user-directed parallelization of Fortran, C, and C++ applications
for multiprocessing execution as well as both 32- and 64-bit memory addressing
modes.
1.2 Compiler Commands
IBM uses different compiler names to perform tasks which are handled
by compiler options on many other systems.
| Language | 1 proc | OpenMP(1) or POSIX threads | 32-bit MPI | 64-bit MPI(2) or MPI/OpenMP Hybrid |
| Fortran 77 | xlf | xlf_r | mpxlf | mpxlf_r |
| Fortran 90 | xlf90, f90 | xlf90_r | mpxlf90 | mpxlf90_r |
| Fortran 95 | xlf95, f95 | xlf95_r | mpxlf95 | mpxlf95_r |
| ANSI C | xlc | xlc_r | mpcc | mpcc_r |
| Extended C | cc | cc_r | mpcc | mpcc_r |
| C++ | xlC | xlC_r | mpCC | mpCC_r |
(1) OpenMP also requires the option:
-qsmp=omp
(2) 64-bit MPI also requires the options:
-q64 -qwarn64
1.3 Memory limits with 32- and 64-bit addressing
By default, programs are compiled in 32-bit mode and can access up to
256 Mbytes of memory. If you use option -bmaxdata:0x80000000 (linker option) to
create the executable, the program can access up to 2 Gbytes of memory.
To access more than 2 Gbytes of memory, you will need to use 64-bit addressing.
The easiest way to use 64-bit addressing is to set the value of environment
variable $OBJECT_MODE to 64:
In tcsh: setenv OBJECT_MODE 64
In ksh: export OBJECT_MODE=64
Then recompile and relink your program.
Alternatively, add the following options
to your compile line to use 64-bit addressing:
-q64 -qwarn64
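A quick way to confirm which addressing mode a program was actually built in is to check the width of a pointer at run time. The following is a minimal C sketch, not part of the compiler documentation; the function names are hypothetical. It also shows the arithmetic behind the -bmaxdata value: 0x80000000 bytes is 2048 Mbytes, i.e. 2 Gbytes.

```c
#include <limits.h>

/* Width of a data pointer in bits: expected to be 32 for a default
 * build and 64 for a build with -q64 (or OBJECT_MODE=64). */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Width of a long in bits. On AIX a long is 32 bits in 32-bit mode
 * and 64 bits in 64-bit mode, which is one reason code that stores
 * a long or a pointer in an int needs review before using -q64. */
int long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Convert a byte count, such as the value given to -bmaxdata,
 * to Mbytes (1 Mbyte = 1024 * 1024 bytes). */
unsigned long maxdata_mbytes(unsigned long bytes)
{
    return bytes / (1024UL * 1024UL);
}
```

Calling pointer_bits() from a small main() and printing the result is enough to verify that the environment variable or the -q64 option took effect for every object file in the build.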
Use the following options when using 64-bit objects in other utilities:
| Command(s) | Option |
| ar, ranlib, lorder | -X 64 |
| ld | -b64 |
Other than the memory limits inherent in 32- and 64-bit addressing, there
are also memory limits for interactive processes and batch jobs. For more
information, see
Running Jobs.
1.4 Useful Compiler/Linker Options
Compatibility Options
- -qautodbl=dbl4
- Promotes single-precision floating-point objects to 8 bytes (the default
is 4 bytes). Note that REAL*4 is also promoted to 8 bytes. (Fortran)
- -qrealsize=<bytes>
- Sets the size of the default REAL, DOUBLE PRECISION, COMPLEX and DOUBLE
COMPLEX values. The allowed values for <bytes> are 4 and 8. (Fortran)
- -qintsize=<bytes>
- Sets the size of default INTEGER and LOGICAL values. (default is 4
bytes) (Fortran)
- -qsave (-qnosave)
- Sets the default storage class for local variables to STATIC (AUTOMATIC).
(Fortran)
Note: -qnosave is the default except for the xlf compiler.
- -qcpluscmt
- Permits "//" to introduce a comment that lasts until the end of the
current source line, as in C++. (C)
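The STATIC versus AUTOMATIC distinction behind -qsave can be illustrated in C, where the storage class is chosen per variable rather than by a compiler option. This is only an analogy with hypothetical function names, not XL Fortran code: a static local keeps its value between calls (the behavior older Fortran codes often rely on, and what -qsave provides), while an automatic local starts fresh on every call.

```c
/* Local variable with static storage: the value survives between
 * calls, like a Fortran local under -qsave. */
int count_calls_static(void)
{
    static int count = 0;
    count = count + 1;
    return count;
}

/* Local variable with automatic storage: it is re-created on every
 * call, like a Fortran local under -qnosave. */
int count_calls_automatic(void)
{
    int count = 0;
    count = count + 1;
    return count;
}
```

If a ported Fortran program gives different answers with xlf90 than with xlf, an unintended dependence on saved local variables is one thing to check.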
Detecting Programming Errors
- -g
- Includes debugger information in the object files. -g works
with all levels of optimization!
- -C or -qcheck
- array bounds checking (will slow execution--use only for debugging)
- -qflttrap=enable:zerodivide:invalid:overflow -qsigtrap -g
- detect floating point exceptions (will slow execution--use only for
debugging)
- -qsigtrap
- Allows you to get a traceback of your program when an error occurs.
Use this option with -g, -C, and -qflttrap.
By default, you just get a core dump when an error occurs.
Available exception handlers are:
- -qsigtrap=xl__ieee
- Produces a traceback and an explanation of the signal and continues execution
by supplying the default IEEE result for the failed computation.
- -qsigtrap=xl__trce
- Produces a traceback and stops the program.
- -qsigtrap=xl__trcedump
- Produces a traceback and a core file and stops the program.
- -qinitauto
- initializes automatic variables to zero
- -qextchk
- Checks for mismatched procedure interfaces and common blocks. This
option cannot be used with MPI because MPI relies on weak type checking and
mismatched procedure interfaces.
- -qheapdebug
- Enables debug versions of memory management functions like malloc()
and free() in C or C++ code (will slow execution--use only for debugging).
- -u
- equivalent to having IMPLICIT NONE throughout the code (Fortran)
Larger Memory
- -bmaxdata:0x80000000
- enlarges memory limit of 32-bit programs from 256 MBytes to 2 GBytes.
Use this option only for 32-bit programs. (linker option)
- -q64 -qwarn64
- 64-bit executable (if program needs to address more than 2 Gbytes of
memory)
Optimization
- -qnoopt
- performs no optimization (or -O0 for Fortran only)
- -O or -O2
- basic optimization
- -O3
- performs the -O level optimizations and performs additional
optimizations that are memory or compile time intensive.
- -O4
- aggressively optimizes the source program, trading off additional compile
time for potential improvements in the generated code. This option implies
the following options: -qarch=auto -qtune=auto -qcache=auto -qhot -qipa
- -O5
- same as -O4, but also implies the -qipa=level=2
option.
- -qstrict
- ensures that optimizations done by the -O3 and higher optimization
options do not alter the semantics of a program.
- -qarch=auto -qtune=auto -qcache=auto
- optimize for local machine (only needed with -O3 and below)
- -qhot
- determines whether or not to perform high-order transformations on
loops during optimization (implied with -O4 and -O5).
- -qipa
- inter-procedural analysis to perform optimizations across procedures
(implied with -O4 and -O5). See the compiler man
pages for more details on ipa options and syntax.
- -Q
- compiler decides what functions to inline
- -Q+func1:func2
- only inline specified functions
OpenMP and Automatic Parallelization
- -qsmp=omp
- OpenMP program (similar to -mp on Origin2000). You must
also use the thread-safe compiler command (ending with _r) when compiling
OpenMP programs.
- -qsmp
- automatic parallelization (similar to -apo on Origin2000)
- -qsmp -qreport=smplist
- automatic parallelization with a detailed listing (similar to -apo
list on Origin2000)
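As an illustration of the kind of code -qsmp=omp applies to, here is a minimal C sketch of an OpenMP loop with a reduction. The function name is hypothetical. If the file is compiled without -qsmp=omp, the pragma is ignored and the loop simply runs serially, so the result is the same either way.

```c
/* Sum of squares 1*1 + 2*2 + ... + n*n. With OpenMP enabled, the
 * loop iterations are divided among threads and the per-thread
 * partial sums are combined by the reduction clause. */
double sum_of_squares(int n)
{
    double sum = 0.0;
    int i;

#pragma omp parallel for reduction(+:sum)
    for (i = 1; i <= n; i++) {
        sum += (double)i * (double)i;
    }
    return sum;
}
```

Assuming this function is placed in testprog.c together with a main(), it would be built with the thread-safe command, for example xlc_r -qsmp=omp -o testprog testprog.c. The number of threads is normally taken from the OMP_NUM_THREADS environment variable.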
Other Options
- -qsuffix=f=f90
- compile source files with .f90 suffix. (not needed if f90 command is
used)
- -qipa=partition=large
- use this if you get messages about partition sizes with -O4
or -qipa
- -V
- display verbose information of each compiler component that is executed
in shell-executable format.
1.5 Sample Compile Commands
| Program Type | Example Compile Command |
| Fortran 77 serial program | xlf -o testprog testprog.f |
| Fortran 90 with automatic parallelization | xlf90_r -qsmp -o testprog testprog.f |
| Fortran 90 OpenMP | xlf90_r -qsmp=omp -o testprog testprog.f |
| Fortran 95 32-bit MPI | mpxlf95 -o testprog testprog.f |
| Fortran 77 64-bit MPI | mpxlf_r -q64 -qwarn64 -o testprog testprog.f |
| C serial program | xlc -o testprog testprog.c |
| C++ with OpenMP | xlC_r -qsmp=omp -o testprog testprog.C |
| C with 32-bit MPI | mpcc -o testprog testprog.c |
| C with 64-bit MPI | mpcc_r -q64 -qwarn64 -o testprog testprog.c |
| C++ with MPI/OpenMP Hybrid | mpCC_r -qsmp=omp -o testprog testprog.C |
1.6 Using cpp in Fortran Programs
If the source file ends with .F, it is automatically preprocessed
by cpp before it is compiled. If you want to use the cpp processor with
source files that do not end with .F, use the following compiler
option to specify the filename suffix:
- -qsuffix=cpp=<suffix>
- Where <suffix> represents the new preprocessed-source-file-suffix.
For example, to preprocess source files that end with .F90:
xlf95 -qsuffix=cpp=F90 prog.F90
To pass options to cpp, use -WF,<option>. For example,
to define name AIXV4:
xlf90 -WF,-DAIXV4 conditional.F
1.7 Calling Fortran routines from C/C++
When calling your own Fortran routines from C/C++, you should not
add an underscore (_) after the name.
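The C side of such a call might look like the following minimal sketch. The routine scale_vec is hypothetical, and the sketch assumes default Fortran sizes (4-byte INTEGER, REAL*8 matching a C double). Because Fortran passes arguments by reference, the C caller passes pointers. A stand-in definition written in C is included only so that the sketch links on its own; it would be removed when linking against the real Fortran object file.

```c
/* Hypothetical Fortran routine, assumed to be declared as:
 *       subroutine scale_vec(n, a, x)
 *       integer n
 *       real*8 a, x(n)
 * Its C name carries no trailing underscore on this system.
 * All arguments are passed by reference, so C passes pointers. */
void scale_vec(int *n, double *a, double *x);

/* C-side convenience wrapper that hides the by-reference calls. */
void scale_in_place(double *x, int n, double a)
{
    scale_vec(&n, &a, x);
}

/* Stand-in written in C so this sketch links without a Fortran
 * object file. Delete it when linking the real routine. */
void scale_vec(int *n, double *a, double *x)
{
    int i;
    for (i = 0; i < *n; i++) {
        x[i] *= *a;
    }
}
```

If the Fortran code is compiled with -qintsize=8 or -qrealsize=8, the C prototype has to change to matching 8-byte types.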
The following common Fortran routines are available for use from C/C++,
but you must add an underscore (_) after the name:
| alarm_ | dtime_ | flush_ | itime_ | time_ |
| clock_ | etime_ | gmtime_ | ltime_ | usleep_ |
| ctime_ | fdate_ | idate_ | sleep_ | |
A complete list of routines is in the
XL Fortran Language Reference Manual.
If modifying the source code is too difficult, you can use the linker
option -brename to rename the routines:
- -brename:Symbol,NewName
- Renames the external symbol Symbol to NewName.
In effect, it is as if all definitions and references to Symbol in all object
files were renamed to NewName before the files were processed.
Examples:
- If your code calls dtime and flush, you can add
the following to rename those calls to the AIX equivalent:
-brename:.dtime,.dtime_ -brename:.flush,.flush_
- For your own Fortran routine myfortran that has been called
in your code with the underscore after the name:
-brename:.myfortran_,.myfortran
Note: The period "." before each symbol name is required.
You can also write wrapper routines as an alternative to modifying your
code. For example:
      subroutine flush(unit)
      integer(4) unit
      call flush_(unit)
      end
2. Libraries and Application Software
2.1 Engineering Scientific Subroutine Library (ESSL)
IBM's Engineering Scientific Subroutine Library (ESSL) is a collection
of subroutines providing a wide range of mathematical functions for many
different scientific and engineering applications. It includes Basic Linear
Algebra Subprograms (BLAS) level 1, 2, and 3, and a subset of the LAPACK
library.
The mathematical subroutines are in nine computational areas:
- Linear Algebra Subprograms
- Matrix Operations
- Linear Algebraic Equations
- Eigensystem Analysis
- Fourier Transforms, Convolutions and Correlations, and Related Computations
- Sorting and Searching
- Interpolation
- Numerical Quadrature
- Random Number Generation
The following table lists the linker options to use for various program
types:
| Program Type | Link with... |
| serial program | -lessl |
| OpenMP or POSIX Threads | -lesslsmp |
| 32-bit MPI | -lpessl -lessl -lblacs |
| 64-bit MPI | -lpesslsmp -lesslsmp -lblacssmp |
Since ESSL does not include all the routines in LAPACK, you may have
to link to the complete version of LAPACK locally
installed by NCSA to get any missing routines. Add the following
to the end of your link options:
For 32-bit programs: -L/usr/local/apps/math/lapack/LAPACK -llapack-SP4_32
For 64-bit programs: -L/usr/local/apps/math/lapack/LAPACK -llapack-SP4_64
2.2 Mathematical Acceleration SubSystem Library (MASS)
IBM's Mathematical Acceleration SubSystem Library (MASS) is a set of
subroutines for computation of mathematical functions that may provide
improved performance for certain FORTRAN and C intrinsic functions over
those in the conventional IBM libraries.
To link with MASS from a Fortran program, use linker option:
-lmass
To link with MASS from a C or C++ program, use linker option:
-lmass -lm
If parts of your code can be vectorized and you are willing to make some
changes to the program, please read the section of the MASS documentation
on libmassv, the more efficient vector version of the MASS library.
2.3 Locally Installed Software
NCSA has a variety of application software available in different areas
of concentration, including chemistry, computational fluid dynamics, mathematics,
solid mechanics, and visualization. An online list is maintained in the
Software Repository. The list is organized by system and by area of science. If
you have any questions about available software, contact the appropriate
software coordinator or the staff in Consulting Services.
The /usr/apps directory contains third-party applications, with
most located in subdirectories according to application area.
2.4 Hierarchical Data Format (HDF) Library
HDF is a library and platform-independent data format for the storage
and exchange of scientific data. It includes Fortran and C calling interfaces,
and utilities for analyzing and converting HDF data files.
There are two HDF formats,
HDF (4.x and previous releases) and HDF5. These formats are completely different
and NOT compatible.
HDF/HDF5 is available on NCSA HPC systems via the
Coordinated TeraGrid Software and Services (CTSS). Use
SoftEnv for information on accessing the software.
Information on support is available at the HDF Support Issues page.
The HDF Home Page has detailed
information on HDF and HDF5, including documentation, tutorials, examples,
and FAQs.
3. Notes
- MPI C++ bindings
To use the MPI C++ bindings, you must:
- define _MPI_CPP_BINDINGS
- compile with mpCC_r, since some types used in the C++ bindings
for MPI are only defined in thread-safe mode.
- Endian Conversion
The p690 and the Origin2000 machines are both big-endian, but the
Intel clusters are little-endian. If you move unformatted data
files between the p690 and the clusters, you will need to convert them. Write
a small program on the clusters, following the instructions in the cluster
documentation, to do the conversion.
- Transitioning from the NCSA Origin2000
If you are porting your code from the NCSA Origin2000 system, the
document Transition from the Origin2000 may be helpful.
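For the Endian Conversion note above, the core of such a conversion program is a byte swap of each data item. The following is a minimal C sketch with hypothetical function names, not the procedure from the cluster documentation: it reverses the byte order of 4-byte values (for example REAL*4 or default INTEGER) and 8-byte values (for example REAL*8) held as raw bits. Fortran unformatted files typically also contain record-length markers that need the same treatment, so check the file layout before relying on this.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 4-byte value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

/* Reverse the byte order of one 8-byte value by swapping each
 * 4-byte half and exchanging the two halves. */
uint64_t swap64(uint64_t v)
{
    uint32_t low  = (uint32_t)(v & 0xFFFFFFFFu);
    uint32_t high = (uint32_t)(v >> 32);
    return ((uint64_t)swap32(low) << 32) | (uint64_t)swap32(high);
}

/* Swap every element of a buffer of 4-byte values in place, e.g.
 * after reading raw bytes from a file written on the other system. */
void swap32_array(uint32_t *buf, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        buf[i] = swap32(buf[i]);
    }
}
```

Swapping is its own inverse, so the same routines convert in either direction.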
4. References
The latest compiler documentation for AIX is at the IBM AIX
compiler information center.
IBM documentation is provided locally at NCSA at
http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/IBM/.