TotalView Debugger

TotalView Webinars
Debugging Threads with TotalView
3:00 p.m. EDT on Tuesday, September 25, 2007

Introduction to TotalView
12:00 p.m. EDT on Friday, October 05, 2007

The webinar is a live, interactive web demonstration of roughly one hour, designed to help people new to TotalView become more comfortable with its features and benefits. We understand that you have specific things you need to do with a debugger; let us help you get started using TotalView.

You can register at: http://www.totalviewtech.com/Try/webinar_enroll.php

Table of Contents
  1. Overview
  2. TotalView on NCSA Linux Clusters
  3. TotalView on NCSA Shared Memory Machines
  4. General TotalView usage
  5. Using the command line interface (CLI)
  6. Enabling Memory Debugging

1. Overview

TotalView, from Etnus, is a full-featured, source-level, graphical debugger based on the X Window System. It supports C, C++, Fortran (77 and 90), assembler, and mixed source/assembler codes, as well as MPI, PVM, and HPF.

Information on TotalView is available in the release notes and user guide at the Etnus Online Documentation page, as well as in the /usr/apps/tools/toolworks/totalview/doc directory. Also see "man totalview" for command syntax and options.

Note: In order to use TotalView, you must be using a terminal or workstation capable of displaying X Windows. See Using the X Window System for more information.

2. TotalView on NCSA Linux Clusters

TotalView is available on NCSA's Linux Clusters. There are 4 TotalView licenses for jobs up to 8 processes, and 1 license for jobs up to 128 processes. We do not currently have a way to guarantee you will get a license when your job starts if you run in batch.

GNU and Intel compilers are supported.

Important: For both compilers you need to compile and link your code with -g to enable source code listing within TotalView.

2.1 Tungsten

Before you begin
  1. Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel compilers.
  2. Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
  3. Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X Window System and/or Running from an interactive batch session.
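
The checks in items 2 and 3 above can be sketched as a small pre-flight function. This is only an illustration for a Bourne-style shell (sh/bash): tv_preflight is a hypothetical helper name, not an NCSA or TotalView tool, and it assumes that a successful resoft puts a totalview command on your PATH.

```shell
#!/bin/sh
# Pre-flight check before launching TotalView (illustrative sketch).
# tv_preflight is a hypothetical helper name, not an NCSA or TotalView tool.
tv_preflight() {
    rc=0
    # TotalView is an X application, so DISPLAY must be set.
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set"
        rc=1
    fi
    # Assumption: after adding +totalview to .soft and running resoft,
    # a totalview command is found on PATH.
    if ! command -v totalview >/dev/null 2>&1; then
        echo "totalview not found in PATH"
        rc=1
    fi
    return $rc
}

# Possible use:
#   tv_preflight && totalview ./program.exe
```

The function only reports problems; it does not try to set DISPLAY or modify your .soft file.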

Serial Code Debugging

If the memory and CPU time requirements of your code fit within the limits of a shell on the front-end host (tun[abcde]), you can run TotalView directly from the host. If not, you will need to run on a compute node via an interactive batch session.

From a Tungsten front-end host you start the TotalView debugging session with the following command:

% totalview ./program.exe [program args]

If you do not see the TotalView control window, you may need to consult the Using the X Window System page.

MPI Debugging with CMPI

First, start an interactive batch session with the number of nodes/processes needed for debugging your application.

Once you are ready to debug, ChaMPIonPro provides a TotalView switch to the cmpirun script:

% cmpirun -lsf -tv ./program.exe [program args]

If you are not running from within the interactive batch shell:

% cmpirun -np $NPROCS -machinefile $LSB_NODEFILE -tv ./program.exe [program args]

where $NPROCS and $LSB_NODEFILE are defined in the interactive batch shell.

MPI Debugging MPICH-GM

First, start an interactive batch session with the number of nodes/processes needed for debugging your application.

Once you are ready to debug, MPICH-GM provides a TotalView switch to the mpirun script.

% mpirun -np $NPROCS -machinefile $LSB_NODEFILE -tv ./program.exe [program args]

or you may try

% mpirun.ch_gm DISPLAY=$DISPLAY -np $NPROCS -machinefile $LSB_NODEFILE -totalview ./program.exe [program args]

Running from an LSF Interactive batch session

For an interactive batch session you need to specify the number of compute nodes and the amount of wall clock time you will need them. The example below asks for 2 compute nodes with 2 processes per node for 30 minutes.

% bsub -Is -n4 -W00:30 $SHELL

When the session begins, it will start up the shell specified by $SHELL.

There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but because the LSF batch system does not use ssh to put the user on the compute node, tunneling requires a separate ssh login to the launch host, as described below.

Setting Environment Variables: This is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.

X11 tunneling with ssh: First, determine the name of the launch host for the interactive batch session above by typing hostname in the window that has the interactive batch session in progress. You should also echo the environment variables LSB_NODEFILE and NPROCS, which will be used below.

From another Tungsten login session, log on to the launch host via ssh with tunneling enabled.

% ssh -X launch_host

Once connected to the launch host, change to the directory from where you run your application. Next, set the environment variables LSB_NODEFILE and NPROCS to those reported from the interactive batch session.
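
One way to carry LSB_NODEFILE and NPROCS from the interactive batch shell to the ssh session, rather than retyping them, is sketched below. It is written for a Bourne-style shell (sh/bash); tcsh users would write setenv lines instead. The helper name save_batch_env and the file name are hypothetical, and the sketch assumes your home directory is visible on the launch host.

```shell
#!/bin/sh
# Sketch: carry LSB_NODEFILE and NPROCS from the interactive batch shell
# to the ssh -X session. save_batch_env and the file name are hypothetical.

# Run in the interactive batch shell: write the two variables as export lines.
save_batch_env() {
    out_file=$1
    {
        printf 'export LSB_NODEFILE="%s"\n' "$LSB_NODEFILE"
        printf 'export NPROCS="%s"\n' "$NPROCS"
    } > "$out_file"
}

# In the interactive batch shell:
#   save_batch_env "$HOME/tv_batch_env.sh"
# In the ssh -X session on the launch host:
#   . "$HOME/tv_batch_env.sh"
```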

2.2 TG-NCSA (Mercury)

Issue with GPFS and the 2.4 kernel (2008)

After all filesystems were moved to GPFS in late 2007, an issue appeared when using TotalView to debug MPI applications whose executable resides on a GPFS filesystem and/or whose dynamically linked MPICH-GM libraries reside on a GPFS filesystem.

Additional steps have been added below to work around this issue by statically linking MPICH-GM libraries and by placing the application executable and input data on the node local scratch for each node in the debug session.

Before you begin
  1. Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel compilers. If you are using MPICH-GM and the mpicc, mpicxx, mpif77 and mpif90 compiler scripts, you need to set the environment variable MPICH_USE_SHLIB to no before linking. This can be done in your .soft file by adding the line:
    MPICH_USE_SHLIB=no
    or if you are NOT using the compiler scripts and prefer to add the libraries needed for linking:
    f77/f90: -L $MPICH_GM_HOME/lib -lmpich -lpmpich -L $GM_HOME/lib -lgm -lpthread
    c: -L $MPICH_GM_HOME/lib -lmpich -lpmpich -L $GM_HOME/lib -lgm -lpthread
    c++: -L $MPICH_GM_HOME/lib -lpmpich++ -lmpich -lpmpich -L $GM_HOME/lib -lgm -lpthread

    Doing ldd executablename should not show any MPICH-GM libraries (GM libs are ok).
  2. Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
  3. Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X Window System and/or Running from an interactive batch session.
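
The ldd check mentioned in item 1 can be scripted. The sketch below, for a Bourne-style shell, reads ldd output on standard input and succeeds if any MPICH library is still listed. The helper name has_dynamic_mpich is hypothetical, and the pattern assumes the shared libraries carry the same names as the -l flags above (libmpich, libpmpich, libpmpich++).

```shell
#!/bin/sh
# Sketch: report whether ldd output (read from stdin) lists an MPICH library.
# has_dynamic_mpich is a hypothetical helper name.
has_dynamic_mpich() {
    # Matches libmpich, libpmpich and libpmpich++; libgm does not match.
    grep -q -E 'libp?mpich'
}

# Possible use on the cluster:
#   if ldd ./program.exe | has_dynamic_mpich; then
#       echo "MPICH-GM is still dynamically linked; relink with MPICH_USE_SHLIB=no"
#   fi
```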

Serial Code Debugging

If the memory requirements of your code fit within the limits of a shell on the front-end host (tg-login), you can run TotalView directly. If not, you will need to run on a compute node via an interactive batch session.

From the tg-login front-end host you start the TotalView debugging session with the following command

% totalview ./program.exe [program args]

If you do not see the TotalView process manager window, you should first consult the Using the X Window System page.

MPI Debugging MPICH-GM

First, start an interactive batch session with the number of nodes/processes needed for debugging your application. This will put you onto the launch host for the job.

As mentioned above, your application executable needs to be placed on a non-GPFS filesystem which in this case is the local scratch disk on each node (/scr). To efficiently copy the executable (and any other input files needed by the executable), use pbsdsh as follows from within the launch host of the interactive job:

pbsdsh -u cp $HOME/path/to/my/executable /scr/$PBS_JOBID
pbsdsh -u cp $HOME/path/to/my/inputdata /scr/$PBS_JOBID
cd /scr/$PBS_JOBID
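
If several files need to be staged, the copy commands above can be generated in a loop. The following is a dry-run sketch for a Bourne-style shell: stage_cmds is a hypothetical helper that only prints one pbsdsh command per file, so you can review the list before running it. It assumes PBS_JOBID is set, as it is inside the interactive job, and that the file names contain no spaces.

```shell
#!/bin/sh
# Sketch: print the pbsdsh copy command for each file to stage.
# stage_cmds is a hypothetical helper; it only prints, it does not copy.
stage_cmds() {
    for f in "$@"; do
        echo "pbsdsh -u cp $f /scr/$PBS_JOBID"
    done
}

# Possible use from the launch host of the interactive job:
#   stage_cmds $HOME/path/to/my/executable $HOME/path/to/my/inputdata
# and, once the printed commands look right:
#   stage_cmds $HOME/path/to/my/executable $HOME/path/to/my/inputdata | sh
```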

If you are not using ssh X11 tunneling but are instead setting DISPLAY to point directly at your local X server, you can skip down to the mpirun command below. Otherwise, do the following steps.

cp $PBS_NODEFILE machinefile
hostname

and from another login session on tg-login.ncsa.teragrid.org:

ssh -X <launch hostname>
cd /scr/<PBS job ID from above>

and then start mpirun as below with $PBS_NODEFILE replaced with machinefile as needed.

Once you are ready to debug, MPICH-GM provides a TotalView switch to the mpirun script.

% mpirun -np XX -machinefile $PBS_NODEFILE -tv ./program.exe [program args]

where XX is the number of processes needed.

MPI Debugging MPICH-VMI

First, start an interactive batch session with the number of nodes/processes needed for debugging your application.

Follow the steps in the MPICH-GM description above to copy your executable and input files to local scratch (/scr) and then proceed with the following steps.

Once you are ready to debug, MPICH-VMI provides a TotalView switch to the mpirun script that enables the 'attach to a paused process' method. This differs from the other methods of using TotalView with an MPI application but is just as valid.

% mpirun -np XX -machinefile $PBS_NODEFILE -debugger totalview ./program.exe [program args]

where XX is the number of processes needed. mpirun will then report that the process is waiting for 300 seconds and also provide the PID of the application process that TotalView should attach to.

In another window, ssh (with -X) to the host that you ran mpirun on. Then change directory to the location where you ran your application. Finally:

% setenv LD_LIBRARY_PATH /opt/vmi-2.0.1-1-gcc/lib:$LD_LIBRARY_PATH
% totalview -pid PID ./program.exe

and TotalView will start up, attach to the process given by PID, and read the symbol information from ./program.exe.

Running from a PBS Interactive batch session

For an interactive batch session you need to specify the number of compute nodes and the amount of wall clock time you will need them. The example below asks for 2 compute nodes with 2 processes per node for 30 minutes.

% qsub -I -V -lwalltime=00:30:00 -lnodes=2:ppn=2
When the session begins, it will start up a shell on the launch node.
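
For other node counts or time limits, the qsub line above can be assembled from three numbers. The sketch below is for a Bourne-style shell; qsub_interactive_cmd is a hypothetical helper that only prints the command, using the same options as the example above. Minutes should be given as a plain decimal number.

```shell
#!/bin/sh
# Sketch: build the interactive qsub command line from nodes, processes per
# node, and minutes. qsub_interactive_cmd is a hypothetical helper name.
qsub_interactive_cmd() {
    nodes=$1
    ppn=$2
    minutes=$3
    # Convert minutes to the HH:MM:SS form used by -lwalltime.
    walltime=$(printf '%02d:%02d:00' $((minutes / 60)) $((minutes % 60)))
    echo "qsub -I -V -lwalltime=${walltime} -lnodes=${nodes}:ppn=${ppn}"
}

# Reproduces the example above (2 nodes, 2 processes per node, 30 minutes):
#   qsub_interactive_cmd 2 2 30
```

The number of processes later passed to mpirun as XX is nodes times processes per node, 4 in this example.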

There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but because the PBS batch system does not use ssh to put the user on the compute node, tunneling requires a separate ssh login to the launch host, as described below.

Setting Environment Variables directly is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.

The other option is to use X11 tunneling with ssh. Find the name of the launch host for the interactive batch session above by typing hostname in the window that has the interactive batch session in progress. You should also echo the environment variable $PBS_NODEFILE, which will be used below.

From another TG-NCSA login session, log on to the launch host via ssh with tunneling enabled.

% ssh -X launch_host

Once connected to the launch host, change to the directory from where you run your application. Next, set the environment variable PBS_NODEFILE to the value reported from the interactive batch session.

2.3 Abe (Intel 64 Linux Cluster)

TotalView will be supported on Abe. This support is currently in progress.

Before you begin
  1. Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel compilers.
  2. Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
  3. Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X Window System and/or Running from an interactive batch session.

Serial Code Debugging

If the memory requirements of your code fit within the limits of a shell on a front-end host (one of the honest nodes), you can run TotalView directly. If not, you will need to run on a compute node via an interactive batch session.

From an Abe front-end host you start the TotalView debugging session with the following command:

% totalview ./program.exe [program args]

If you do not see the TotalView process manager window, you should first consult the Using the X Window System page.

MPI Debugging MVAPICH2

First, add +mvapich2-0.9.8p2patched-intel-ofed-1.2-dbg to your ~/.soft file, resoft and relink your application.

Next, start an interactive batch session with the number of nodes/processes needed for debugging your application.

Start up the mpd processes in this session as you would do for a batch job. See the sample batch job file for MVAPICH2.

From another terminal session connected to the PBS launch host:

% mpiexec -tv -machinefile ${PBS_NODEFILE} -n XX ./program.exe [program args]

where XX is the number of processes needed. mpiexec should connect to the mpd console of the launch host.

When done debugging, issue the command mpdexitall.

MPI Debugging with Open MPI

First, add +openmpi-1.2.4-intel or +openmpi-1.2.4-gcc to your ~/.soft file, resoft and build your application if you are not already using Open MPI.

NOTE: Versions of Open MPI prior to 1.2.4 will not work with TotalView 8.

Next, start an interactive batch session with the number of nodes/processes needed for debugging your application.

From another terminal session connected to the PBS launch host:

% mpirun -tv -np XX -machinefile ${PBS_NODEFILE} ./program.exe [program args]

where XX is the number of processes needed.

For more information on using TotalView with Open MPI, please see the discussion here.

MPI Debugging MPICH-VMI

First, start an interactive batch session with the number of nodes/processes needed for debugging your application.

Once you are ready to debug, MPICH-VMI provides a TotalView switch to the mpirun script that enables the 'attach to a paused process' method. This differs from the other methods of using TotalView with an MPI application but is just as valid.

% mpirun -np XX -machinefile $PBS_NODEFILE -debugger totalview ./program.exe [program args]

where XX is the number of processes needed. VMI will then report that the process is waiting for 300 seconds and also provide the host and PID of the application process that TotalView should attach to, for example:

Connect to TotalView on Host: 10.1.68.172 PID: 12673. Waiting for 300 Seconds

In another window, ssh -X (with tunneling) to the host IP reported by VMI. Next, change directory to the location where you ran your application. Finally, start totalview with the PID and program name:

% totalview -pid PID ./program.exe

and TotalView will start up, attach to the process given by PID, and read the symbol information from ./program.exe.

You should now be able to use TotalView.
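
Because the attach window is limited to 300 seconds, it can help to extract the host and PID from the VMI message mechanically. The sketch below, for a Bourne-style shell, assumes the message has exactly the "Connect to TotalView on Host: ... PID: ..." form reported by VMI; parse_vmi_line is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: pull the host and PID out of the VMI message. parse_vmi_line is a
# hypothetical helper; the message format is assumed to match the line
# "Connect to TotalView on Host: <host> PID: <pid>. Waiting for 300 Seconds".
parse_vmi_line() {
    # Reads the message on stdin and prints "HOST PID"; other lines print nothing.
    sed -n 's/.*Host: *\([^ ]*\) *PID: *\([0-9][0-9]*\).*/\1 \2/p'
}

# Possible use, with the message pasted in by hand:
#   echo "Connect to TotalView on Host: 10.1.68.172 PID: 12673. Waiting for 300 Seconds" | parse_vmi_line
# The two fields are then used as:
#   ssh -X HOST
#   totalview -pid PID ./program.exe
```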

Running from a PBS Interactive batch session

For an interactive batch session you need to specify the number of compute nodes and the amount of wall clock time you will need them. The example below asks for 2 compute nodes with 2 processes per node for 30 minutes.

% qsub -I -V -lwalltime=00:30:00 -lnodes=2:ppn=2
When the session begins, it will start up a shell on the launch node.

There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but because the PBS batch system does not use ssh to put the user on the compute node, tunneling requires a separate ssh login to the launch host, as described below.

Setting Environment Variables directly is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.

The other option is to use X11 tunneling with ssh. Find the name of the launch host for the interactive batch session above by typing hostname in the window that has the interactive batch session in progress.

From another Abe login session (on an honest node), log on to the launch host via ssh with tunneling enabled.

% ssh -X launch_host

If the MPD ring has been started in the PBS session, you can start your mpiexec session as above.

3. TotalView on NCSA Shared Memory Machines

TotalView is available on NCSA's shared memory machines.

3.1 Copper (IBM p690)

There are 4 TotalView licenses for jobs up to 32 processes. We do not currently have a way to guarantee you will get a license when your job starts if you run in batch.

Before you begin

    1. Compile your code with the -g -qfullpath compiler flags to enable symbolic debugging. If you have unnamed Fortran COMMON blocks, you will need to use optimization level 0, -O0, to obtain variable information within the COMMON block.
    2. Add TotalView to your environment by either:
      (csh/tcsh): source /usr/local/apps/tools/toolworks/tvvars.csh
      (sh/ksh)  : . /usr/local/apps/tools/toolworks/tvvars.sh
    3. If the memory, CPU time and process count requirements of your code fit within the shell limits on the front-end host (Cu12), you can run TotalView directly. If not, you will need to run via an interactive LoadLeveler session.
    4. Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X Window System.
Serial and OpenMP Debugging
% totalview ./program.exe [ program args ]

MPI Debugging with POE

% totalview poe -a ./program.exe [ program args ] [ PE_arguments ]
For example, here is how you would run a code called xhpl with 4 processors:
% totalview poe -a ./xhpl [ arguments ] -procs 4

Running from a LoadLeveler Interactive batch session

Under construction

3.2 Cobalt (SGI Altix)

There are 32 TotalView licenses for jobs up to 32 processes. We do not currently have a way to guarantee you will get a license when your job starts if you run in batch.

Before you begin
  1. Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel 8.X compilers.
  2. Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
Serial and OpenMP Debugging

If the memory requirements of your code fit within the limits of a shell on the front-end machine (cobalt), you can run TotalView directly. If not, you will need to run via an interactive batch session.

On the interactive host co-login1, you start the TotalView debugging session with the following command
% totalview ./program.exe [ program args ]
MPI Debugging with MPT

% totalview /usr/bin/mpirun -a [ mpirun arguments ] ./program.exe  [ program args ] 
For example, here is how you would run a code called xhpl with 4 processors:
% totalview /usr/bin/mpirun -a -np 4 ./xhpl

Running from a PBS Interactive batch session

For an interactive batch session you need to specify the number of cpus, the wall clock time and memory you will need. The example below asks for 4 cpus for 30 minutes and 2gb of memory:

% qsub -I -V -lwalltime=00:30:00 -lncpus=4 -lmem=2gb
When the session begins, it will start up a shell on the launch node.

There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but the PBS batch system does not use ssh to put the user on the compute node.

Setting Environment Variables. This is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.

X11 tunneling. If you specify the debug queue via the -qdebug option to qsub, your interactive batch job will be run on the login host. Since the DISPLAY variable is set correctly by specifying -V, the job is ready for TotalView debugging.

4. General TotalView usage

Serial and OpenMP Debugging

As TotalView starts up, you will see two windows appear: the Control window and the Process window. In the Process window you can insert breakpoints and so on, then click on the GO button. Happy debugging.

MPI Debugging

As TotalView starts up, you will see two windows appear: the Control window and the Process window. In the Process window, click on the GO button and then, when prompted by the window "Process XXX is a parallel job. Do you want to stop the job now ?", click "Yes". You will arrive at the MPI_Init() breakpoint as shown here for a code using SGI's MPT. You are now ready to debug in parallel.

Note: If you are debugging a code using MPICH-GM on Tungsten, you will want to insert a breakpoint at some point after the call to MPI_Init(), as the built-in breakpoint for MPI_Init() does not appear to be fully functional.

Some comments from Etnus about breakpoints and MPI_Init:
"Be very cautious in placing breakpoints at or before a line that calls MPI_Init() or MPL_Init() because timeouts can occur while your program is being initialized. After you allow the parallel processes to proceed into the MPI_Init() or MPL_Init() call, allow all of the parallel processes to proceed through it within a short time."

"Timeouts can occur if you place breakpoints that stop other processes too soon after calling MPI_Init() or MPL_Init(). If you create "stop all" breakpoints, the first process that gets to the breakpoint stops all the other parallel processes that have not yet arrived at the breakpoint. This can cause a timeout."

More on Breakpoints

To have every process run until it reaches the same action point (breakpoint), instead of stopping the whole group of processes as soon as the current process hits it, go to File -> Preferences -> Action Points and select "When breakpoint hit, stop: Process" rather than Group. You can also set this preference on an individual basis by opening the Properties dialog for a breakpoint (right-click on the action point and select Properties).

5. Using the command line interface (CLI)

Using the TotalView command line interface with SGI MPT applications

Put TotalView in your environment:

soft add +totalview

Launch TotalView using the CLI

totalviewcli /usr/bin/mpirun -a -np 4 ./mpihw

For more information on using the CLI, consult the following Etnus pages

6. Enabling Memory Debugging

For all platforms, be sure to add +totalview to your ${HOME}/.soft file and issue the resoft command. Add the following to your linking step, and then see the last section, Making sure Memory Debugging is enabled, for how to check that Memory Debugging is enabled.

Copper (rs6000)

First try
setenv LIBPATH ${TOTALVIEW_HOME}/rs6000/lib/tvheap_mr:${TOTALVIEW_HOME}/rs6000/lib:${LIBPATH}
and then launch TotalView as usual. If the above does not work, you need to relink your application as follows:

32-bit compiling (-q32):

-L ${TOTALVIEW_HOME}/rs6000/lib/tvheap_mr -L ${TOTALVIEW_HOME}/rs6000/lib ${TOTALVIEW_HOME}/rs6000/lib/aix_malloctype.o

64-bit compiling (-q64):

-L ${TOTALVIEW_HOME}/rs6000/lib/tvheap_mr -L ${TOTALVIEW_HOME}/rs6000/lib ${TOTALVIEW_HOME}/rs6000/lib/aix_malloctype64_5.o

If you tire of seeing the TotalView reminder about using the Memory Debugger, add the following to ${HOME}/.tvdrc: dset TV::MEMDEBUG::hia_allow_ibm_poe equal true

Tungsten (linux-x86)

Relinking is recommended:
-L${TOTALVIEW_HOME}/linux-x86/lib -ltvheap -Wl,-rpath,${TOTALVIEW_HOME}/linux-x86/lib

Mercury (linux-ia64)

If you are using MPICH-GM, you need to build and link against a version of MPICH-GM built with disable-register for the ch_gm driver by adding: +mpich-gm-1.2.6..14b-intel90-tvdebug to your $HOME/.soft and resoft. Use the MPI compiler utilities mpicc, mpif77, for convenience.

Relinking is recommended:

-L${TOTALVIEW_HOME}/linux-ia64/lib -ltvheap -Wl,-rpath,${TOTALVIEW_HOME}/linux-ia64/lib

Cobalt (linux-ia64)

Relinking is recommended:

-L${TOTALVIEW_HOME}/linux-ia64/lib -ltvheap -Wl,-rpath,${TOTALVIEW_HOME}/linux-ia64/lib
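
The Linux link lines above differ only in the platform directory, so they can be produced by one small function. This is a sketch for a Bourne-style shell: tvheap_link_flags is a hypothetical helper, and it assumes TOTALVIEW_HOME is already set in your environment (for example after adding +totalview and running resoft).

```shell
#!/bin/sh
# Sketch: print the memory-debugging link flags for a Linux platform directory.
# tvheap_link_flags is a hypothetical helper; TOTALVIEW_HOME must be set.
tvheap_link_flags() {
    platform=$1   # linux-x86 (Tungsten) or linux-ia64 (Mercury, Cobalt)
    libdir="${TOTALVIEW_HOME}/${platform}/lib"
    echo "-L${libdir} -ltvheap -Wl,-rpath,${libdir}"
}

# Possible use in a link step on Mercury:
#   mpicc -O0 -g -o program.exe program.c $(tvheap_link_flags linux-ia64)
```

The Copper (rs6000) link lines follow a different pattern and are not covered by this helper.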

Making sure Memory Debugging is enabled

After launching TotalView as discussed above for each platform, but before running the application within TotalView (before clicking on Go), check whether the Memory Debugger is enabled: go to Tools > Memory Debugging and, on the Configuration tab, click the radio button labeled 'Enable memory debugging' if it is not already selected. Then click on the main TotalView window and click on Go, or insert breakpoints at the places where you want to inspect memory usage.

For more information on using the TotalView debugger click here.