TotalView Webinars
Debugging Threads with TotalView
3:00 p.m. EDT on Tuesday, September 25, 2007
Introduction to TotalView
12:00 p.m. EDT on Friday, October 05, 2007
The webinar is a roughly one-hour, live, interactive web demonstration designed to help people new to TotalView become more comfortable with its features and benefits. We understand that you have specific things you need to do with a debugger; let us help you get started using TotalView.
You can register at: http://www.totalviewtech.com/Try/webinar_enroll.php
Table of Contents
- Overview
- TotalView on NCSA Linux Clusters
- TotalView on NCSA Shared Memory Machines
- General TotalView usage
- Using the command line interface (CLI)
- Enabling Memory Debugging
Overview
TotalView, from Etnus, is a full-featured, source-level, graphical debugger based on the X Window System. It handles C, C++, Fortran (77 and 90), assembler, and mixed source/assembler codes, and it supports MPI, PVM, and HPF.
Information on TotalView is available in the release notes and user guide at the Etnus Online Documentation page, as well as in the /usr/apps/tools/toolworks/totalview/doc directory. Also see "man totalview" for command syntax and options.
Note: To use TotalView, you must be using a terminal or workstation capable of displaying X Windows. See Using the X Window System for more information.
TotalView on NCSA Linux Clusters
TotalView is available on NCSA's Linux Clusters.
There are 4 TotalView licenses for jobs up to 8 processes, and
1 license for jobs up to 128 processes. We do not currently have a way
to guarantee you will get a license when your job starts if you run in
batch.
GNU and Intel compilers are supported.
Important: For both compilers you need to compile and link your code with -g to enable source code listing within TotalView.
Before you begin
- Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel compilers.
- Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
- Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X11 Windows System and/or Running from an interactive batch session.
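The environment checks in the list above can be scripted. The following is a minimal sketch for sh-compatible shells; the function name preflight is hypothetical and is not part of TotalView or SoftEnv. It only tests that DISPLAY is non-empty and that a totalview command is on the PATH; it cannot confirm that the X display is actually reachable.

```shell
# Hypothetical helper, not part of TotalView: report obvious setup problems.
# Prints one line per problem and returns non-zero if any were found.
preflight() {
    status=0
    # X clients such as TotalView need DISPLAY to open their windows.
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set"
        status=1
    fi
    # After adding +totalview to .soft and running resoft, the totalview
    # command is expected to be on PATH.
    if ! command -v totalview >/dev/null 2>&1; then
        echo "totalview not found on PATH"
        status=1
    fi
    return "$status"
}
```

With this defined, `preflight && totalview ./program.exe` would start the debugger only when both checks pass.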
If the memory and CPU time requirements of your code fit within the limits of a shell on the front-end host (tun[abcde]), you can run TotalView directly from the host. If not, you will need to run on a compute node via an interactive batch session.
From a Tungsten front-end host you start the TotalView debugging session with the following command:
% totalview ./program.exe [program args]
If you do not see the TotalView control window, you may need to consult the Using the X Window System page.
First, start an interactive batch session with the number of nodes/processes needed for debugging your application.
Once you are ready to debug, ChaMPIonPro provides a TotalView switch to the cmpirun script:
% cmpirun -lsf -tv ./program.exe [program args]
If you are not running from within the interactive batch shell:
% cmpirun -np $NPROCS -machinefile $LSB_NODEFILE -tv ./program.exe [program args]
where $NPROCS and $LSB_NODEFILE are defined in the interactive batch shell.
First, start an interactive batch session with the number of nodes/processes needed for debugging your application.
Once you are ready to debug, MPICH-GM provides a TotalView switch to the mpirun script.
% mpirun -np $NPROCS -machinefile $LSB_NODEFILE -tv ./program.exe [program args]
or you may try
% mpirun.ch_gm DISPLAY=$DISPLAY -np $NPROCS -machinefile $LSB_NODEFILE -totalview ./program.exe [program args]
For an interactive batch session you need to specify the number of compute nodes and the amount of wall clock time for which you will need them. The example below asks for 2 compute nodes with 2 processes per node for 30 minutes.
% bsub -Is -n4 -W00:30 $SHELL
When the session begins, it will start up the shell specified by $SHELL.
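The arithmetic behind the example above can be made explicit. This is a sketch only: the helper name bsub_line is hypothetical, and it assumes, as the example suggests, that -n takes the total process count (nodes times processes per node) and that -W takes hours:minutes.

```shell
# Hypothetical helper: print the bsub command for an interactive session.
# Arguments: nodes, processes per node, wall clock minutes.
bsub_line() {
    nodes=$1
    ppn=$2
    minutes=$3
    nprocs=$((nodes * ppn))    # total processes requested with -n
    hours=$((minutes / 60))
    mins=$((minutes % 60))
    # $SHELL is printed literally so that the login shell expands it later.
    printf 'bsub -Is -n%d -W%02d:%02d $SHELL\n' "$nprocs" "$hours" "$mins"
}

bsub_line 2 2 30
# prints: bsub -Is -n4 -W00:30 $SHELL
```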
There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but the LSF batch system does not use ssh to put the user on the compute node, so the tunnel has to be opened from a separate login session.
Setting Environment Variables: This is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.
X11 tunneling with ssh: First, determine the name of the launch host for the interactive batch session above by typing hostname in the window that has the interactive batch session in progress. You should also echo the environment variables LSB_NODEFILE and NPROCS, which will be used below.
From another Tungsten login session, log on to the launch host via ssh with tunneling enabled.
% ssh -X launch_host
Once connected to the launch host, change to the directory from which you run your application. Next, set the environment variables LSB_NODEFILE and NPROCS to the values reported in the interactive batch session.
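One way to carry the two values across is to print them as ready-to-paste assignments in the interactive batch session. The sketch below assumes an sh-compatible shell (csh/tcsh users would type setenv NAME value instead); the helper name emit_exports and the sample values are hypothetical.

```shell
# Hypothetical helper: print export lines for the named variables so they
# can be pasted into the second login session.
emit_exports() {
    for name in "$@"; do
        # Look up the current value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        printf "export %s='%s'\n" "$name" "$value"
    done
}

# Made-up placeholder values standing in for the real batch session values.
LSB_NODEFILE=/tmp/example.nodefile
NPROCS=4
emit_exports LSB_NODEFILE NPROCS
```

The simple quoting used here is only adequate for values without single quotes, which is the usual case for file paths and process counts.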
Issue with GPFS and the 2.4 kernel (2008)
An issue with using TotalView to debug MPI applications where the executable resides on a GPFS filesystem
and/or the MPICH-GM libraries are dynamically linked to versions which reside on a GPFS filesystem
appeared after the transition of all filesystems to GPFS in late '07.
Additional steps have been added below to work around this issue by statically linking MPICH-GM libraries
and by placing the application executable and input data on the node local scratch for each node in the
debug session.
Before you begin
- Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel compilers. If you are using MPICH-GM and the mpicc, mpicxx, mpif77 and mpif90 compiler scripts, you need to set the environment variable MPICH_USE_SHLIB to no before linking. This can be done in your .soft file by adding the line:
MPICH_USE_SHLIB=no
or, if you are NOT using the compiler scripts and prefer to add the libraries needed for linking:
f77/f90: -L $MPICH_GM_HOME/lib -lmpich -lpmpich -L $GM_HOME/lib -lgm -lpthread
c: -L $MPICH_GM_HOME/lib -lmpich -lpmpich -L $GM_HOME/lib -lgm -lpthread
c++: -L $MPICH_GM_HOME/lib -lpmpich++ -lmpich -lpmpich -L $GM_HOME/lib -lgm -lpthread
Running ldd on the executable should not show any MPICH-GM libraries (GM libraries are OK).
- Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
- Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X11 Windows System and/or Running from an interactive batch session.
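The ldd check from the first item in the list above can be automated. This is a sketch under one assumption: that shared MPICH-GM libraries appear in ldd output with file names beginning libmpich or libpmpich. The helper name no_shared_mpich is hypothetical.

```shell
# Hypothetical helper: read ldd output on stdin and succeed only if no
# MPICH libraries are listed. GM libraries (libgm) are allowed.
no_shared_mpich() {
    # Assumption: shared MPICH-GM libraries are named libmpich* or libpmpich*.
    if grep -E 'libp?mpich' >/dev/null; then
        return 1
    fi
    return 0
}
```

A typical use would be `ldd ./program.exe | no_shared_mpich && echo "MPICH-GM is statically linked"`.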
If the memory requirements of your code fit within the limits of a shell on the front-end host (tg-login), you can run TotalView directly. If not, you will need to run on a compute node via an interactive batch session.
From the tg-login front-end host you start the TotalView debugging session with the following command:
% totalview ./program.exe [program args]
If you do not see the TotalView process manager window, you should first consult the Using the X Window System page.
First, start an interactive batch session with the number of nodes/processes needed for debugging your application. This will put you onto the launch host for the job.
As mentioned above, your application executable needs to be placed on a non-GPFS filesystem, which in this case is the local scratch disk on each node (/scr). To efficiently copy the executable (and any other input files needed by the executable), use pbsdsh as follows from within the launch host of the interactive job:
pbsdsh -u cp $HOME/path/to/my/executable /scr/$PBS_JOBID
pbsdsh -u cp $HOME/path/to/my/inputdata /scr/$PBS_JOBID
cd /scr/$PBS_JOBID
If you are not using ssh X11 tunneling but are setting DISPLAY to point directly at your local X server, you can skip down to the mpirun command below. Otherwise, do the following steps in the interactive session:
cp $PBS_NODEFILE machinefile
hostname
and from another login session on tg-login.ncsa.teragrid.org:
ssh -X launch_host
cd /scr/pbs_jobid
where launch_host is the name printed by hostname and pbs_jobid is the value of $PBS_JOBID in the interactive session. Then start mpirun as below, with $PBS_NODEFILE replaced with machinefile as needed.
Once you are ready to debug, MPICH-GM provides a TotalView switch to the mpirun script.
% mpirun -np XX -machinefile $PBS_NODEFILE -tv ./program.exe [program args]
where XX is the number of processes needed.
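If you want XX to match the size of the batch allocation, it can be derived from the machine file. The sketch below assumes the file holds one line per process slot, which is how PBS normally writes $PBS_NODEFILE; the helper name count_procs is hypothetical.

```shell
# Hypothetical helper: count the non-empty lines in a machine file.
# Assumption: the file lists one host name per process slot.
count_procs() {
    grep -c . "$1"
}
```

For example, `mpirun -np $(count_procs "$PBS_NODEFILE") -machinefile $PBS_NODEFILE -tv ./program.exe` would start one process per slot.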
First, start an interactive batch session with the number of nodes/processes needed for debugging your application.
Follow the steps in the MPICH-GM description above to copy your executable and input files to local scratch (/scr) and then proceed with the following steps.
Once you are ready to debug, MPICH-VMI provides a TotalView switch to the mpirun script that enables the 'attach to a paused process' method. This is different from the other methods of using TotalView with an MPI application but just as valid.
% mpirun -np XX -machinefile $PBS_NODEFILE -debugger totalview ./program.exe [program args]
where XX is the number of processes needed. mpirun will then report that the process is waiting for 300 seconds and also provide the PID of the application process that TotalView should attach to.
In another window, ssh (with -X) to the host machine that you ran mpirun on. Then change directory to the location where you ran your application. Finally:
% setenv LD_LIBRARY_PATH /opt/vmi-2.0.1-1-gcc/lib:$LD_LIBRARY_PATH
% totalview -pid PID ./program.exe
TotalView will start up, attach to the process given by PID, and read the symbol information from ./program.exe.
For an interactive batch session you need to specify the number of compute nodes and the amount of wall clock time for which you will need them. The example below asks for 2 compute nodes with 2 processes per node for 30 minutes.
% qsub -I -V -lwalltime=00:30:00 -lnodes=2:ppn=2
When the session begins, it will start up a shell on the launch node.
There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but the PBS batch system does not use ssh to put the user on the compute node, so the tunnel has to be opened from a separate login session.
Setting environment variables directly is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.
The other option is to use X11 tunneling with ssh. Find the name of the launch host for the interactive batch session above by typing hostname in the window that has the interactive batch session in progress. You should also echo the environment variable $PBS_NODEFILE, which will be used below.
From another TG-NCSA login session, log on to the launch host via ssh with tunneling enabled.
% ssh -X launch_host
Once connected to the launch host, change to the directory from which you run your application. Next, set the environment variable PBS_NODEFILE to the value reported in the interactive batch session.
TotalView will be supported on Abe. This support is currently in progress.
Before you begin
- Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel compilers.
- Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
- Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X11 Windows System and/or Running from an interactive batch session.
If the memory requirements of your code fit within the limits of a shell on a front-end host (one of the honest nodes), you can run TotalView directly. If not, you will need to run on a compute node via an interactive batch session.
From the tg-login front-end host you start the TotalView debugging session with the following command:
% totalview ./program.exe [program args]
If you do not see the TotalView process manager window, you should first consult the Using the X Window System page.
First, add +mvapich2-0.9.8p2patched-intel-ofed-1.2-dbg to your ~/.soft file, resoft and relink your application.
Next, start an interactive batch session with the number of nodes/processes needed for debugging your application.
Start up the mpd processes in this session as you would do for a batch job. See the sample batch job file for MVAPICH2.
From another terminal session connected to the PBS launch host:
% mpiexec -tv -machinefile -n XX ./program.exe [program args]
where XX is the number of processes needed. mpiexec should connect to the mpd console of the launch host.
When done debugging, issue the command mpdexitall.
First, add +openmpi-1.2.4-intel or +openmpi-1.2.4-gcc to your ~/.soft file, resoft and build your application if you are not already using Open MPI.
NOTE: Versions of Open MPI prior to 1.2.4 will not work with TotalView 8.
Next, start an interactive batch session with the number of nodes/processes needed for debugging your application.
From another terminal session connected to the PBS launch host:
% mpirun -tv -np XX -machinefile ${PBS_NODEFILE} ./program.exe [program args]
where XX is the number of processes needed.
For more information on using TotalView with Open MPI, please see the discussion here.
First, start an interactive batch session with the number of nodes/processes needed for debugging your application.
Once you are ready to debug, MPICH-VMI provides a TotalView switch to the mpirun script that enables the 'attach to a paused process' method. This is different from the other methods of using TotalView with an MPI application but just as valid.
% mpirun -np XX -machinefile $PBS_NODEFILE -debugger totalview ./program.exe [program args]
where XX is the number of processes needed. VMI will then report that the process is waiting for 300 seconds and also provide the host and PID of the application process that TotalView should attach to. For example:
Connect to TotalView on Host: 10.1.68.172 PID: 12673. Waiting for 300 Seconds
In another window, ssh -X (with tunneling) to the host IP reported by VMI. Next, change directory to the location where you ran your application. Finally, start TotalView with the PID and program name:
% totalview -pid PID ./program.exe
TotalView will start up, attach to the process given by PID, and read the symbol information from ./program.exe.
You should now be able to use TotalView.
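Because the attach window is limited to 300 seconds, it can help to pull the host and PID out of the VMI message mechanically. The following sketch assumes the message has exactly the form of the example line shown for VMI; the helper name vmi_target is hypothetical.

```shell
# Hypothetical helper: extract the host and PID from the VMI status line.
# Reads stdin and prints "HOST PID", or nothing if the line does not match.
vmi_target() {
    sed -n 's/.*Host: \([^ ]*\) PID: \([0-9][0-9]*\)\..*/\1 \2/p'
}

echo 'Connect to TotalView on Host: 10.1.68.172 PID: 12673. Waiting for 300 Seconds' | vmi_target
# prints: 10.1.68.172 12673
```

The first field is the host to ssh -X into, and the second is the PID to pass to totalview -pid.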
For an interactive batch session you need to specify the number of compute nodes and the amount of wall clock time for which you will need them. The example below asks for 2 compute nodes with 2 processes per node for 30 minutes.
% qsub -I -V -lwalltime=00:30:00 -lnodes=2:ppn=2
When the session begins, it will start up a shell on the launch node.
There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but the PBS batch system does not use ssh to put the user on the compute node, so the tunnel has to be opened from a separate login session.
Setting environment variables directly is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.
The other option is to use X11 tunneling with ssh. Find the name of the launch host for the interactive batch session above by typing hostname in the window that has the interactive batch session in progress.
From another Abe login session (on an honest node), log on to the launch host via ssh with tunneling enabled.
% ssh -X launch_host
If the MPD ring has been started in the PBS session, you can start your mpiexec session as above.
TotalView on NCSA Shared Memory Machines
TotalView is available on NCSA's shared memory machines.
There are 4 TotalView licenses for jobs up to 32 processes. We do not currently have a way to guarantee you will get a license when your job starts if you run in batch.
Before you begin
- Compile your code with the -g -qfullpath compiler flags to enable symbolic debugging. If you have unnamed Fortran COMMON blocks, you will need to use optimization level 0, -O0, to obtain variable information within the COMMON block.
- Add TotalView to your environment by either:
(csh/tcsh): source /usr/local/apps/tools/toolworks/tvvars.csh
(sh/ksh) : . /usr/local/apps/tools/toolworks/tvvars.sh
- If the memory, CPU time and process count requirements of your code fit within the shell limits on the front-end host (Cu12), you can run TotalView directly. If not, you will need to run via an interactive LoadLeveler session.
- Make sure you have your DISPLAY environment variable set correctly. See the discussion on Using the X11 Windows System.
Serial and OpenMP Debugging
% totalview ./program.exe [ program args ]
MPI Debugging with POE
% totalview poe -a ./program.exe [ program args ] [ PE_arguments ]
For example, here is how you would run a code called xhpl with 4 processors:
% totalview poe -a ./xhpl [ arguments ] -procs 4
Under construction
There are 32 TotalView licenses for jobs up to 32 processes. We do
not currently have a way to guarantee you will get a license when your
job starts if you run in batch.
Before you begin
- Compile and link your code with the compiler/linker flags '-O0 -g' to provide symbolic debug information and predictable TotalView behavior with the Intel 8.X compilers.
- Add +totalview to your .soft file in your HOME directory and issue the resoft command. This will add TotalView to your environment.
Serial and OpenMP Debugging
If the memory requirements of your code fit within the limits of a shell on the front-end machine (cobalt), you can run TotalView directly. If not, you will need to run via an interactive batch session.
On the interactive host co-login1, you start the TotalView debugging session with the following command:
% totalview ./program.exe [ program args ]
MPI Debugging with MPT
% totalview /usr/bin/mpirun -a [ mpirun arguments ] ./program.exe [ program args ]
For example, here is how you would run a code called xhpl with 4 processors:
% totalview /usr/bin/mpirun -a -np 4 ./xhpl
For an interactive batch session you need to specify the number of CPUs, the wall clock time and the memory you will need. The example below asks for 4 CPUs for 30 minutes and 2 GB of memory:
% qsub -I -V -lwalltime=00:30:00 -lncpus=4 -lmem=2gb
When the session begins, it will start up a shell on the launch node.
There are two options once you have a session started: using X11 tunneling with ssh or setting environment variables. X11 tunneling with ssh is preferred, but the PBS batch system does not use ssh to put the user on the compute node.
Setting Environment Variables. This is described in the Using the X Window System page. In this mode you need to set your DISPLAY variable to the X display of your local machine.
X11 tunneling. If you specify the debug queue via the -qdebug option to qsub, your interactive batch job will be run on the login host. Since the DISPLAY variable is set correctly by specifying -V, the job is ready for TotalView debugging.
General TotalView usage
Serial and OpenMP Debugging
As TotalView starts up, you will see two windows appear: the Control window and the Process window. In the Process window you can start inserting breakpoints, etc., and then click on the GO button. Happy debugging.
MPI Debugging
As TotalView starts up, you will see two windows appear: the Control window and the Process window. In the Process window click on the GO button and then, when prompted by the window "Process XXX is a parallel job. Do you want to stop the job now?", click "Yes". You will arrive at the MPI_Init() breakpoint as shown here for a code using SGI's MPT. You are now ready to debug in parallel.
Note: If you are debugging a code using MPICH-GM on Tungsten, you will want to insert a breakpoint at some point after the call to MPI_Init(), as the built-in breakpoint for MPI_Init() does not appear to be fully functional.
Some comments from Etnus about breakpoints and MPI_Init:
"Be very cautious in placing breakpoints at or before a line that calls MPI_Init() or MPL_Init() because timeouts can occur while your program is being initialized. After you allow the parallel processes to proceed into the MPI_Init() or MPL_Init() call, allow all of the parallel processes to proceed through it within a short time."
"Timeouts can occur if you place breakpoints that stop other processes too soon after calling MPI_Init() or MPL_Init(). If you create "stop all" breakpoints, the first process that gets to the breakpoint stops all the other parallel processes that have not yet arrived at the breakpoint. This can cause a timeout."
More on Breakpoints
To get all processes to stop at the same action point (see breakpoint) instead of stopping the group of processes as a whole when the current process hits the action point: go to File -> Preferences -> Action Points and select "When breakpoint hit, stop: Process" rather than Group. You can also set this preference on an individual basis by opening the properties dialog for each individual breakpoint (right click on action point and select Properties).
Using the TotalView command line interface with SGI MPT applications
Put TotalView in your environment:
soft add +totalview
Launch TotalView using the CLI:
totalviewcli /usr/bin/mpirun -a -np 4 ./mpihw
For more information on using the CLI, consult the following Etnus
pages
Enabling Memory Debugging
For all platforms, be sure to add +totalview to your ${HOME}/.soft file and issue the resoft command.
Make the following additions to your link step, then see the last paragraph for how to check that Memory Debugging is enabled.
Copper (rs6000)
First try
setenv LIBPATH ${TOTALVIEW_HOME}/rs6000/lib/tvheap_mr:${TOTALVIEW_HOME}/rs6000/lib:${LIBPATH}
and then launch TotalView as usual. If the above does not work, you need to relink your application as follows:
32-bit compiling (-q32):
-L ${TOTALVIEW_HOME}/rs6000/lib/tvheap_mr -L${TOTALVIEW_HOME}/rs6000/lib ${TOTALVIEW_HOME}/rs6000/lib/aix_malloctype.o
64-bit compiling (-q64):
-L ${TOTALVIEW_HOME}/rs6000/lib/tvheap_mr -L ${TOTALVIEW_HOME}/rs6000/lib ${TOTALVIEW_HOME}/rs6000/lib/aix_malloctype64_5.o
If you tire of seeing the TotalView reminder about using the Memory
Debugger, add the following to ${HOME}/.tvdrc:
dset TV::MEMDEBUG::hia_allow_ibm_poe equal true
Tungsten (linux-x86)
Relinking is recommended:
-L${TOTALVIEW_HOME}/linux-x86/lib -ltvheap -Wl,-rpath,${TOTALVIEW_HOME}/linux-x86/lib
Mercury (linux-ia64)
If you are using MPICH-GM, you need to build and link against a
version of MPICH-GM built with disable-register for the ch_gm driver by
adding:
+mpich-gm-1.2.6..14b-intel90-tvdebug
to your $HOME/.soft and resoft. Use the MPI compiler
utilities mpicc, mpif77, for convenience.
Relinking is recommended:
-L${TOTALVIEW_HOME}/linux-ia64/lib -ltvheap -Wl,-rpath,${TOTALVIEW_HOME}/linux-ia64/lib
Cobalt (linux-ia64)
Relinking is recommended:
-L${TOTALVIEW_HOME}/linux-ia64/lib -ltvheap -Wl,-rpath,${TOTALVIEW_HOME}/linux-ia64/lib
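The Linux relink flags above differ only in the platform directory, so they can be generated from one template. This is a sketch: the helper name tvheap_flags is hypothetical, and it assumes TOTALVIEW_HOME is already set in your environment (for example by the +totalview SoftEnv key).

```shell
# Hypothetical helper: print the TotalView heap-library link flags for a
# Linux platform directory (linux-x86 or linux-ia64).
tvheap_flags() {
    libdir="${TOTALVIEW_HOME}/$1/lib"
    # -L finds libtvheap at link time; -rpath lets the runtime loader find it.
    printf '%s\n' "-L${libdir} -ltvheap -Wl,-rpath,${libdir}"
}
```

The output could then be appended to a link command, for example `icc -g -O0 -o program.exe program.o $(tvheap_flags linux-ia64)`, where program.o is a placeholder object file.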
Making sure Memory Debugging is enabled
After launching TotalView as discussed above for each platform, but before running the application within TotalView (before clicking on Go), check whether the Memory Debugger is enabled by going to Tools > Memory Debugging and clicking the radio button labeled 'Enable memory debugging' on the Configuration tab if it is not already selected.
Click on the main TotalView window and click on Go, or insert some breakpoints at areas where you want to inspect the memory usage.
For more information on using the TotalView debugger click here.