Yellowstone Tutorial

Computing at NCAR

Using Yellowstone

LTR

Yellowstone Performance

The numbers below are rough performance estimates for standard solar wind input. The LFM uses a variable timestep, so your results may vary, especially for high-speed flows.

Resolution          Grid         Core count   Performance
Single Resolution   53x24x32     8            1.33 core hours per simulated hour
Double Resolution   53x48x64     24           16 core hours per simulated hour
Quad Resolution     106x96x128   144          576 core hours per simulated hour
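
The figures above can be turned into a rough allocation and run-time estimate. The sketch below assumes that "core hours per simulated hour" is the total charged across all cores, so the wall-clock time is that total divided by the core count. The function name `estimate` and the 24-hour simulation length are illustrative only and are not part of the LFM tooling.

```shell
#!/bin/sh
# estimate CORES CORE_HOURS_PER_SIM_HOUR SIM_HOURS
# Prints the total core hours and the approximate wall-clock hours for a run.
estimate() {
    awk -v cores="$1" -v rate="$2" -v sim="$3" 'BEGIN {
        total = rate * sim            # core hours for the whole run
        printf "%.1f core hours, %.1f wall-clock hours\n", total, total / cores
    }'
}

# Quad resolution (144 cores, 576 core hours per simulated hour), 24 simulated hours
estimate 144 576 24
```

Because the LFM timestep varies with the solar wind conditions, treat the result as a planning number and leave some margin in the wall-clock limit you request.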

How do I debug my code with TotalView?

There are a few simple steps to run your code with the TotalView debugger:

  1. Load debugging modules
    module load debug totalview
    
  2. Compile your code with debugging flags enabled. For the LFM, edit env/Make.yellowstone and set
    OPTLVL = -g -traceback -debug full
    TRAP  =  -fp-stack-check -fstack-security-check -ftrapuv
    
  3. Edit your job run script, adding the following three lines to the LSF/BSUB settings near the top:
    #BSUB -XF   # X11 forwarding
    #BSUB -Ip   # interactive job
    #BSUB -a tv # select the tv elim
    
  4. Submit your job script via bsub.

Here's a complete sample job script to run one binary (LFM) with TotalView:

#!/bin/sh
#BSUB -J totalview
#BSUB -o totalview.%J.output
#BSUB -e totalview.%J.error
#BSUB -XF   # X11 forwarding
#BSUB -Ip   # interactive job
#BSUB -a tv # select the tv elim
#BSUB -n 24
#BSUB -R "span[ptile=16]"
#BSUB -W 01:00
#BSUB -q small
#BSUB -P xxxxxxxxx
#BSUB -R "select[scratch_ok > 0]"

# Setup
#source /glade/u/home/schmitt/opt-intel-12.1.4/InterComm-2.0/lib/build.env
#export LD_LIBRARY_PATH=/glade/u/home/schmitt/opt-intel-12.1.4/overture/lib:${LD_LIBRARY_PATH}
ln -sf INPUT1-001.xml INPUT1.xml

# Executable to run TotalView with
mpirun.lsf ./LFM  < /dev/null > totalview.out 2>&1
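
Before submitting, it can help to confirm that the three TotalView-related `#BSUB` lines made it into the script. The following is a minimal sketch of such a check, not something provided by CISL or the LFM distribution. The function name `check_tv_flags` and the file name `totalview.lsf` used in the note below are hypothetical.

```shell
#!/bin/sh
# check_tv_flags SCRIPT
# Succeeds only if SCRIPT contains all three #BSUB lines needed for TotalView.
check_tv_flags() {
    for flag in '-XF' '-Ip' '-a tv'; do
        # Each directive must start a line, as in the sample script above
        grep -q -e "^#BSUB $flag" "$1" || {
            echo "missing: #BSUB $flag"
            return 1
        }
    done
}
```

If the sample script were saved as totalview.lsf, it could then be checked and submitted with `check_tv_flags totalview.lsf && bsub < totalview.lsf`. The redirect matters: LSF parses the `#BSUB` directives when the script is passed to bsub on standard input.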

Troubleshooting

If you get an error like

Job <876021> is submitted to queue <small>.
<<ssh X11 forwarding job>>
<<Waiting for dispatch ...>>
Warning: Permanently added '10.12.2.17' (RSA) to the list of known hosts.
lyon@10.12.2.17's password:

then you need to reset your SSH keys. CISL wrote a script that does this. Run:

/glade/u/home/siliu/bin/ssh-auth.bash

Resources

Presentation slides

Useful Links