Project information is available at: http://www.vapor.ucar.edu
TG GIG PY6 Award:
After meeting with the PIO developers, Yannick revised the PIO VDC extensions to follow the coding style of PIO. He also re-factored the VDC implementation so that it is accessed entirely through the common PIO API (previously, using the VDC driver required calling VDC-specific extensions). Finally, he added support for generating the .vdf file, required by VDC, directly from PIO. Hence, a .vdf no longer needs to be created as a pre-processing step prior to using PIO.
The build system for PIO was generalized to be more platform agnostic, and as a first test case the PIOVDC distribution was successfully built on CU's Janus cluster.
Wes started the month by familiarizing himself with the VAPOR project. This included reading the documentation, installing the newest version, and working through an example (Hurricane Katrina). This resulted in a few bug report submissions, most of them involving crashes when trying to load files at various stages.
Wes then spent time getting set up on the supercomputers: reading documentation for Bluefire, Lynx, Janus, and Storm, as well as experimenting with simple programs and the batch submission process. Next he experimented with some of Yannick's code that uses PnetCDF calls to write a block of data out to a file. The distribution of the data across processors wasn't working properly, so he set up a simple block distribution in the Z direction. The code already included timing instrumentation, which he extended to cover the PnetCDF close calls.
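For reference, the following is a minimal sketch of the approach described above, not the actual test code: a simple block decomposition along Z, a collective PnetCDF write, and timing that includes the close call. The file name, variable name, and dimensions are illustrative, error checking is omitted, and the decomposition assumes the process count evenly divides NZ.

    #include <mpi.h>
    #include <pnetcdf.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Illustrative problem size; reduce for a quick test. */
    #define NX 2048
    #define NY 2048
    #define NZ 1024

    int main(int argc, char **argv) {
        int rank, nprocs, ncid, dimids[3], varid;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Simple block decomposition along Z: each rank owns NZ/nprocs planes
         * (assumes nprocs divides NZ evenly). */
        MPI_Offset nz_local = NZ / nprocs;
        MPI_Offset start[3] = { rank * nz_local, 0, 0 };
        MPI_Offset count[3] = { nz_local, NY, NX };

        float *buf = malloc(sizeof(float) * nz_local * NY * NX);
        for (MPI_Offset i = 0; i < nz_local * NY * NX; i++)
            buf[i] = (float)rank;

        double t0 = MPI_Wtime();

        ncmpi_create(MPI_COMM_WORLD, "out.nc", NC_CLOBBER | NC_64BIT_DATA,
                     MPI_INFO_NULL, &ncid);
        ncmpi_def_dim(ncid, "z", NZ, &dimids[0]);
        ncmpi_def_dim(ncid, "y", NY, &dimids[1]);
        ncmpi_def_dim(ncid, "x", NX, &dimids[2]);
        ncmpi_def_var(ncid, "field", NC_FLOAT, 3, dimids, &varid);
        ncmpi_enddef(ncid);

        /* Collective write of this rank's Z slab. */
        ncmpi_put_vara_float_all(ncid, varid, start, count, buf);

        /* Close is inside the timed region, since it can flush buffered data. */
        ncmpi_close(ncid);

        double t1 = MPI_Wtime();
        if (rank == 0)
            printf("write+close time: %.2f s\n", t1 - t0);

        free(buf);
        MPI_Finalize();
        return 0;
    }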
After gathering timing data from that PnetCDF program, we were looking for something to compare it against, and it was suggested that we look into the IOR HPC Benchmark. After reading the documentation, conversing with Rory, and running simple tests, Wes began collecting timing data for IOR immediately following runs of the PnetCDF code. IOR was initially run with POSIX, POSIX in collective write mode, and MPIIO. The PnetCDF code was run with 32 processes on a problem size of 2048x2048x1024 floats. The numbers returned on Bluefire were not ideal. The IOR results varied widely, with aggregate write bandwidth ranging anywhere from 350 MiB/sec to 990 MiB/sec. The results for the PnetCDF code were much lower, ranging from around 280 MiB/sec to 430 MiB/sec.
These results are not ideal, but the paper 'Parallel IO performance and scalability study on the PRACE CURIE supercomputer' offers hope that they can be improved with the right tuning of the system and code. The authors ran IOR with 10 different MPI-IO configurations on 256 MPI processes/cores, reading/writing a block of 1 GiB with a transfer size of 2 MiB. They looked at POSIX, HDF5, MPI-IO, and PnetCDF. In general PnetCDF was about 100 MiB/sec lower than MPI-IO and about 100 MiB/sec better than HDF5, with the PnetCDF numbers around the 270 MiB/sec range. Three configurations reached 800 MiB/sec or more, one of them giving the overall best write across the board at 1328 MiB/sec. Those configurations set the romio_cb and romio_ds MPI-IO hints to disable.
We also looked at altering the header size and the variable file layout alignment, which are set by providing the nc_header_align_size and nc_var_align_size hints. Wes tried several different values for these hints, but none made any noticeable difference on Bluefire.
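A minimal sketch of how such hints are typically supplied is shown below; it covers both the ROMIO hints from the CURIE study (shown only for the write path here) and the PnetCDF alignment hints mentioned above. The hint values, file name, and flags are illustrative assumptions, not the settings used in our runs.

    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        /* Hints are passed through an MPI_Info object at file-creation time.
         * Their effect depends on the MPI-IO implementation and file system. */
        MPI_Info info;
        MPI_Info_create(&info);

        /* ROMIO hints along the lines of the CURIE study's best configurations:
         * collective buffering and data sieving disabled for writes. */
        MPI_Info_set(info, "romio_cb_write", "disable");
        MPI_Info_set(info, "romio_ds_write", "disable");

        /* PnetCDF alignment hints: pad the header and align fixed-size
         * variables, e.g. to a multiple of the Lustre stripe size
         * (values are in bytes and purely illustrative). */
        MPI_Info_set(info, "nc_header_align_size", "1048576");
        MPI_Info_set(info, "nc_var_align_size",    "4194304");

        int ncid;
        ncmpi_create(MPI_COMM_WORLD, "hints_test.nc",
                     NC_CLOBBER | NC_64BIT_DATA, info, &ncid);
        MPI_Info_free(&info);

        /* ... define dimensions/variables and write as usual ... */

        ncmpi_close(ncid);
        MPI_Finalize();
        return 0;
    }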
Wes then looked into getting TAU working so he could instrument the benchmark code. We talked with Davide about the different performance tool options, both those already on the systems and in general. Wes found a utility, tau_exec, that performs all of the IO data collection automatically. It was only introduced a few releases ago (2.20?), but the version on Bluefire is 2.18, so he spent some time trying to get the newer version set up. So far this has not panned out.
Wes is also preparing to run on the other systems. He has made initial runs on Janus for just the PnetCDF code. The results for both the PnetCDF test and IOR are significantly slower than on Bluefire, but about the same relative to each other. The few runs made thus far have all been around the 200 MiB/sec mark. Again, playing with the header and stripe sizes made no improvement. Wes is not sure why this isn't helping; the talks on Lustre and parallel IO in general that he has looked through all suggest better performance from manually tuning these sizes. In some cases it is taking more than 5 minutes to write out a 2048x2048x1024-element file of floats using PnetCDF.
John provided Jim Edwards with a few PowerPoint slides for a talk he was giving on PIO.
John, Wes, and Yannick met with IMAGe's Mininni and Rosenburg to discuss integrating PIOVDC into Mininni's Ghost code. The IMAGe team is planning a 6k^3 run, which would be an excellent test of PIOVDC scalability.
Development:
Ported VAPOR to 64-bit Windows. The main problem here was finding compatible 64-bit versions of the libraries that VAPOR requires. Not all libraries were available for the same Windows runtime; however, we were able to build a 64-bit version of Qt 4.7.4 in Visual Studio 2008 so that it works with the only currently available 64-bit NetCDF libraries.
The team continues to re-factor vaporgui to use the new RegularGrid class, and its derivatives, as its underlying internal data model.
John worked with Sam Geen, a PhD student at Oxford who is building a RAMSES to VDC data converter. RAMSES is a cosmological code that uses adaptive mesh refinement. The data converter is proving challenging as RAMSES produces exceptionally deep refinements (up to 20 or 30 levels).
The team completed the first phase of migrating VAPOR documentation to Drupal. This was a badly needed overhaul that consolidated and organized the numerous user documents produced over the years. The results should be a significant improvement for VAPOR users.
Administrative:
At the suggestion of the front office, several popular 3D visualization applications were analyzed and compared with VAPOR as alternatives for supporting the needs of the NCAR science communities. The results strongly indicate that VAPOR has significant advantages over VisIt and ParaView when visualizing massive data, and in ease of use when visualizing earth-science data. The results were summarized in a report provided to the CISL Council.
Five students expressed interest in our 2012 SIParCS project. Two of them were selected because they appear to be good candidates for the position, which involves working on VAPOR animation control.
Education and Outreach:
John and Alan met with Young-Jean Choy, the director of Applied Meteorological Research at Korea's National Institute of Meteorological Research, to discuss current and future plans for VAPOR.
John participated in a conference call with U. of Wy. partners to help plan the University's new undergraduate visualization facilities.