Refactor the configure script
Purpose
Simplify the configure script by moving the generic functionality of managing a configuration to a separate module.
Responsibilities of configure
- Check, as much as possible, that a valid model configuration has been specified by the user. For example:
  - Check for valid components and valid combinations of components.
  - Check for valid grid resolutions.
- Check, as much as possible, that valid system parameters have been specified by the user. For example:
  - Check for a valid Fortran 90 compiler.
  - Check for valid external library locations. A simple check only confirms that the library exists. A more complete test builds and runs a small program to check that the library can be linked and initialized.
- Write Makefile (and Filepath) for building a specific model version.
- Write cache file containing description of the configuration that can be used by other utilities that may need this information. In particular the cache file is used by the utility that creates namelists.
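The simple form of the external library check can be sketched as follows. This is an illustration in Python, not the Perl code used by configure. The function name `find_library` and the `lib<name>.a` / `lib<name>.so` file naming are assumptions made for the example.

```python
import os

def find_library(libdir, name):
    """Return the path of lib<name>.a or lib<name>.so in libdir, or None.

    Hypothetical helper: it only checks that the file exists. It does not
    try to link against the library or initialize it.
    """
    if not libdir or not os.path.isdir(libdir):
        return None
    for suffix in (".a", ".so"):
        candidate = os.path.join(libdir, "lib" + name + suffix)
        if os.path.isfile(candidate):
            return candidate
    return None

# LIB_NETCDF is one of the environment variables named in this document.
lib_dir = os.environ.get("LIB_NETCDF", "")
if find_library(lib_dir, "netcdf") is None:
    print("netCDF library not found; check LIB_NETCDF")
```

The more complete test, which compiles and runs a small program against the library, would need the chosen compiler and is not shown here.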
Responsibilities of a generic configuration manager
- Read configuration definition from an external file.
- Read configuration default values from an external file.
- Provide query methods to determine whether a name is a valid parameter, and whether a parameter value is valid.
- Provide accessor methods for getting and setting parameter values. The accessor methods check for valid parameter names and values.
- Write a configuration to an output file.
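The responsibilities listed above could map onto a small class along the following lines. This is a minimal sketch in Python, not the actual Perl module. The class name `ConfigManager`, its method names, and the root element name `config_definition` are assumptions. The `entry` elements and their `id`, `value`, `valid_values`, and `list` attributes follow the definition file format described in the design overview.

```python
import xml.etree.ElementTree as ET

# Sample definition; the root element name is an assumption.
DEFINITION_XML = """<config_definition>
<entry id="cam_bld" value=".">CAM build directory</entry>
<entry id="dyn" valid_values="eul,sld,fv,homme" value="">Dynamics package</entry>
<entry id="prog_aero" valid_values="dust,seasalt,caer4,caer16,sulfur"
       value="" list="1">Prognostic aerosol package</entry>
</config_definition>"""

class ConfigManager:
    """Hypothetical generic configuration manager."""

    def __init__(self, definition_xml):
        self._params = {}
        for entry in ET.fromstring(definition_xml).iter("entry"):
            valid = entry.get("valid_values", "")
            self._params[entry.get("id")] = {
                "value": entry.get("value", ""),
                "valid_values": valid.split(",") if valid else None,
                "is_list": entry.get("list") == "1",
            }

    def is_valid_name(self, name):
        return name in self._params

    def is_valid_value(self, name, value):
        if not self.is_valid_name(name):
            return False
        spec = self._params[name]
        if spec["valid_values"] is None:
            return True  # no restriction declared for this parameter
        # A list parameter is valid when every item is a valid value.
        items = value.split(",") if spec["is_list"] else [value]
        return all(item in spec["valid_values"] for item in items)

    def get(self, name):
        if not self.is_valid_name(name):
            raise KeyError("unknown parameter: " + name)
        return self._params[name]["value"]

    def set(self, name, value):
        if not self.is_valid_name(name):
            raise KeyError("unknown parameter: " + name)
        if not self.is_valid_value(name, value):
            raise ValueError("invalid value for " + name + ": " + value)
        self._params[name]["value"] = value

    def to_xml(self):
        """Serialize the current parameter values as entry elements."""
        root = ET.Element("config_definition")
        for name, spec in self._params.items():
            ET.SubElement(root, "entry", id=name, value=spec["value"])
        return ET.tostring(root, encoding="unicode")

cfg = ConfigManager(DEFINITION_XML)
cfg.set("dyn", "fv")
cfg.set("prog_aero", "dust,seasalt")
print(cfg.to_xml())
```

Keeping name and value checks inside the accessors means that callers such as configure cannot store an invalid setting by accident.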
Design overview
- The configure script is written in Perl. Perl is available on just about every platform. Following "Best Practices" coding standards helps to make the code easier to understand/modify by non-experts.
- Perl has powerful commandline parsing, which suits a tool whose primary user interface is the commandline.
- Configure should minimize reliance on the user's environment. Environment variables are used only for system dependent information:
  - INC_NETCDF, LIB_NETCDF, MOD_NETCDF
  - LAPACK_LIBDIR
  - INC_MPI, LIB_MPI
  - ESMF_LIBDIR
- The auxiliary files are XML. Using a standard format means we don't need to write the parsers. Perl XML parsers aren't bundled with many Perl distributions, so we currently rely on a lightweight parser that's written in Perl and which we distribute with our build tools.
- The configuration is defined in the file config_definition.xml. A few entries from CAM's file:
<entry id="cam_bld" value=".">
CAM build directory; contains .o and .mod files.
</entry>
<entry id="dyn" valid_values="eul,sld,fv,homme" value="">
Dynamics package: eul, sld, fv, or homme.
</entry>
<entry id="prog_aero" valid_values="dust,seasalt,caer4,caer16,sulfur" value="" list="1">
Prognostic aerosol package: list of any subset of the following: dust,seasalt,caer4,caer16,sulfur
</entry>
- The configuration parameters are the values of the "id" attribute of each entry. A valid XML file requires all "id" attributes to be unique. The set of "id" values comprises the valid configuration parameter names.
- When the list of valid values for a parameter is known, it can be specified in the "valid_values" attribute. This allows the generic configuration manager to check for valid values.
- A set of defaults for a particular configuration is specified in a config_defaults_XXX_.xml file. The file used by the configure script to set defaults is determined by commandline options. Here is the defaults file used for tropospheric MOZART chemistry:
<entry id="dyn" value="fv" />
<entry id="hgrid" value="10x15" />
<entry id="chem" value="trop_mozart" />
<entry id="prog_aero" value="dust,seasalt" />
<entry id="nadv" value="98" />
<entry id="cppdefs" value="-DTROPCHEM" />
- The XML format is the same as for the definition file, but only the id and value attributes need be present.
- The values in the defaults file can be overridden by commandline options.
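The precedence between the defaults file and the commandline can be illustrated with a short Python sketch. It is not the Perl code in configure. The function names `read_entries` and `resolve`, the root element name `config_defaults`, and the override shown in the usage are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Excerpt of the defaults shown above; the root element name is an assumption.
DEFAULTS_XML = """<config_defaults>
<entry id="dyn" value="fv" />
<entry id="hgrid" value="10x15" />
<entry id="chem" value="trop_mozart" />
</config_defaults>"""

def read_entries(xml_text):
    """Map each entry's id attribute to its value attribute."""
    root = ET.fromstring(xml_text)
    return {e.get("id"): e.get("value", "") for e in root.iter("entry")}

def resolve(defaults, overrides):
    """Start from the defaults, then let commandline settings win."""
    config = dict(defaults)
    config.update(overrides)
    return config

# Hypothetical invocation in which the user asks for a different grid.
config = resolve(read_entries(DEFAULTS_XML), {"hgrid": "1.9x2.5"})
print(config)
```

In a full implementation each merged value would still be checked against the definition file before being accepted.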
- The horizontal grids are defined in config_horiz_grid.xml. A few entries are:
<horiz_grid dyn="eul" hgrid="64x128" nlat="64" nlon="128" m="42" n="42" k="42" />
<horiz_grid dyn="sld" hgrid="64x128" nlat="64" nlon="128" m="63" n="63" k="63" />
<horiz_grid dyn="fv" hgrid="1.9x2.5" nlat="96" nlon="144" />
<horiz_grid dyn="homme" hgrid="ne7np8" csne="7" csnp="8" />
- Different grid types use different parameters to describe them.
- The configure script allows the user to specify only the dycore (dyn) and the grid specifier (hgrid). The details of what parameters are required to define the grid live in the XML file.
- Defining a new grid for an existing dycore requires adding a line to this file.
- Defining a grid for a new dycore requires determining what grid parameters need to be specified, and will require modifications to the code (in the configure script) that reads this file.
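The grid lookup described above, where the user supplies only dyn and hgrid and the remaining parameters come from the XML file, might look like the following Python sketch. It is not the Perl code in configure. The function name `lookup_grid` and the root element name `config_horiz_grid` are assumptions.

```python
import xml.etree.ElementTree as ET

# The sample entries from config_horiz_grid.xml; root element name assumed.
GRIDS_XML = """<config_horiz_grid>
<horiz_grid dyn="eul" hgrid="64x128" nlat="64" nlon="128" m="42" n="42" k="42" />
<horiz_grid dyn="sld" hgrid="64x128" nlat="64" nlon="128" m="63" n="63" k="63" />
<horiz_grid dyn="fv" hgrid="1.9x2.5" nlat="96" nlon="144" />
<horiz_grid dyn="homme" hgrid="ne7np8" csne="7" csnp="8" />
</config_horiz_grid>"""

def lookup_grid(xml_text, dyn, hgrid):
    """Return the grid parameters for a (dyn, hgrid) pair as a dict."""
    for grid in ET.fromstring(xml_text).iter("horiz_grid"):
        if grid.get("dyn") == dyn and grid.get("hgrid") == hgrid:
            # Everything except the two lookup keys describes the grid.
            return {k: v for k, v in grid.attrib.items()
                    if k not in ("dyn", "hgrid")}
    raise ValueError("no grid " + hgrid + " defined for dycore " + dyn)

print(lookup_grid(GRIDS_XML, "eul", "64x128"))
print(lookup_grid(GRIDS_XML, "homme", "ne7np8"))
```

The same hgrid name can carry different parameters for different dycores (64x128 for eul and sld above), which is why the lookup is keyed on both attributes.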
- The Makefile is produced by prepending a set of Makefile macros to a template file (Makefile.in) which contains system dependent build information.
- Configure produces a cache file (config_cache.xml) which contains all the configuration parameters and their values. This file has the same format as the definition file, and can be used as a defaults file for a subsequent invocation of configure.
- The system dependent values in a defaults file are ignored by configure, so a cache file (which contains system dependent settings) can be used as a defaults file on a different platform to reproduce a particular model configuration.
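Reusing a cache file as a defaults file can be sketched in Python as follows. It is not the Perl code in configure. The text does not say which parameters count as system dependent, so the ids `fc` and `nc_lib`, the paths, the function names, and the root element name are all placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

# Placeholder set: the real list of system dependent ids is not given here.
SYSTEM_DEPENDENT = {"fc", "nc_lib"}

def write_cache(config):
    """Serialize a name-to-value mapping in the entry format (root name assumed)."""
    root = ET.Element("config_definition")
    for name, value in config.items():
        ET.SubElement(root, "entry", id=name, value=value)
    return ET.tostring(root, encoding="unicode")

def defaults_from_cache(cache_xml, system_dependent):
    """Read a cache file as defaults, skipping system dependent entries."""
    return {
        e.get("id"): e.get("value", "")
        for e in ET.fromstring(cache_xml).iter("entry")
        if e.get("id") not in system_dependent
    }

# Hypothetical configuration produced on one platform.
cache = write_cache({"dyn": "fv", "hgrid": "1.9x2.5",
                     "fc": "pgf90", "nc_lib": "/usr/local/lib"})

# On another platform only the model configuration is carried over.
print(defaults_from_cache(cache, SYSTEM_DEPENDENT))
```

The skipped settings would then be determined afresh on the new platform, for example from the environment variables listed earlier.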