Metadata is needed to support data discovery. I have considered two approaches to date:

  1. metadata at the per-variable level saying "this is the one you are looking for". This would require scanning the list of variables first.
  2. global metadata which points you to the variables.
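The difference between the two approaches can be sketched with a toy in-memory model of a file. This is only an illustration: the per-variable `primary` attribute and the helper function names are hypothetical, and plain dictionaries stand in for whatever file library is actually used.

```python
# Toy model of a data file: per-variable attributes plus global attributes.
# The "primary" attribute is a hypothetical name used only for illustration.
variables = {
    "GGLAT": {"source": "GPS"},
    "LAT": {"source": "IRU"},
    "LATC": {"source": "Blended", "primary": "latitude"},
}
global_attrs = {"coordinates": "LONC LATC GALT Time"}


def find_by_variable_scan(variables, role):
    """Approach 1: scan every variable for a per-variable marker."""
    return [name for name, attrs in variables.items()
            if attrs.get("primary") == role]


def find_by_global_pointer(global_attrs, key):
    """Approach 2: read one global attribute that names the variables."""
    return global_attrs.get(key, "").split()


print(find_by_variable_scan(variables, "latitude"))
print(find_by_global_pointer(global_attrs, "coordinates"))
```

Approach 1 has to touch every variable before it can answer; approach 2 answers from a single attribute lookup.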

This standard should be independent of the DataSource type (e.g. ICARTT, netCDF, HDF5, etc.); it should work with any file standard which supports global attributes or any level of sophistication in its header.

There are several reasons for the need for data discovery:

  • Variable names can be cryptic.
  • There can be multiple measurements of the same type.
  • Automation: software needs to find its way into a file without human help.

For example, a file from an NCAR aircraft offers the following latitudes to choose from:

  Var Name    Source
  GGLAT       GPS
  LAT         IRU
  LATC        Blended
  CLAT        CMIGITS III

and the following redundant ambient temperature measurements:

  • ATHR1, ATHR2, ATFR

To that end, we have defined two global metadata attributes for our netCDF files: one for the aircraft position (coordinate) variables and a second to identify the wind field variables.

:coordinates = "LONC LATC GALT Time" ;
:wind_field = "WSC WDC WIC" ;
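As a rough sketch of how software might consume these attributes, the snippet below splits a space-separated attribute into variable names and checks that each one exists in the file. The function name `resolve_reference` and the dictionary standing in for the file's global attributes are assumptions made for illustration, not part of any existing library.

```python
def resolve_reference(global_attrs, key, available):
    """Return the variable names listed in a space-separated global attribute.

    Raises KeyError if the attribute is absent or names a variable
    that is not present in the file.
    """
    if key not in global_attrs:
        raise KeyError(f"global attribute {key!r} not found")
    names = global_attrs[key].split()
    missing = [name for name in names if name not in available]
    if missing:
        raise KeyError(f"{key!r} refers to missing variables: {missing}")
    return names


# Stand-ins for what would be read from a real file.
global_attrs = {
    "coordinates": "LONC LATC GALT Time",
    "wind_field": "WSC WDC WIC",
}
available = {"LONC", "LATC", "GALT", "Time", "WSC", "WDC", "WIC", "ATX"}

lon, lat, alt, time = resolve_reference(global_attrs, "coordinates", available)
print(lon, lat, alt, time)
```

Checking the names against the variable list catches a stale attribute early, before any data is read.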

This probably needs some work. For example, I should move from space to comma separation, and possibly add a prefix (namespace):

:reference:coordinates = "LONC,LATC,GGALT,Time" ;
:reference:navigation = "PITCH,ROLL,THDG,VEW,VNS,TAS,IAS" ;
:reference:wind_field = "WSC,WDC,WIC" ; // Or should this be the vector UIC & VIC?
:reference:thermodynamic = "PSXC,ATX,DPXC" ;
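One possible way for a reader to collect every attribute under the proposed prefix is sketched below. It assumes the `reference:` prefix shown above and, as a convenience that is not part of the proposal, also accepts the older space-separated form. The function name `collect_references` is hypothetical, and a dictionary again stands in for the file's global attributes.

```python
import re

PREFIX = "reference:"  # proposed namespace prefix from the attributes above


def collect_references(global_attrs, prefix=PREFIX):
    """Map each reference category to its list of variable names."""
    refs = {}
    for name, value in global_attrs.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated global attributes
        category = name[len(prefix):]
        # Split on commas and/or whitespace; drop empty fragments.
        refs[category] = [v for v in re.split(r"[,\s]+", value.strip()) if v]
    return refs


# Stand-in for the global attributes of a real file.
global_attrs = {
    "title": "example flight",
    "reference:coordinates": "LONC,LATC,GGALT,Time",
    "reference:navigation": "PITCH,ROLL,THDG,VEW,VNS,TAS,IAS",
    "reference:wind_field": "WSC,WDC,WIC",
    "reference:thermodynamic": "PSXC,ATX,DPXC",
}

for category, names in collect_references(global_attrs).items():
    print(category, names)
```

With a prefix, software can discover which categories a file offers without knowing the category names in advance.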

John Caron and Ethan Davis of Unidata made a couple of passes at conventions for observational data, including data discovery; see the Unidata Observation Conventions (Draft).
