URGENT Air Metadata


  1. What are metadata? Why are they essential?

  2. Metadata for URGENT Air Projects

  3. Additional documentation

1. What are metadata? Why are they essential?

The term metadata encompasses all the information necessary to interpret, understand and use a given dataset. Discovery metadata are the information (typically keywords) used to identify and locate data that meet a user's requirements (via a Web browser, a Web-based catalogue, etc.). Detailed metadata are the additional information a user needs to work with the data without referring back to the data provider. The metadata required by the NASA Ames Standard Format include both discovery and detailed metadata.

Metadata pertaining to observational data, for example, include details about how (with which instrument or technique), when and where the data were collected, by whom (including affiliation and contact address or telephone number) and within which research project. In the case of processed data, the nature of the initial raw data and the derivation process must be stated. The nature and units of the recorded variables are essential, as is the grid or reference system. Metadata pertaining to model output should include the name of the model, the conditions of the calculation, the nature of its output, the geographical domain over which the output is defined (when applicable), etc. Specific conditions applying to the model or the experiment may be mentioned. Metadata also include information on the format in which the data are stored, the order of the variables, etc., so that potential users can read them. Metadata pertaining to software models include the key points of the theory on which the model is based, the techniques and programming language used, references, etc.

Since judgements about which information is relevant vary widely between individuals, metadata standards have been, and are still being, developed with the aim of standardising metadata presentation. A further advantage of metadata standards is that they ensure that the information contained in the metadata (and hence the ability to use the data) is transmitted in a predefined, generic way to remote and future users, provided that those users know the adopted conventions. This in turn requires the existence, maintenance and transmission of manuals describing the set of conventions relevant to a particular metadata standard: a kind of metametadata.

Since a crucial section of the metadata pertains to the data format, different metadata standards have been developed in conjunction with the various data formats. Existing data format standards, and metadata standards alike, are based both on the specific needs of particular scientific communities and on practices already in use within these communities. All of them are updated regularly and are subject to further evolution. In geosciences and in disciplines where 2-dimensional Earth surface reference systems play an important role (like archæology), the most popular data formats seem to belong to the GIS (Geographic Information Systems) family. In the atmospheric research community, however, the NASA Ames Format for Data Exchange has been widely adopted since it is both simpler and more generic (in particular, it is better adapted to 3-D and 4-D fields). In addition, it makes provision for simplified cases suited to data collected onboard balloons or aircraft. It also includes rules for formulating metadata.

The reference manual (the metametadata) for the NASA Ames format is Gaines and Hipskind, 1998. An online summary of this manual is available from the BADC.

2. Metadata for URGENT Air Projects

2.1 Metadata for tables of numbers (observations or model output)

2.1.1 Content

Metadata include the following overall information. Some items in this list may apply only in specific cases. For precise instructions on formulating metadata, see Section 2.1.3 (Format) below.

2.1.2 Storage

Each data file includes a header containing the metadata.

2.1.3 Format

Fields of numbers and their associated metadata are supplied in NASA Ames format. The exact content and format of the metadata pertaining to NASA Ames formatted data depend on the File Format Index (FFI) adopted. For details, please refer to the FFI summary and to the specific requirements for URGENT data files.
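As an illustration of how a header carrying the metadata can be separated from the data, here is a minimal Python sketch. It assumes, following the Gaines and Hipskind specification, that the first line of a NASA Ames file holds two integers: the number of header lines (NLHEAD) and the File Format Index (FFI). The function name split_nasa_ames and the sample content are hypothetical; the sample is a placeholder and not a complete, valid NASA Ames header.

```python
from typing import List, Tuple


def split_nasa_ames(text: str) -> Tuple[int, int, List[str], List[str]]:
    """Split file content into (NLHEAD, FFI, header_lines, data_lines)."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty file")
    # Assumption: line 1 holds two integers, NLHEAD then FFI,
    # separated by whitespace or a comma.
    first = lines[0].replace(",", " ").split()
    if len(first) < 2:
        raise ValueError("first line must contain NLHEAD and FFI")
    nlhead, ffi = int(first[0]), int(first[1])
    if nlhead > len(lines):
        raise ValueError("file is shorter than its declared header")
    # The header (metadata) is the first NLHEAD lines; the rest is data.
    return nlhead, ffi, lines[:nlhead], lines[nlhead:]


# Placeholder content for illustration only (not a valid full header).
sample = "\n".join([
    "4 1001",
    "hypothetical header line 2",
    "hypothetical header line 3",
    "hypothetical header line 4",
    "0.0 1.5",
    "1.0 2.5",
])
nlhead, ffi, header, data = split_nasa_ames(sample)
print(nlhead, ffi, len(header), len(data))
```

The content and layout of the remaining header lines depend on the FFI, so a real reader would branch on the returned FFI value before interpreting them.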

2.2 Metadata for photographic images (JPEG, GIF, ...)

2.2.1 Content

Metadata for photographic images of the Earth surface include the following. For precise instructions on formulating metadata, see Section 2.2.3 (Format) below.

2.2.2 Storage

Metadata are stored in separate text files – one metadata file per image file. The metadata file has the same name as the image file except for the file extension, which is .txt (see URGENT File Names).
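The pairing rule above can be sketched in a few lines of Python. The function name metadata_path_for and the example file name are hypothetical and do not follow the URGENT File Names conventions; only the rule "same name, .txt extension" is taken from the text.

```python
from pathlib import Path


def metadata_path_for(image_path: str) -> Path:
    """Return the path of the metadata text file paired with an image file."""
    # Same directory and base name as the image; only the extension changes.
    return Path(image_path).with_suffix(".txt")


print(metadata_path_for("example_image.jpg"))
```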

2.2.3 Format

Please refer to Section 4.2 Images of the "URGENT Data Files: Structure and Format" document.

2.3 Metadata for software

2.3.1 Content

Metadata pertaining to a model should include the following.

2.3.2 Storage

Metadata relating to software can be included as comments at the top of the source file or, alternatively, provided as a separate text file.
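For the first storage option, a reader of the archive might recover the metadata by extracting the comment block at the top of a source file. The Python sketch below is one possible way to do this; the function name, the default comment prefix "#" and the sample metadata lines are all assumptions chosen for illustration, since no particular layout is prescribed.

```python
from typing import List


def leading_comment_block(source: str, prefix: str = "#") -> List[str]:
    """Return the text of the comment lines at the very top of a source file."""
    block: List[str] = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            # Keep the comment text without the prefix.
            block.append(stripped[len(prefix):].strip())
        elif stripped == "" and not block:
            continue  # skip blank lines before the first comment
        else:
            break  # first non-comment line ends the metadata block
    return block


# Hypothetical source file content, for illustration only.
source = (
    "# Model: hypothetical dispersion model\n"
    "# Language: Python\n"
    "\n"
    "print('run')\n"
)
print(leading_comment_block(source))
```

For source files in other languages, the prefix argument would be set to that language's comment marker.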

2.3.3 Format

Text. There is no particular requirement regarding software metadata formatting.

3. Additional documentation

Any additional documentation on recorded data or images, whether pertaining to a single data file or to a whole dataset, that does not fit into the structures described above (because it does not fall into any described category or because it is too voluminous) may be submitted to BADC as a text file, which will be stored in the URGENT archive documentation directory. Such documents may include, for example, descriptions of techniques, possible uses of the data, study conclusions, etc.