Inclusion of Metadata in the UFAM archive


Contents

  1. What are metadata? Why are they essential?

  2. Metadata for UFAM Datasets

  3. Additional documentation

1. What are metadata? Why are they essential?

The term metadata encompasses all the information necessary to interpret, understand and use a given dataset. Discovery metadata are the subset (typically keywords) used to identify and locate data that meet a user's requirements, for example via a Web browser or a Web-based catalogue. Detailed metadata comprise the additional information a user needs to work with the data without referring back to the data provider.

Metadata pertaining to observational data, for example, include details about how (with which instrument or technique), when and where the data have been collected, by whom (including affiliation and contact address or telephone number) and in the framework of which research project. In the case of processed data, the nature of the initial raw data and the derivation process must be stated. The nature and units of the recorded variables are of course essential, as well as the grid or the reference system. Metadata pertaining to model output should include the name of the model, the conditions of the calculation, the nature of its output, the geographical domain over which the output is defined (when applicable), etc. Specific conditions applying to the model or the experiment may be mentioned. Metadata also obviously include information on the format in which the data are stored, the order of the variables, etc, to allow potential users to read them. Metadata pertaining to software models include the key points of the theory on which the model is based, the techniques and computational language used, references, etc.

Since judgements about which information is relevant vary widely between individuals, metadata standards have been developed, and are still being developed, with the aim of making metadata presentation uniform. A further advantage of metadata standards is that they convey the information contained in the metadata (and hence the ability to use the data) in a predefined, generic way to remote and future users, provided that those users know the adopted conventions. This in turn requires that manuals describing the conventions of a particular metadata standard exist, are maintained and are passed on: a kind of meta-metadata.

Since a crucial part of the metadata pertains to the data format, different metadata standards have been developed in conjunction with the various data formats. Existing data format standards, and metadata standards alike, are based both on the specific needs of particular scientific communities and on practices already in use within these communities. All of them are updated regularly and are subject to further evolution. In geosciences and in disciplines where 2-dimensional Earth surface reference systems play an important role (such as archæology), the most popular data formats seem to belong to the GIS family (Geographic Information Systems). In the atmospheric research community, the key standard formats are the NetCDF binary data format and the NASA Ames Format for Data Exchange. Both have been widely adopted to represent multi-dimensional datasets. The NetCDF Climate and Forecast (CF) Metadata Convention explains in detail how to encode metadata and data in NetCDF files to facilitate data exchange.

2. Metadata for UFAM Datasets

2.1 Metadata for tables of numbers (observations or model output)

2.1.1 Content

Metadata include the following overall information. Some information in this list may be applicable in specific cases only. For accurate instructions about metadata formulation, see Section 2.1.3 (Format) below.

2.1.2 Storage

Each data file includes a header containing the metadata.
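As an illustration of how a metadata header can be separated from the data that follow it, here is a minimal Python sketch. It assumes a NASA Ames style text file, in which the first line gives the number of header lines (NLHEAD) followed by the file format index (FFI). The function name split_header and the sample content are hypothetical, and the sample is heavily abbreviated rather than a valid NASA Ames file.

```python
import io


def split_header(stream):
    """Split a NASA Ames style text stream into (header_lines, data_lines)."""
    lines = stream.read().splitlines()
    if not lines:
        raise ValueError("empty file")
    # First line: NLHEAD (number of header lines), then FFI (file format index).
    nlhead = int(lines[0].split()[0])
    if nlhead < 1 or nlhead > len(lines):
        raise ValueError("NLHEAD is inconsistent with the file length")
    # The header includes the first line itself.
    return lines[:nlhead], lines[nlhead:]


# Hypothetical, abbreviated content used only to exercise the function.
sample = io.StringIO(
    "3 1001\n"
    "A. Scientist\n"
    "Example Institute\n"
    "0.0 1.5\n"
    "1.0 1.7\n"
)
header, data = split_header(sample)
```

Because the header length is declared in the file itself, a reader can recover the metadata without knowing anything else about the dataset in advance.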

2.1.3 Format

The BADC requires data files to be provided in one of the chosen standard formats:

2.2 Metadata for photographic images (JPEG, GIF, ...)

2.2.1 Content

Metadata of photographic images of the Earth surface include the following. For accurate instructions about metadata formulation, see Section 2.2.3 (Format) below.

2.2.2 Storage

Metadata are stored in separate text files – one metadata file per image file. The metadata file has the same name as the image file except for the file extension, which is .txt.
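The naming rule above can be sketched in a few lines of Python. The function name metadata_path and the example file name are hypothetical; only the rule itself (same name, .txt extension) comes from this document.

```python
from pathlib import Path


def metadata_path(image_path):
    """Return the metadata file path for an image: same name, .txt extension."""
    return Path(image_path).with_suffix(".txt")


# Hypothetical image file name, used only for illustration.
example = metadata_path("cloud_20040612.jpg")
```

Here example refers to cloud_20040612.txt, located in the same directory as the image.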

2.2.3 Format

Photographic images of the Earth surface should be provided in the usual digital picture formats (JPEG, GIF, PNG).

2.3 Metadata for software

2.3.1 Content

Metadata pertaining to a model should include the following.

2.3.2 Storage

Metadata relating to software can be included as comments in the top section of the source file, or alternatively provided as a separate ASCII (.txt) file.
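When the metadata are kept as comments at the top of a source file, they can be recovered with a small script. The following Python sketch is one possible approach, not a prescribed tool: the function name leading_comment_block, the default comment marker "#" and the sample field names are all assumptions made for illustration.

```python
def leading_comment_block(source_text, marker="#"):
    """Return the comment lines that open a source file, without the marker."""
    block = []
    for line in source_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(marker):
            block.append(stripped[len(marker):].strip())
        elif stripped == "" and not block:
            continue  # skip blank lines before the first comment
        else:
            break  # first non-comment line ends the metadata block
    return block


# Hypothetical source file content; the field names are illustrative only.
source = (
    "# Model: ExampleModel\n"
    "# Contact: A. Scientist\n"
    "\n"
    "x = 1\n"
    "# a later comment, not part of the metadata block\n"
)
metadata_lines = leading_comment_block(source)
```

The marker argument would need to be changed for languages that use a different comment character.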

2.3.3 Format

Text. There is no particular requirement regarding software metadata formatting.

3. Additional documentation

Any additional documentation on recorded data or images, whether pertaining to a single data file or to a whole dataset, that does not fit into the structures described above (because it does not fall into any described category or because it is too voluminous) may be submitted to the BADC as a text file, which will be stored in the archive documentation directory. Such documents may include, for example, descriptions of techniques, possible uses of the data or study conclusions.