NASA-Ames format for data exchange


UTLS-OZONE has adopted the NASA-Ames Format for Data Exchange as the file format for field observation data. NASA-Ames format is a text-based format that is self-describing and portable, yet flexible. The format is defined in the document Format Specification for Data Exchange by Gaines and Hipskind. This document is intended as a short introduction to the NASA-Ames format for use by the UTLS-OZONE thematic program.

Introduction

The NASA Ames Format for Data Exchange, often referred to as simply NASA Ames format, grew out of NASA Aircraft campaigns. The format was developed taking into account the following considerations:

Each NASA Ames file is made up of a first line, which contains information on the file type, a file header section, and a data section. The file header contains the information needed to make the file self-describing, as well as information such as the origin of the data. Once the form of a file for a particular instrument has been decided on, the file header for that instrument changes little from file to file. More information on the file header is given in the section The file header. The data section lists the data in a column-oriented format. More information on the data section is given in the section The Data Section.

File names

The requirement that NASA Ames files be portable puts some constraints on the file names, since some operating systems, e.g. DOS, limit the length of file names. The agreed convention for NASA Ames files is ppyymmdd.ext, where pp is a two-letter prefix, yymmdd is a date, and ext is an extension. The prefix and extension can be used to identify site, platform, instrument and constituent. The way in which the prefix and extension are used is left to the individual experimental campaigns, and is not specified as part of the NASA Ames standard.
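As an illustration of the ppyymmdd.ext convention, the following Python sketch splits a file name into its three parts. The function name parse_filename, the example file name, the limit of three characters on the extension, and the default century are assumptions made for this example; they are not part of the NASA Ames standard.

```python
import re
from datetime import date

# ppyymmdd.ext: two-letter prefix, six-digit date, then an extension.
# The 1-3 character alphanumeric extension is an assumption (DOS-style names).
_NAME_RE = re.compile(r"^([A-Za-z]{2})(\d{2})(\d{2})(\d{2})\.([A-Za-z0-9]{1,3})$")


def parse_filename(name, century=1900):
    """Split a ppyymmdd.ext file name into (prefix, date, extension)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a ppyymmdd.ext file name: {name!r}")
    prefix, yy, mm, dd, ext = match.groups()
    # A two-digit year is ambiguous, so the century is supplied by the caller.
    return prefix, date(century + int(yy), int(mm), int(dd)), ext


# Hypothetical file name, used only to show the call.
prefix, day, ext = parse_filename("oz970315.ea1")
print(prefix, day.isoformat(), ext)
```

What the prefix and extension mean is campaign specific, so the sketch returns them unchanged and leaves their interpretation to the caller.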

Variables

The observed data can be represented in NASA Ames files as numeric or character data. Clearly the most commonly used is numeric data, though character data can be used for site names etc. The allowed characters in numeric fields are 0-9, +, -, ., and E. The character data can use any printable ASCII character (ASCII codes 32 to 126).
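The character rules above can be checked mechanically. The Python sketch below is one possible check, not a reference validator; the function names are hypothetical, and the additional requirement that a numeric field parse as a number is an assumption of this example.

```python
# Characters allowed in numeric fields: 0-9, +, -, ., and E.
NUMERIC_CHARS = set("0123456789+-.E")


def is_numeric_field(field):
    """Return True if the field uses only allowed characters and parses as a number."""
    if not field or not set(field) <= NUMERIC_CHARS:
        return False
    try:
        float(field)
    except ValueError:
        # Allowed characters in a meaningless order, e.g. "+-".
        return False
    return True


def is_character_field(text):
    """Return True if every character is printable ASCII (codes 32 to 126)."""
    return all(32 <= ord(ch) <= 126 for ch in text)


print(is_numeric_field("1.5E+03"), is_numeric_field("1.5e3"))
```

Note that a lower-case e is rejected here, because only the upper-case E appears in the list of allowed characters.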

NASA Ames uses three kinds of variables: INDEPENDENT, PRIMARY and AUXILIARY. The INDEPENDENT variable(s) are used to define the dimensions of the data, and so must be monotonic. There may be more than one INDEPENDENT variable, and so NASA Ames format can accommodate multi-dimensional data. For most UTLS-OZONE field campaign data, however, there will be only one INDEPENDENT variable in the file, and for most of the files this will be time. For each NASA Ames file there can be one, and only one, unbounded INDEPENDENT variable.
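The monotonicity requirement on an INDEPENDENT variable is easy to test before a file is written. The sketch below uses a hypothetical helper name and assumes that strictly increasing or strictly decreasing values are required, which is a stricter reading than the text above states.

```python
def is_monotonic(values):
    """Return True if values are strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


# Example: times in seconds, with and without a repeated sample.
print(is_monotonic([0.0, 10.0, 20.0]), is_monotonic([0.0, 10.0, 10.0]))
```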

PRIMARY variables are the main variables that are in the file: they are the quantities that have been observed or derived e.g. temperature, winds, ozone. There can be more than one PRIMARY variable in each NASA Ames file. This allows for the inclusion of more than one observed field in each file. The PRIMARY variables are considered as functions of the INDEPENDENT variables.

The AUXILIARY variables are used for ancillary information about the observations. They may, for instance, be used to represent additional dimension information: the INDEPENDENT variable is time but AUXILIARY variables are used to represent geographical position and height. For most files produced as part of UTLS-OZONE there will be no AUXILIARY variables.

The PRIMARY and AUXILIARY variables have scale factors and missing value flags associated with them. The scale factor can, of course, be 1.0. The missing value should be larger than any good data value within the file. The scale factors and missing values are defined in the File Header.
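When a file is read, the scale factor and missing value flag for each variable are typically applied as in the Python sketch below. The function name and sample numbers are illustrative only, and the sketch assumes that the missing value flag is compared with the value as recorded in the file, before scaling.

```python
def decode_values(raw, scale, missing):
    """Apply a scale factor, returning None where the recorded value is the missing flag."""
    # Assumption: the flag is tested against the recorded (unscaled) value.
    return [None if value == missing else value * scale for value in raw]


# Illustrative numbers: the flag 9999.0 is larger than any good value.
print(decode_values([105.0, 9999.0, 107.5], scale=0.5, missing=9999.0))
```

Choosing a flag larger than any good data value, as recommended above, means a missing entry can never be confused with a real measurement.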

File Format Index

The first line of every NASA Ames file records the length of the file header (including the first line) and a File Format Index. The File Format Index identifies the type of NASA Ames file: it specifies the number of INDEPENDENT variables (dimensions), the increment of the INDEPENDENT variables, the nature (numeric or character) of the variables, and whether AUXILIARY variables are present. The File Format Index is a four-digit integer whose most significant digit gives the number of INDEPENDENT variables in the file. For most UTLS-OZONE files the File Format Index is 1001. This File Format Index defines files with one unbounded real INDEPENDENT variable, real PRIMARY variables, and no AUXILIARY variables. If other File Format Indices are needed, please contact the BADC.
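The first line can be decoded with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and it assumes the two integers are separated by white space with the header length first. Only the leading digit of the File Format Index is interpreted, since that is the only digit whose meaning is given above.

```python
def parse_first_line(line):
    """Return (header_length, file_format_index, n_independent) from the first line."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected two integers on the first line: {line!r}")
    # Only the first two fields are used, so a trailing in-line comment is ignored.
    header_length, ffi = int(fields[0]), int(fields[1])
    # The most significant of the four digits is the number of INDEPENDENT variables.
    n_independent = ffi // 1000
    return header_length, ffi, n_independent


print(parse_first_line("25 1001"))
```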

The file header

The file header contains the information which defines the contents of the file. It consists of several pieces of information or subsections. The details of the file header are determined by the File Format Index, but all file headers contain the following information:

The first line of the file, as well as giving the File Format Index, gives the number of lines in the header. This allows a quick skip of the header when reading files.
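The quick skip described above might look like the following Python sketch. The function name is hypothetical, and the sample text is a toy with placeholder header lines, not a valid NASA Ames header. The sketch assumes the header length is the first integer on the first line.

```python
import io


def read_data_lines(stream):
    """Skip the header using the line count on the first line; return the data lines."""
    first = stream.readline()
    header_length = int(first.split()[0])
    # The count includes the first line, which has already been read.
    for _ in range(header_length - 1):
        stream.readline()
    return [line.rstrip("\n") for line in stream if line.strip()]


# Toy input: four header lines (placeholders), then two data lines.
sample = io.StringIO("4 1001\nheader a\nheader b\nheader c\n0.0 1.0\n1.0 2.0\n")
print(read_data_lines(sample))
```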

The order and format of the Data Originator, Storage information, and Date information are identical for all File Format Indices. The variable descriptions are File Format Index specific. The special comments and normal comments are simple text sections reserved for adding comments to the files. Each of the comment sections begins with a line indicating the number of lines in the text section. Suggested contents of the comment sections are given in UTLS-OZONE Suggested Metadata Standards.
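Because each comment section starts with its own line count, it can be read without knowing anything about its contents. The Python sketch below shows the idea for a list of header lines; the function name, the sample lines, and the assumption that the special comments come immediately before the normal comments are illustrative.

```python
def read_comment_block(lines, start):
    """Read a counted text block; return (comment_lines, index_after_block)."""
    # The first line of the block gives the number of comment lines that follow.
    count = int(lines[start].split()[0])
    first = start + 1
    return lines[first:first + count], first + count


# Hypothetical tail of a header: two special comment lines, then one normal comment line.
header_tail = ["2", "Special comment 1", "Special comment 2", "1", "Normal comment"]
special, next_index = read_comment_block(header_tail, 0)
normal, end_index = read_comment_block(header_tail, next_index)
print(special, normal, end_index)
```

A count of zero yields an empty block, so files without comments are handled by the same code.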

The description of the INDEPENDENT variables includes the interval between successive values, the name, and the number of values in the independent variable. The descriptions of the PRIMARY and AUXILIARY variables include the variable names, the scaling factors and any missing values.

The description of the file header given here is best fleshed out with examples. It should be noted that any line in the file header can have in-line comments. These are used to annotate the example files.

The Data Section

The format of the data section is determined by the File Format Index. For the details that apply to File Format Indices other than 1001, it is best to refer to Gaines and Hipskind. For File Format Index 1001 the data section is simply columns of figures, with each column representing a different variable. The first column gives the values of the INDEPENDENT variable. The other columns list the values of the PRIMARY variables.
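For File Format Index 1001, the column layout described above can be read with a sketch such as the following. The function name is hypothetical, and the sketch assumes that each line holds one complete record (one INDEPENDENT value followed by every PRIMARY value), separated by white space, and that the number of PRIMARY variables is already known from the file header.

```python
def parse_columns(data_lines, n_primary):
    """Split data lines into the INDEPENDENT column and a list of PRIMARY columns."""
    independent = []
    primary = [[] for _ in range(n_primary)]
    for line in data_lines:
        values = [float(field) for field in line.split()]
        if len(values) != n_primary + 1:
            raise ValueError(f"expected {n_primary + 1} values, got {len(values)}: {line!r}")
        # First column: INDEPENDENT variable; remaining columns: PRIMARY variables.
        independent.append(values[0])
        for column, value in zip(primary, values[1:]):
            column.append(value)
    return independent, primary


# Illustrative data: time followed by two PRIMARY variables.
time, (first_var, second_var) = parse_columns(["0.0 1.5 2.5", "1.0 3.5 4.5"], n_primary=2)
print(time, first_var, second_var)
```

The values returned here are as recorded in the file; scale factors and missing value flags from the file header still have to be applied.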