UTLS-OZONE Data Scoping Study
J. A. Kettleborough, L. J. Gray, and S. R. Williams. British Atmospheric Data Centre, Rutherford Appleton Laboratory.
Summary
Purpose
The British Atmospheric Data Centre (BADC) has been designated by NERC as the data centre for archiving data collected or produced as part of the UTLS-OZONE thematic program. All projects funded by NERC under the UTLS-OZONE thematic program have a commitment to archive the data they produce as part of the program at the BADC, if appropriate. The BADC will then be responsible for disseminating the data and providing a long-term archive.
The general objective of the scoping study was to determine the data requirements of the program. Having determined these requirements the study can be used to identify and prioritise the tasks needed to ensure that these requirements are met efficiently and effectively.
The specific objectives of the scoping study were to:
Sources
Most of the information used in compiling this study was acquired using a data questionnaire sent to all round 1 and round 2 funded projects. A copy of the data questionnaire used in the data scoping study can be found in appendix A. Copies of the individual returns to the questionnaire can be obtained from the BADC on request. The returns to the questionnaire have been supplemented by e-mail contact with researchers in the individual projects.
Project Types
One of the challenges of data management for the UTLS project is the diversity of the data sets produced as part of the project. For the purposes of data management the projects can be characterised as one of three types: field observations (FO), model studies and data analysis (MS/DA), and chemical kinetics experiments (CK).
Clearly some projects encompass more than one type. For instance field observation projects always include an element of data analysis. The division of projects into the different types is not intended to completely define the aims of any one project: it is intended to act as a simple categorisation to infer where the emphasis for data management tasks will be.
The data management demands for the three types of project are clearly different. Field observations, satellite data apart, tend to produce small data sets confined in time and space. The field observations need both to be disseminated rapidly and to be archived for the long term. Model studies and data analysis often use and produce large data sets. For these projects the 3rd party data requirements, such as meteorological analyses, are often of a higher priority than long term archiving. Long term archiving of the data sets produced by MS and DA projects is of low priority since they have a limited life span as the models evolve and improve. The results of chemical kinetics experiments are very small data sets that can be disseminated either simply by the originating investigator or in published papers. The archiving of chemical kinetics results is obviously important, but is usually achieved by results appearing in published papers.
FOs are the most demanding of the types of project for data management. The BADC has gained experience of dealing with field observations through the ACSOE project, although not all of the field observation projects funded under UTLS will fit the ACSOE model. Many of the FO projects are funded under UTLS but form part of larger field campaigns funded by other bodies such as the EU and NASA. These projects may have commitments to other funding bodies, which might include a commitment to archive data somewhere other than the BADC. The issues raised by the collaborations will be covered later in this document.
Projects
Table 1 lists the projects, with a classification of the project type, and any collaborations and/or prior data commitments. Of the thirteen FO projects nine involve collaborations with external projects, or have prior data commitments. Of the eight MS/DA projects three have the specific task of providing modeling support for non-UTLS projects.
Table 1 UTLS round 1 and 2 projects
Project | P.I. | Type | Collaboration
Airborne Measurements of Atmospheric Tracers in the UTLS for studies of atmospheric chemistry and transport | Jones, Pyle, and Gardiner | FO |
Atmospheric Chemistry and transport of Ozone in the UTLS (ACTO) | Penkett et al. | FO MS DA |
Campaign participation and modeling studies for APE-GAIA | Chipperfield, Roscoe | FO MS | APE-GAIA
Characterisation of volatile organic compounds in the UTLS | Pilling, Lewis, Bartle | FO | MAXOX
Development of scientific instrumentation for commercial aircraft | Jones | FO |
Extension of THESEO balloon-borne measurements of atmospheric tracers and chemically active gases in the mid-latitude lower stratosphere for test of atmospheric transport | Jones, Pyle, Woods | FO | THESEO
GCM measurements of halogenated source gases in the UTLS region in air samples from CARIBIC flight program | Penkett, Oram, Sturges | FO | CARIBIC
Improved upper air forecasting and analysis for APE-THESEO | MacKenzie | FO MS | APE-THESEO
Improvements to calculations of lower stratosphere exchange between southern mid-latitudes and Antarctica by radiosonde launches during the Airborne Polar Experiment | Roscoe, Shanklin | FO | APE Antarctic Archive
Ozone LIDAR investigations of the subtropical jetstream and of subtropical intrusions into mid-latitudes | Vaughan | FO | METRO/TRACAS
Support and Analysis for Far-Infrared emission remote sensing | Hamilton, Ade | FO | SAFIRE/IBEX
The Aberystwyth-Egrett Experiment | Whiteway, Vaughan | FO |
The role of frontal zones in determining upper troposphere chemical distributions | Browning et al. | FO MS | MAXOX
A General Circulation Model study of Ozone/Temperature interactions | Shine, Fish | MS |
Development of a microphysical and chemical model of cirrus clouds | Choularton | MS |
Evaluation of the Ozone and Water vapour data sets of the 40 year European Re-Analysis of the Global Atmosphere | Lahoz, O’Neill, Hoskins | DA | ECMWF
Forecast and Analysis of polar stratospheric clouds and cirrus for the NASA SOLVE arctic ozone campaign | Carslaw | MS | SOLVE
Gas Phase and aerosol composition of air entering the upper troposphere through convection | Parker, Carslaw | MS |
Studies of the tropopause region using version 5 data from MLS | Harwood | DA | UARS/MLS
The response of lower stratosphere ozone to solar variability and its impact on radiative forcing and climate | Haigh, Austin | MS |
Three Dimensional model studies for THESEO | Chipperfield | MS | THESEO
Laboratory studies of OH production and removal rates for the upper troposphere | Heard, Pilling | CK |
Laboratory studies of the heterogeneous interaction of pollutants from aircraft – of HNO3, H2O and soot aerosols | Cox | CK |
Laboratory, theoretical, and modeling studies of Gas phase peroxy radical reactions affecting the UTLS HOx budget | Rowley, Cox, Clary | CK |
Campaign Times
Table 2 shows the main times when data will be collected. For projects funded in rounds one and two, most of the data will be collected during the first 3 years of funding. This implies that the initial emphasis for data management should be development and population of the archive. Development of value added products will be delayed until later in the program.
Table 2 Times for Experimental Campaigns
Project | Campaign | 1998 | 1999 | 2000
Hamilton et al | SAFIRE/IBEX | | |
Penkett et al | CARIBIC | | |
Vaughan et al | TRACAS/METRO | | |
McKenzie | APE-THESEO | | |
Browning et al | MAXOX | | |
Jones et al | THESEO | | |
Pilling et al | MAXOX | | |
Chipperfield | APE-GAIA | | |
Carslaw | SOLVE | | |
Jones et al | EGRETT | | |
Whiteway et al | EGRETT | | |
Penkett et al | ACTO | | |
Data Sets
The data sets to be collected by UTLS-OZONE projects are listed in Table 3. The list is, at present, incomplete. More details of the various data sets will be added, as the information becomes available.
Table 3 Data Sets produced as part of Field Campaigns
Quantity | Instrument | Platform | Time | Project | Data set Size
CH4 | GC | EGRETT | 9912-0002 | Jones et al. |
3 CFCs | GC | EGRETT | 9912-0002 | Jones et al. | 2.5 Mbytes
Reflectance | MST Radar | Ground based | 0101-0201 | Whiteway et al |
O3 | Chemical Cell | Balloon | 0101-0201 | Whiteway et al |
O3 | LIDAR | Ground based | 0101-0201 | Whiteway et al |
u, v, w, T | | EGRETT | 0101-0201 | Whiteway et al |
u, v, w, T | | C-130 | 00Spring-00Aut. | Penkett ACTO |
O3 | | C-130 | 00Spring-00Aut. | Penkett ACTO |
H2O | | C-130 | 00Spring-00Aut. | Penkett ACTO |
PAN | | C-130 | 00Spring-00Aut. | Penkett ACTO |
>30 NMHCs | | C-130 | 00Spring-00Aut. | Penkett ACTO |
>40 Halocarbons | | C-130 | 00Spring-00Aut. | Penkett ACTO |
DMS | | C-130 | 00Spring-00Aut. | Penkett ACTO |
Acetone | | C-130 | 00Spring-00Aut. | Penkett ACTO |
CO | | C-130 | 00Spring-00Aut. | Penkett ACTO |
NO, NO2 | | C-130 | 00Spring-00Aut. | Penkett ACTO |
NOy | | C-130 | 00Spring-00Aut. | Penkett ACTO |
HNO3 | | C-130 | 00Spring-00Aut. | Penkett ACTO |
J(NO2) | | C-130 | 00Spring-00Aut. | Penkett ACTO |
J(O3->O1D) | | C-130 | 00Spring-00Aut. | Penkett ACTO |
Peroxides | | C-130 | 00Spring-00Aut. | Penkett ACTO |
Peroxy radicals | | C-130 | 00Spring-00Aut. | Penkett ACTO |
HCHO | | C-130 | 00Spring-00Aut. | Penkett ACTO |
CH4 | | C-130 | 00Spring-00Aut. | Penkett ACTO |
N2O | | C-130 | 00Spring-00Aut. | Penkett ACTO |
Aerosols | | C-130 | 00Spring-00Aut. | Penkett ACTO | 400 Mbytes
Benzene, Toluene | GC | C-130 | 9903-9904, 9908 | Pilling et al |
H2O | SAW | C-130 | | Jones |
CH4, H2O, CFCs | GC | Balloon | 9901-9912 | Jones et al | 1.2 Mbytes
Halocarbons | GC | Airborne | 9801-2010 | Penkett CARIBIC | 3 Mbytes
u, T | Radio Sonde | | 9909-9910 | Roscoe |
O3 | Chemical Cell | Ozone Sonde | 9904-0104 | Vaughan et al | 2.5 Mbytes
O3 | LIDAR | Ground based | 9904-0104 | Vaughan et al | 1 Mbyte
Reflectance | MST Radar | Ground based | 9904-0104 | Vaughan et al |
OH, NO2, HOCl, HO2, O3, H2O, HBr, HOBr, BrONO2, HNO3 | FIR | Various | 9807, 981115-981215, 990915-991015 | Hamilton |
Reflectance etc | Chilbolton Radar | Ground based | 9901-9905 | Browning et al | 600 Mbytes
NOx, NOy | | C-130 | 9901-9905 | Browning et al |
HCHO | | C-130 | 9901-9905 | Browning et al |
H2O2, RO2 | | C-130 | 9901-9905 | Browning et al |
CO | | C-130 | 9901-9905 | Browning et al |
Aerosols | | C-130 | 9901-9905 | Browning et al | 100 Mbytes
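The Time column of Table 3 appears to use compact numeric codes, read here as YYMM (for example 9912-0002 for December 1999 to February 2000) or YYMMDD (for example 981115-981215). The following is a minimal sketch of how such codes might be turned into dates when cataloguing the data sets. The YYMM/YYMMDD reading, the two-digit year pivot, and the function names are assumptions made for illustration; seasonal entries such as 00Spring-00Aut. are deliberately left unparsed.

```python
from datetime import date


def parse_code(code):
    """Convert a YYMM or YYMMDD string to a date (day defaults to 1).

    Returns None for anything that does not match, e.g. '00Spring'.
    Assumption: two-digit years >= 50 belong to the 1900s, others to the 2000s.
    """
    if not code.isdigit() or len(code) not in (4, 6):
        return None
    yy, mm = int(code[:2]), int(code[2:4])
    dd = int(code[4:6]) if len(code) == 6 else 1
    year = 1900 + yy if yy >= 50 else 2000 + yy
    try:
        return date(year, mm, dd)
    except ValueError:  # e.g. month 13
        return None


def parse_period(text):
    """Split 'START-END' into a (start, end) pair of dates.

    A single code with no '-' gives (start, None).
    """
    start, sep, end = text.partition("-")
    start_date = parse_code(start.strip())
    end_date = parse_code(end.strip()) if sep else None
    return start_date, end_date


print(parse_period("9912-0002"))
print(parse_period("981115-981215"))
print(parse_period("00Spring-00Aut."))
```

Entries that list several periods separated by commas would need to be split on the commas first and each part passed to `parse_period`.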
Collaborating Projects
As noted earlier, many of the UTLS-OZONE projects are part of collaborations with NASA and EU funded projects. The collaborations raise several questions for the dissemination and archiving of the data collected as part of these projects.
If the collaborating project has its own designated data centre, then, presumably, that data centre will be responsible for the dissemination of data during field campaigns. The main question is then how much of the data archive will subsequently be mirrored at the BADC. Although there is a commitment to archive all UTLS-OZONE funded data at the BADC it may be that it is simply not appropriate, and possibly meaningless, to archive the UTLS-OZONE project data in isolation from the collaborating project data. In these cases it may be that the BADC could act as a mirror to the primary archive, although this obviously would entail negotiation with the data centre of the collaborating project. Alternatively, and probably more practically, the BADC could maintain references to the collaborating project data centre. This would pass responsibility for the maintenance of some of the UTLS-OZONE data archive to another data centre, which may have its own problems.
If the collaborating project does not have its own designated data centre, then the BADC could act as the data centre for the project, if required. The dissemination of UTLS data to other researchers in the collaborating projects can be facilitated by the BADC. Collaborating researchers would be required to sign a data agreement, in a similar way to ACSOE. It would have to be decided, probably on a case by case basis, how much of the UTLS-OZONE archive the external researchers would have access to. Presumably collaborating project researchers would only need access to the data sets produced by the UTLS-OZONE project with which they are collaborating. If the BADC is to facilitate the dissemination of data to collaborating projects, it will be necessary to set up the relevant mechanism in the archive at an early stage.
3rd Party Data
Table 4 lists the 3rd party data requirements of the various UTLS-OZONE projects. The column labelled Status indicates whether the project already has access to the required data set. If the data set is already available, the column labelled Source indicates the source of the data. It should be noted that some data sets have separate agreements and are not necessarily available to all UTLS projects. These include the UKMO/ECMWF forecasts used during APE-THESEO (McKenzie), and, at least initially, the new 40-year ECMWF-ERA data (Lahoz et al).
Forecast data is required for mission planning during field campaigns. ECMWF forecast data can be obtained through the BADC, but requires an independent letter of application to go to the ECMWF.
Some of the 3rd Party data requirements are already being met by the BADC. These include the ECMWF analyses and UKMO UARS assimilated data. Plans are already underway to meet the UTLS requirement for access to the UKMO Unified Model data. No doubt other 3rd party data requirements will become evident throughout the period of the UTLS-OZONE program.
There is some demand for value added products such as trajectories, meteorological variables on isentropic surfaces, and meteorological variables along flight tracks. The BADC has developed a WWW interface to a trajectory model to help fulfil the demand for trajectory data by the UTLS-OZONE projects. The BADC will also investigate the possibility of adding tools to the BADC archive to calculate other value-added products.
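As an illustration of one such value-added product, the sketch below places a variable on an isentropic surface, that is, a surface of constant potential temperature. It computes potential temperature from temperature and pressure with the standard dry-air relation, then interpolates a profile linearly to a chosen level. This is only one way such a tool might work, not a description of any BADC software; the function names and the profile values in the usage lines are hypothetical.

```python
# Dry-air constants: gas constant R_d and specific heat c_p, in J kg-1 K-1.
KAPPA = 287.0 / 1004.0


def potential_temperature(temp_k, pressure_hpa, p0_hpa=1000.0):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return temp_k * (p0_hpa / pressure_hpa) ** KAPPA


def interpolate_to_theta(theta_profile, values, theta_target):
    """Linearly interpolate `values` to the level where theta == theta_target.

    Assumes theta_profile increases with height. Returns None when the
    target lies outside the profile.
    """
    for i in range(len(theta_profile) - 1):
        lo, hi = theta_profile[i], theta_profile[i + 1]
        if hi > lo and lo <= theta_target <= hi:
            weight = (theta_target - lo) / (hi - lo)
            return values[i] + weight * (values[i + 1] - values[i])
    return None


# Hypothetical profile: three levels with an arbitrary tracer value on each.
theta = [300.0, 320.0, 340.0]
tracer = [1.0, 3.0, 5.0]
print(interpolate_to_theta(theta, tracer, 330.0))
```

Linear interpolation in potential temperature is the simplest choice; a production tool might prefer interpolation in log pressure or a scheme matched to the analysis grid.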
Table 4 Third Party data Requirements of the UTLS-OZONE projects
Project | Data Required | Status | Source
Vaughan et al | ECMWF Met. Data | Semi – no isentropic data or PV | BADC
Vaughan et al | Meteosat Water Vapour | Not currently available |
Pilling et al | Trajectories | Yes | HYSPLIT Model
Oram et al | Trajectories | Yes | CARIBIC
McKenzie et al | ECMWF/UKMO data | Yes | UKMO/ECMWF
Browning et al | UKMO Unified model | Yes | UKMO-JCMM
Browning et al | Network Rain Rate | Yes |
Browning et al | ECMWF data | Yes | BADC
Parker et al | Met Data for initialisations | |
Jones | T, H2O from UKMO/ECMWF | Semi – no along flight track data | BADC
Harwood et al | ECMWF/UKMO | Yes | BADC UKMO
Chipperfield | UKMO UARS assimilations | Yes | BADC
Chipperfield | ECMWF Reanalysis | No |
Shine et al | O3 Trends | Yes | NASA
Lahoz et al | ERA O3 and H2O | Yes | ECMWF
Haigh et al | SBUV O3 | Yes |
Haigh et al | SOLSTICE/SUSIM | Yes |
Whiteway | ECMWF Forecasts | | BADC/ECMWF
Whiteway | Forecasts | |
Penkett et al ACTO | Trajectories | Yes | U. Reading/BADC
Data File Formats
The adoption of common, standard file formats both eases the dissemination of results and helps ensure the long-term integrity of a data set. Standard formats reduce any ambiguity in a data set and simplify the maintenance of reading and writing software and associated documentation. A good standard data format should be:
As well as being portable across platforms, a good standard format will have potential for use in many analysis packages. In deciding the standard file format to be adopted it is important to consider users' previous experience and the resources available for analysis.
Figure 1 indicates the experience of researchers funded by UTLS with various established data formats. NASA-Ames Format for Data Exchange is the format that has had most previous usage. This is an ASCII based format ideal for exchange of FO data. Each NASA Ames file consists of a header, which describes the contents of the file, followed by the data.
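The header-then-data layout can be illustrated with a short sketch. In the NASA Ames convention the first line of a file gives the number of header lines (NLHEAD) followed by the file format index (FFI), so a reader can separate the header from the data without interpreting every header field. The function name and the sample file below are hypothetical and written by hand for illustration; the sample follows the general shape of a simple one-dimensional file but is not a validated NASA Ames file, and the sketch does not interpret the individual header entries.

```python
def split_nasa_ames(text):
    """Split NASA Ames file text into (ffi, header_lines, data_rows).

    Relies only on the first line holding NLHEAD and FFI; the remaining
    header lines are returned as plain strings, not interpreted.
    """
    lines = text.splitlines()
    first = lines[0].replace(",", " ").split()
    nlhead, ffi = int(first[0]), int(first[1])
    header = lines[:nlhead]
    # Everything after the header is treated as whitespace-separated numbers.
    data = [
        [float(token) for token in line.split()]
        for line in lines[nlhead:]
        if line.strip()
    ]
    return ffi, header, data


# Hypothetical sample: 15 header lines followed by two data records.
SAMPLE = "\n".join([
    "15 1001",
    "Example Originator",
    "Example Organisation",
    "Ozone sonde profile (illustrative)",
    "UTLS-OZONE",
    "1 1",
    "1999 04 01 1999 04 02",
    "0.0",
    "Elapsed time (s)",
    "1",
    "1.0",
    "9999.0",
    "Ozone partial pressure (mPa)",
    "0",
    "0",
    "0.0 41.5",
    "1.0 42.0",
])

ffi, header, data = split_nasa_ames(SAMPLE)
print(ffi, len(header), data)
```

A full reader would go on to interpret the header fields, such as variable names, scale factors, and missing-value flags, according to the file format index.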
A binary format is often more useful for storage of large gridded data sets. Although researchers have about equal experience of the GRIB and NetCDF formats, some thought that NetCDF was easier to use. NetCDF is a portable binary format, with access routines in FORTRAN, C, C++, Perl, Java, and IDL.
Figure 2 shows the analysis/display methods used by the researchers. IDL is the most widely used, although clearly a significant number use Excel and MATLAB.
Given the researchers' previous experience of data formats and of analysis/plotting packages, UTLS-OZONE will adopt NASA-Ames as the standard data format. Large gridded data sets will use NetCDF. NASA-Ames files are readable by IDL, Excel, and MATLAB. NetCDF files are supported by IDL and by some freely available plotting packages, such as FERRET. A Live Access Server can also be used to give WWW access to NetCDF files.
Appendix A UTLS-OZONE Data Questionnaire
Project Details
Title of Project: |
PI's Name: |
Your Name (if not the PI): |
Your Address: |
E-mail: |
Data sets that you will produce
2.1 What data sets are you likely to produce as part of your UTLS-Ozone funded work? Please give some indication of the size (to an approximate order of magnitude) of each of these data sets.
2.2 Which of these data sets do you consider might be worth archiving in the long-term?
2.3 What will be the primary times of your data collection and/or model result production?
Start time |
Finish time |
2.4 Do you think that other projects within UTLS-Ozone would find it useful to have access to your data?
Yes |
No |
2.5 Will you need to share data with other institutes?
Within your UTLS project |
With other UTLS projects |
Outside UTLS (e.g. with EU partners) |
If so, please give details:
Do you already have plans for doing this, or would you like assistance from the BADC?
Distribute data yourselves |
Use the BADC |
2.6 When do you anticipate submitting data to the BADC archive?
As soon as possible to allow you to use the BADC to share and validate data with other UTLS projects. |
|
When you are happy that the data is in a suitably validated form for other people to use. |
|
At the end of the project. |
|
Not Applicable |
Third Party Data Requirements
3.1 If you anticipate using any third party data (e.g. Met. data or trajectories), please tell us what they are.
3.2 Do you already have access to these data sets? If not how do you anticipate getting them?
3.3 Are there any issues of intellectual property rights that will restrict the availability of these data to other UTLS projects or that will prevent them from being added to the "public" UTLS data sets at the end of the UTLS project?
Computing Issues
4.1 Which of these standard data formats do you have experience of working with?
netCDF |
NASA Ames |
GRIB |
HDF |
Please list any other formats that you might like to use:
Would you be prepared to write your data to standard data format(s), if assistance was available from the BADC to help with this job?
4.2 What computer hardware are you likely to use to work on your data? Please tick any appropriate boxes:
Supercomputer |
Linux workstation |
Sun Unix workstation |
Windows 95/98/NT PC |
HP Unix workstation |
Macintosh PC |
DEC Unix workstation |
Other - please specify:
4.3 If you intend to use commercial software to manipulate your data, for example, IDL, PV-Wave or Excel, please list them in the space below:
Anything Else?
Are there any other UTLS-Ozone data issues with which the BADC could help you? Do you have any other comments about the UTLS-Ozone data management? Are you happy with the UTLS-Ozone data protocol as it stands at present?