Quality Assurance Checks and Data-Processing Activities Performed by CDIAC
An important part of the data documentation and dissemination process at CDIAC is the quality
assurance (QA) of data before distribution. Data received at CDIAC are rarely in perfect condition for
immediate distribution, regardless of the source. To guarantee data of the highest possible quality, CDIAC
conducts extensive QA reviews, which involve examining the data for completeness, reasonableness, and
accuracy. Although these reviews have common objectives, they are tailored to each data set, often requiring
extensive programming efforts. This time-consuming process is an important component in the value-added
concept of assuring accurate, usable data for researchers.
The NOAA/CMDL flask CO2 database contains CO2 measurements and other parameters from
many sites. That only a few minor problems were discovered by CDIAC reflects the considerable effort and
scrutiny exerted by the NOAA/CMDL Carbon Cycle Group in providing high-quality, well-documented,
consistently formatted data to an international scientific audience. The few problems encountered by CDIAC
were quickly addressed and resolved by the NOAA/CMDL Carbon Cycle Group. The following summarizes
the QA checks and data-processing activities performed by CDIAC.
QA Checks
CDIAC obtained the original NOAA/CMDL flask CO2 database from the NOAA/CMDL Carbon Cycle
Group anonymous FTP area as two UNIX "tar" files. These files were transferred to CDIAC using FTP
commands and extracted (i.e., untarred). Working copies of the files were created and processed in the
following ways:
- All data files contributed by the NOAA/CMDL Carbon Cycle Group were checked to ensure that each was
formatted as stated, contained the data described, and contained the period of record specified.
- Each file was checked to ensure that the prescribed missing-value conventions (i.e., -999.99 for
CO2 mixing ratios and 99 for date parameters) were applied consistently across all files and that
no other missing-value designations were used in the files.
- Frequencies of occurrence were generated for the instrument codes and data selection codes to
assess the abundance of each code, check for invalid codes, and permit documentation of all possible
codes.
- Mean values were generated for each numeric variable in each data file, and these values were
checked for reasonableness (e.g., a range of month values from 1 to 12).
- All data were plotted. Extreme values were identified, and these values were traced to the
original data files to ensure that nonbackground flag codes were associated with each value.
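The checks listed above can be illustrated with a minimal Python sketch. It is not the software CDIAC used. The record layout (dictionaries with "co2", "month", and "flag" keys), the function names, and the plausibility bounds of 250 to 500 are hypothetical and chosen only for illustration; the missing-value conventions (-999.99 and 99) are the ones stated above.

```python
from collections import Counter

CO2_MISSING = -999.99  # missing-value convention for CO2 mixing ratios (from the text)
DATE_MISSING = 99      # missing-value convention for date parameters (from the text)


def code_frequencies(records, field):
    """Count how often each code occurs in the given field."""
    return Counter(rec[field] for rec in records)


def unexpected_codes(records, field, valid_codes):
    """Return the codes (with counts) that are not in the documented set."""
    freq = code_frequencies(records, field)
    return {code: n for code, n in freq.items() if code not in valid_codes}


def suspicious_co2(records, low=250.0, high=500.0):
    """Return records whose CO2 value is neither the missing-value code
    nor inside the assumed plausible range [low, high] (bounds are hypothetical)."""
    return [
        rec for rec in records
        if rec["co2"] != CO2_MISSING and not (low <= rec["co2"] <= high)
    ]


def bad_months(records):
    """Return records whose month is neither the missing-value code nor 1..12."""
    return [
        rec for rec in records
        if rec["month"] != DATE_MISSING and not (1 <= rec["month"] <= 12)
    ]
```

A value such as -99.9 in the CO2 field would be reported by suspicious_co2, which is how a second, undocumented missing-value designation would surface in this sketch.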
Data Processing
- CDIAC did not alter the format of the NOAA/CMDL flask CO2 database files. The files
distributed by CDIAC are identical in format to the files distributed by NOAA/CMDL.
- To assist users wishing to retrieve and process fewer files, two files were created by CDIAC
from the more than 100 files distributed by NOAA/CMDL. The first file (all.co2) contains CO2
mixing ratios from all individual flask air samples for all sites, excluding the shipboard measurements.
The second file (allmm.co2) contains the monthly atmospheric CO2 measurements for all sites,
again excluding the shipboard measurements.
- The annual values shown in the data listings in Appendix B were generated by CDIAC for those
wishing to have annual atmospheric CO2 mixing ratios. Annual values were calculated
arithmetically for years having all 12 monthly values. These values are not provided in the
machine-readable data files.
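The annual calculation described above can be sketched as follows. This is an illustration only, under two assumptions not confirmed by the text: that "calculated arithmetically" means the simple arithmetic mean of the 12 monthly values, and that monthly values carrying the -999.99 missing-value code count as absent. The function name and the input layout (a dictionary keyed by year and month) are hypothetical.

```python
def annual_means(monthly, missing=-999.99):
    """Compute annual means from monthly values.

    monthly: dict mapping (year, month) -> monthly CO2 value.
    Only years with all 12 non-missing monthly values get an annual mean.
    """
    by_year = {}
    for (year, month), value in monthly.items():
        if value != missing:
            by_year.setdefault(year, {})[month] = value

    all_months = set(range(1, 13))
    return {
        year: sum(values.values()) / 12
        for year, values in sorted(by_year.items())
        if set(values) == all_months  # skip years with any month absent
    }
```

A year with even one missing month produces no annual value in this sketch, matching the rule that annual values were calculated only for years having all 12 monthly values.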