UNITED STATES HISTORICAL CLIMATOLOGY NETWORK DAILY TEMPERATURE, PRECIPITATION, AND SNOW DATA Contributed by: C. N. Williams, Jr., R. S. Vose, D. R. Easterling, and M. J. Menne National Oceanic and Atmospheric Administration National Climatic Data Center Asheville, North Carolina Documentation Prepared by: Dale P. Kaiser and Linda J. Allison Carbon Dioxide Information Analysis Center Environmental Sciences Division OAK RIDGE NATIONAL LABORATORY (ORNL) Oak Ridge, Tennessee 37831-6335 managed by UT-Battelle, LLC for the U.S. DEPARTMENT OF ENERGY under contract DE-AC05-00OR22725 August 2006 Contents Abstract Name of the Numeric Data Package Principal Investigators Keywords 1. Background Information 2. Description of the Database 3. Data Inhomogeneities and Nonclimatic Influences in the Data 4. Quality assurance of the HCN/D database NCDC QA Checks and Adjustments CDIAC QA Checks and Modifications 5. File Descriptions FORTRAN AND SAS Data Retrieval Programs Station Inventory File for Daily USHCN Station History File HCN/D Data Files 6. How to Obtain the Data and Documentation 7. References Appendix B: State Numbers and abbreviations Used for the 48 States in the HCN/D Database Abstract Williams, C. N., R. S. Vose, D. R. Easterling, and M. J. Menne, 2006. United States Historical Climatology Network Daily Temperature, Precipitation, and Snow Data. ORNL/CDIAC-118, NDP-070. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee. doi: 10.3334/CDIAC/cli.ndp070 This document describes a database containing daily observations of maximum and minimum temperature, precipitation amount, snowfall amount, and snow depth from 1062 observing stations across the contiguous United States. These stations are a subset of the 1221-station U.S. Historical Climatology Network (HCN), a monthly database compiled by the National Climatic Data Center (Asheville, North Carolina) that has been widely used in analyzing U.S. climate. 
The earliest station record begins in 1871 (Charleston, South Carolina); records from 158 stations begin prior to 1900. Data from 1005 of these daily records extend through 2000, while 920 station records extend through 2005. Most station records are essentially complete for at least 50 years; the latest beginning year of record is 1948. The daily resolution of these data makes them extremely valuable for studies attempting to detect and monitor long-term climatic changes on a regional scale. Studies using daily data may be able to detect changes in regional climate that would not be apparent from analysis of monthly temperature and precipitation data. Such studies may include analyses of trends in maximum and minimum temperatures, temperature extremes, daily temperature range, precipitation "event size" frequency, and the magnitude and duration of wet and dry periods. The data are also valuable in areas such as regional climate model validation and climate change impact assessment. This database is available free of charge from CDIAC as a numeric data package (NDP). This file describes the HCN/D station network and gives details of the format and content of all files.

Keywords: United States; HCN; HCN/D; historical; climate; climatology; daily data; temperature; maximum temperature; minimum temperature; precipitation; snowfall; snow depth

1. Background Information

Over the past few decades, numerous global, hemispheric, and regional meteorological databases have been assembled for use in studying the nature and variability of the earth's climate. This work has been largely inspired by growing international concern over the potential climatic impacts of increasing atmospheric concentrations of greenhouse gases. While the parameters important in the study of climate change are myriad, those that seem to have received the most attention are near-surface air temperature (herein referred to as temperature) and precipitation.
There are many reasons for this, including (1) the spatial and temporal variability of these parameters affects ecosystems, agriculture, water resources, human health, and energy needs and consumption; (2) instrumental records of these variables are relatively long, beginning in the 1800s in many regions of the northern hemisphere; and (3) analyses of temperature data from around the globe show an increase in global mean surface temperature of about 0.6 deg C since the late 19th century (IPCC 2001).

The suitability of modern historical temperature and precipitation data for climate change studies depends on their reliability and accuracy. Most records of significant length, regardless of source, are likely to contain biases or inhomogeneities resulting from changes in the environment or operation of individual observing sites (e.g., urbanization, station moves, and instrument and time of observation changes). The process of identifying and removing these nonclimatic effects is complex and tedious, and has been undertaken on large scales in such studies as Jones (1994), Jones et al. (1986; 1997), Vinnikov et al. (1990), Peterson and Vose (1997), and Quinlan et al. (1987).

The work of Quinlan et al. (1987) involved the compilation of a database containing monthly temperature and precipitation data from a network of 1219 U.S. stations known as the Historical Climatology Network (HCN). The compilation was performed at the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA) in Asheville, North Carolina, and sponsored by the Carbon Dioxide Research Program of the U.S. Department of Energy. The project arose from the need for an accurate, unbiased, and modern historical climate record suitable for detecting and monitoring secular changes in regional climate in the contiguous United States.
The quality of the HCN data was enhanced with the use of outlier and areal edits, and the data were corrected for time of observation differences, instrument changes, instrument moves, station relocations, and urbanization effects (Karl et al. 1986; Karl and Williams 1987). The HCN has been updated several times since its inception, most recently by Williams et al. (2004). Some of the stations in the HCN are first-order weather stations, but the majority were selected from approximately 5000 U.S. cooperative weather stations. The first database released by NCDC to contain daily data from HCN stations, the HCN/Daily (HCN/D; Hughes et al. 1992; hereafter H92) contained daily maximum and minimum temperatures and precipitation totals from 138 select U.S. stations. The temperature and precipitation records from these stations were considered to be the most reliable, internally consistent, and unbiased records from the HCN. These records were compiled from digital and nondigital data sets archived at NCDC that come from a variety of sources, including climatological publications, universities, federal agencies, individuals, and data archives. Records were subjected to extensive manual and automated quality assurance (QA) checks. The selection of stations for inclusion in H92 was performed with the following data quality issues in mind. 1. The degree to which each station maintained a constant observation time for maximum and minimum temperatures, excursions from a station's predominant observing time of no more than four years being desired. 2. At least 95% of a station's pre-1951 data should be contained in NCDC digital daily archives. 3. A station's potential for heat island bias over time should be low. 4. Quality assessments based upon the decile ranking assigned by Karl et al. (1990) to the stations' monthly maximum/minimum temperature data for certain quality characteristics. 
Since the release of H92, much more work has been conducted at NCDC involving compilation and digitizing of daily data. However, to enable the compilation of a database providing better spatial coverage of the contiguous United States, the four station selection criteria listed above were not strictly adhered to in later versions of the HCN/D. 2. Description of the Database The data presented in this package are daily observations of maximum and minimum temperature, precipitation amount (liquid equivalent), snowfall amount, and snow depth from 1062 of the 1221 stations comprising the HCN. Data from 1005 of these daily records extend through 2000, while 920 of these extend through 2005. Most station records are essentially complete for at least 50 years; the latest beginning year of record is 1948. Records from 158 stations begin prior to 1900, with that of Charleston, South Carolina beginning the earliest (1871). While the stations selected in H92 were determined to be superior with regard to the above station selection criteria, the resulting network's spatial coverage of the United States was lacking in several regions. By including many more stations (mainly from the U.S. cooperative station network), and performing the needed QA checks, coverage has now been vastly improved. This figure shows the distribution of the 1062 HCN/D stations. All of the contiguous 48 states are represented by stations in the database. 3. Data Inhomogeneities and Nonclimatic Influences in the Data For users to correctly interpret records from the HCN/D in their analyses, it is important to describe a few caveats inherent in the recording of daily meteorological data in the United States. These relate primarily to observations of maximum and minimum temperature at U.S. cooperative network stations. As pointed out in Sect. 
2, the criterion deemed most important in the H92 station selection process was the degree to which a station maintained a constant observing time, i.e., a fixed observing "day," for maximum and minimum temperatures. The importance of maintaining a consistent schedule for observing daily maximum and minimum temperature has been illustrated by several studies, such as Mitchell (1958), Baker (1975), and Schaal and Dale (1977). These studies examined the effects of changing observation time on the daily mean temperature, customarily determined for U.S. stations by adding the maximum and minimum temperature observed over a prescribed 24-hour observing day and dividing by 2. At first-order National Weather Service (NWS) stations (some of which are included in the HCN/D), the 24-hour observing day ends at or near local midnight. Monthly and annual mean temperatures derived using the mean of the daily maximum and minimum temperatures from such stations have been shown by Baker (1975) and Mitchell (1958) to correspond closely with those computed using the stations' hourly observations. While this evidence lends clear support to the practice of ending the observing day at midnight, cooperative observers (comprising most of the HCN/D stations) generally do not take readings at this hour. Most end their observing day in the late afternoon or early evening, with a smaller number choosing a time between 0700 and 0800 local standard time (LST). The systematic biases introduced to the daily means by varying observing times can have far-reaching effects, as the daily mean temperatures form the basis of monthly and annual mean temperatures, and also monthly, seasonal, and annual heating degree days, cooling degree days, and growing degree days. Information on the LST of maximum/minimum temperature observations at each station is contained in the station history file for the HCN/D which is described in Sect. 5. 
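The daily mean definition used in the studies cited in this section, and the degree-day totals that build on it, can be illustrated with a minimal Python sketch. The function names are hypothetical, and the 65 deg F base temperature is the customary U.S. convention assumed here for illustration; it is not a value taken from this documentation. The sketch applies no time-of-observation correction, so any such bias in the maximum and minimum temperatures carries straight through to the results.

```python
def daily_mean(tmax, tmin):
    """Daily mean temperature as (maximum + minimum) / 2 for one observing day."""
    return (tmax + tmin) / 2.0


def degree_days(tmax, tmin, base=65.0):
    """Return (heating, cooling) degree days for one observing day.

    The base of 65 deg F is an assumed convention, not taken from NDP-070.
    """
    mean = daily_mean(tmax, tmin)
    heating = max(base - mean, 0.0)  # how far the daily mean falls below the base
    cooling = max(mean - base, 0.0)  # how far the daily mean rises above the base
    return heating, cooling
```

Because both quantities are computed from the daily mean, a systematic shift in that mean caused by the observing time accumulates in monthly and seasonal degree-day totals.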
Users are strongly urged to make use of this time of observation information in analyses where homogeneity of observing practices across a network of selected stations would be considered important. While combining daily temperature (or precipitation) data from stations which use different observing days complicates data compilation and quality control and also distorts areal patterns, of perhaps more fundamental importance is the degree of homogeneity over time of observing practices at individual stations and the associated implications for studies of climatic trends. Many stations in the HCN/D depart to varying degrees from a single, fixed observation time for maximum and minimum temperatures over their period of record (see the station history file). This often results from observing responsibilities being transferred between individuals and may even result in a station moving to a new location in the area (relocation information is also contained in the station history file). The user is referred to the work of Mitchell (1958), Baker (1975), and Schaal and Dale (1977) for detailed illustrations of how such changes in observing time are likely to bias calculations involving maximum and minimum temperature data. Two main conclusions common to all three studies are (1) mean temperature calculations using 24-hour maximum/minimum temperatures from PM observations are biased high with respect to midnight observations, while those from AM observations are biased low, and (2) the magnitude of these biases is dependent upon time of year and a station's climatic regime. Another factor users should be aware of pertains to thermometers used at the HCN/D stations. In 1984, the NWS introduced a new Maximum/Minimum Temperature System (MMTS) at cooperative network observing stations. Through 1994, 645 out of 1062 (about 60%) of the HCN/D stations had installed an MMTS (the station history file identifies these stations). 
Concerns have arisen about the calibration of this system as compared to that of the earlier thermometric system. The new system is thermistor-based with a "beehive like" instrument shelter, whereas the older systems consisted of liquid-in-glass thermometers, mounted inside a Cotton Region Shelter (Stevenson Screen). Quayle et al. (1991) looked into performance differences of the two systems and found that the new system produces maximum temperatures about 0.3 deg C lower and minimum temperatures about 0.4 deg C higher than the old system. Unfortunately, because large samples of side-by-side overlapping measurements are not available, site-specific corrections cannot yet be derived and only large-scale temperature changes can be corrected. Furthermore, daily biases, which are likely to be dependent on synoptic conditions, are unlikely to be the same from day to day. Thus, to date there has been no attempt to adjust the daily temperature data from the HCN/D for these instrument-induced biases.

In summary, while the HCN/D stations represent the best long-term climate records available for the contiguous U.S., no station is completely free of changes that could possibly affect its instrumental record; therefore, it is recommended that users make full use of the information contained in the station histories when performing analyses with these data. The data have not been adjusted for station relocations, heat island effects, instrument changes, or time of observation biases. The nature of inhomogeneities arising from such factors depends on a station's climatic regime.

4. Quality Assurance of the HCN/D Database

An important part of the numeric data packaging process at CDIAC is the quality assurance (QA) of data before distribution. Data received at CDIAC are rarely in perfect condition for immediate distribution, regardless of their source. To guarantee data of the highest possible quality, CDIAC conducts extensive QA reviews.
Reviews involve examining the data for completeness, reasonableness, and accuracy. Although they have common objectives, these reviews are tailored to each data set, often requiring extensive programming efforts. Although time-consuming, the QA process is an important component in the value-added concept of ensuring accurate, usable data for researchers. Through the years, NCDC has conducted extensive manual and automated QA assessments of the HCN/D data. Although the data sent by NCDC were in excellent condition, CDIAC still conducted QA checks on the data and found some minor discrepancies. The following summarizes the major aspects of QA work performed by NCDC and CDIAC, respectively. Users may also find additional details of QA work performed at NCDC in NCDC's Summary of the Day (SOD, TD-3200) documentation, available over the internet via NCDC's web site (http://www.ncdc.noaa.gov). (From the NCDC homepage, use the search feature to search for "Summary of the Day" or "TD-3200".) NCDC QA Checks and Adjustments A general overview of the history of HCN/D QA efforts conducted at NCDC, paraphrased from NCDC's SOD documentation, is as follows. In 1982, historical data were converted from various digital files to an "element" (observation type; e.g., maximum temperature, precipitation amount, etc.) type of file structure. At the time, data were only processed through a gross value check. Shortly thereafter, NCDC instituted a greatly enhanced computer algorithm for operational, automated validation of digital archives. The revised edit system performed internal consistency checks and evaluated against surrounding stations in addition to climatological limits and serial checks. Quality control flags were appended to each element to show how they fared during the edit procedures and to indicate what, if any, action was taken. 
Prior to 1982, the files consisted only of raw, observed data values; both observed and edited values (as necessary) have been supplied from 1982 onward. Since 1982, the operational edit system at NCDC has evolved into a Geographical Edit and Analysis (GEA) expert system which affords interactive graphics presentations for the human editors. As of 1991, additional capabilities to detect systematic errors in the daily data have been incorporated using the Validation of Historical Daily Data (ValHiDD) system. Furthermore, in November 1993, the entire historical period of record was independently processed (no human editing) through the ValHIDD system for five data elements (the five variables included in NDP-070). Hence, the entire period of record for these elements now comprises observed (raw) and edited values. The following is a list of items from H92 that constitute some of the main human and automated QA checks performed on the data by NCDC. 1. Monthly mean values of maximum and minimum temperature, computed from the HCN/D data, were compared to their respective unadjusted monthly means from the HCN. All conflicts were investigated and resolved, with verification based on manuscript or published sources. 2. Checks were performed to ensure that no monthly mean values of maximum and minimum temperature calculated from a station's daily data were above (below) the monthly state extremes of maximum (minimum) temperature. 3. Any daily precipitation total exceeding 5 in. was verified against manuscript or published sources. 4. Checks were implemented to ensure that maximum temperatures were never less than minimum temperatures on the day of occurrence, the preceding day, and the following day. Conversely, checks were performed to ensure that minimum temperatures were never greater than maximum temperatures on the day of occurrence, the preceding day, and the following day. 5. 
Temperature data from stations that took readings during the morning over some period have been checked for any date shifting resulting from observers assigning readings to the calendar day of occurrence (the previous day in the case of maximum temperature) rather than the observation day. Such readings were switched back to the day of observance as part of the manual QA checks on the HCN/D data.

CDIAC QA Checks and Modifications

1. Because each record in an HCN/D file contains 31 daily data elements (to allow for 31 days in a month), elements pertaining to nonexistent dates were checked to ensure that they contained missing data indicators with blank flag spaces (the prescribed conventions).

2. A few data measurement and data quality flags were found in the data that are not detailed in NCDC's SOD documentation. Records containing these were submitted to NCDC. In some cases, consultation with NCDC determined that these flags were due to corrupted data elements, which have since been corrected. NCDC was unable to resolve the meanings of a few unknown data measurement flags. NCDC acknowledges these flag caveats in the following passage from the SOD documentation: "Other values occasionally appear in Data Measurement Flag 1 for which documentation is not currently available, e.g., "C" and "s"."

3. All data records were checked to ensure that the number of days in the month (specified in each record) was correct for the year and month of each record.

5. File Descriptions

This section describes the files contained in this database. In addition to the 48 actual daily data files (one per state), there are station inventory and station history files and FORTRAN and SAS codes for reading the various files.

FORTRAN AND SAS Data Retrieval Programs

These files are provided for the benefit of users with FORTRAN or SAS on their systems, enabling them to read any of the data files in this database using these software packages.
The program files are:

invent.for - a FORTRAN program for reading the station inventory file (invent.txt)
history.for - a FORTRAN program for reading the station history file (history.txt)
data.for - a FORTRAN program for reading any of the 48 daily data files
invent.sas - a SAS program for reading the station inventory file (invent.txt)
history.sas - a SAS program for reading the station history file (history.txt)
data.sas - a SAS program for reading any of the 48 daily data files.

Station Inventory File for the Daily USHCN (invent.txt)

The station inventory file for the HCN/D data set is sorted by two-digit state code (see Appendix B for a list of these codes) and four-digit Cooperative Network Index (CNI), with one record per station containing two-letter state abbreviation, two-digit state code, four-digit CNI, station name, latitude and longitude (both in decimal degrees), elevation (ft), and five columns containing the beginning month and year of daily observations of maximum temperature, minimum temperature, precipitation amount, snowfall amount, and snow depth. The file may be read using the following FORTRAN format:

      INTEGER STCODE,CNI,ELEV,MOTMAX,YRTMAX,MOTMIN,YRTMIN,
     +MOPRCP,YRPRCP,MOSNOW,YRSNOW,MOSNWD,YRSNWD
      REAL LAT,LON
      CHARACTER*2 STATE
      CHARACTER*30 STNAME
      READ(5,100,END=99)STATE,STCODE,CNI,STNAME,LAT,LON,
     +ELEV,MOTMAX,YRTMAX,MOTMIN,YRTMIN,MOPRCP,YRPRCP,
     +MOSNOW,YRSNOW,MOSNWD,YRSNWD
  100 FORMAT(A2,1X,I2,I4,1X,A30,1X,F5.2,1X,F7.2,2X,
     +       I4,5(1X,I2,1X,I4))

or by using the SAS format:

      DATA INVENT;
      LENGTH STNAME $ 30;
      INFILE 'INVENT.TXT';
      INPUT STATE $ 1-2 STCODE 4-5 CNI 6-9 STNAME $ 11-40
            LAT 42-46 LON 48-54 ELEV 57-60
            MOTMAX 62-63 YRTMAX 65-68
            MOTMIN 70-71 YRTMIN 73-76
            MOPRCP 78-79 YRPRCP 81-84
            MOSNOW 86-87 YRSNOW 89-92
            MOSNWD 94-95 YRSNWD 97-100;

Stated in tabular form, the contents of the station inventory file include the following.
Variable   Variable    Variable   Starting   Ending
           type        width      column     column

STATE      Character      2          1          2
STCODE     Numeric        2          4          5
CNI        Numeric        4          6          9
STNAME     Character     30         11         40
LAT        Numeric        5         42         46
LON        Numeric        7         48         54
ELEV       Numeric        4         57         60
MOTMAX     Numeric        2         62         63
YRTMAX     Numeric        4         65         68
MOTMIN     Numeric        2         70         71
YRTMIN     Numeric        4         73         76
MOPRCP     Numeric        2         78         79
YRPRCP     Numeric        4         81         84
MOSNOW     Numeric        2         86         87
YRSNOW     Numeric        4         89         92
MOSNWD     Numeric        2         94         95
YRSNWD     Numeric        4         97        100

where

STATE is the two-letter state abbreviation;
STCODE is the two-digit state code (01-48);
CNI is the four-digit Cooperative Network Index;
STNAME is the station name;
LAT is the latitude of the station in decimal degrees north;
LON is the longitude of the station in decimal degrees west;
ELEV is the elevation of the station in feet;
MOTMAX is the beginning month of the daily maximum temperature record for a station;
YRTMAX is the beginning year of the daily maximum temperature record for a station;
MOTMIN is the beginning month of the daily minimum temperature record for a station;
YRTMIN is the beginning year of the daily minimum temperature record for a station;
MOPRCP is the beginning month of the daily precipitation amount record for a station;
YRPRCP is the beginning year of the daily precipitation amount record for a station;
MOSNOW is the beginning month of the daily snowfall record for a station;
YRSNOW is the beginning year of the daily snowfall record for a station;
MOSNWD is the beginning month of the daily snow depth record for a station; and
YRSNWD is the beginning year of the daily snow depth record for a station.

Station History File (history.txt)

The station history file provides valuable information concerning each station in the HCN/D. This file documents station moves and instrument changes, lists station observers and observation times, and identifies suspect fields.
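For users without FORTRAN or SAS, the station inventory file (invent.txt) described just before this section can also be read by slicing each record at the documented column positions. The following Python sketch is an illustration only and is not one of the programs distributed with this package. The names FIELDS and parse_inventory_line are hypothetical, and it assumes that records are plain fixed-width text laid out exactly as in the inventory table. Blank fields are returned as None, and LON is returned as stored, with no sign change applied.

```python
# (name, starting column, ending column, converter); columns are 1-based and
# inclusive, copied from the station inventory table in this documentation.
FIELDS = [
    ("STATE", 1, 2, str), ("STCODE", 4, 5, int), ("CNI", 6, 9, int),
    ("STNAME", 11, 40, str), ("LAT", 42, 46, float), ("LON", 48, 54, float),
    ("ELEV", 57, 60, int),
    ("MOTMAX", 62, 63, int), ("YRTMAX", 65, 68, int),
    ("MOTMIN", 70, 71, int), ("YRTMIN", 73, 76, int),
    ("MOPRCP", 78, 79, int), ("YRPRCP", 81, 84, int),
    ("MOSNOW", 86, 87, int), ("YRSNOW", 89, 92, int),
    ("MOSNWD", 94, 95, int), ("YRSNWD", 97, 100, int),
]


def parse_inventory_line(line):
    """Split one fixed-width inventory record into a dict keyed by variable name."""
    # Pad short lines so that slicing never runs past the end of the record.
    line = line.rstrip("\r\n").ljust(100)
    record = {}
    for name, start, end, convert in FIELDS:
        text = line[start - 1:end].strip()
        # Blank fields become None (an assumption of this sketch).
        record[name] = convert(text) if text else None
    return record
```

A whole file could then be read with, for example, `[parse_inventory_line(row) for row in open("invent.txt")]`, where the file path is whatever location the user has saved the file to.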
For each station in the file there is an identification record followed by multiple data records describing station observing details over its period of record. The file may be read using the following FORTRAN code:

      CHARACTER*2 STATE
      CHARACTER*30 CURRNAME
      CHARACTER*16 COUNTY
      CHARACTER*25 XREF
      CHARACTER*150 BLANK
      CHARACTER*1 STATUS, DISTUNIT, POUNIT
      CHARACTER*6 LATNORTH
      CHARACTER*7 LONGWEST
      CHARACTER*3 DIRECT, DIRECTPO
      CHARACTER*28 NAME
      CHARACTER*10 QUALIF
      CHARACTER*4 TIMEOBS
      CHARACTER*2 PCPHT, PCTHT
      CHARACTER*46 OBSNAME
      INTEGER STANUM, STANUM2, DIVISION, MOBEG, DAYBEG, YRBEG
      INTEGER MOEND, DAYEND, YREND, SUSP(15), DISTANCE, ELEV
      INTEGER DISTPO, INSTRU(36), PUB(16), NUMOBS
      OPEN(UNIT=5,FILE='history.txt')
   10 READ (5,100) STANUM, STATE, STATUS, DIVISION, CURRNAME,
     1 COUNTY, XREF, BLANK
   20 READ (5,110,END=999) STANUM2
      BACKSPACE 5
      IF (STANUM .NE. STANUM2) GOTO 10
      READ (5,115) STANUM2, MOBEG, DAYBEG, YRBEG,
     1 MOEND, DAYEND, YREND, (SUSP(I),I=1,15), LATNORTH,
     1 LONGWEST, DISTANCE, DISTUNIT, DIRECT,
     1 ELEV, DISTPO, POUNIT, DIRECTPO, NAME, QUALIF,
     1 (INSTRU(I),I=1,36), TIMEOBS, PCPHT, PCTHT,
     1 (PUB(I),I=1,16), OBSNAME, NUMOBS
      GOTO 20
  100 FORMAT(I6,1X,A2,A1,I2,1X,A30,1X,A16,1X,A25,A150)
  110 FORMAT(I6)
  115 FORMAT(I6,2(2(1X,I2),1X,I4),1X,15A1,1X,A6,1X,A7,1X,
     1 I3,A1,A3,1X,I5,1X,I4,A1,A3,1X,A28,1X,A10,1X,36A1,
     1 1X,A4,1X,2A2,1X,16A1,1X,A46,1X,I2)
  999 CLOSE(UNIT=5)
      STOP
      END

or using the SAS code:

      DATA HISTORY (DROP=X);
      RETAIN STANUM STATE STATUS DIVISION CURRNAME COUNTY XREF BLANK;
      INFILE 'history.txt' MISSOVER LS=236;
      INPUT @45 x $1. @;
      IF X NE ' ' THEN DO;
        INPUT STANUM 1-6 STATE $ 8-9 STATUS $ 10 DIVISION 11-12
              CURRNAME $ 14-43 COUNTY $ 45-60 XREF $ 62-86
              BLANK $ 87-236;
      END;
      ELSE INPUT STANUM2 1-6 MOBEG 8-9 DAYBEG 11-12 YRBEG 14-17
                 MOEND 19-20 DAYEND 22-23 YREND 25-28
                 SUSPLAT 30 SUSPLONG 31 SUSPLOC 32 SUSPELEV 33
                 SUSPPO 34 SUSPNAME 35 SUSPQUAL 36 SUSPINST 37
                 SUSPTIME 38 SUSPHTS 39 SUSPPUBS 40 SUSPBEG 41
                 SUSPEND 42 SUSPOBS 43 SUSPOTHR 44
                 LATNORTH $ 46-51 LONGWEST $ 53-59
                 DISTANCE 61-63 DISTUNIT $ 64 DIRECT $ 65-67
                 ELEV 69-73 DISTPO 75-78 POUNIT $ 79 DIRECTPO $ 80-82
                 NAME $ 84-111 QUALIF $ 113-122
                 ADDINST 124 COTTON 125 DBULB 126 EVAPSTA 127
                 FISHPORT 128 HYGRO 129 MINTHERM 130 MAXTHERM 131
                 NORIVGAG 132 RAINGAGE 133 SHELTER 134 RECRIVER 135
                 RECRAIN 136 SNOW 137 STORAGE 138 STDRAIN 139
                 STDSHELT 140 THERMOGR 141 DIGTHERM 142 TIPBUCK 143
                 OTHEVAP 144 MAXMIN 145 TELSYS 146 HYGRO 147
                 HY6 148 HY8 149 SFP 150 SRRNG 151 SSG 152 SSRG 153
                 STB 154 AMOS 155 AUTOB 156 PSY 157
                 TIMEOBS $ 161-164 PCPHT $ 166-167 PCTHT $ 168-169
                 BULLETW 171 COMBBUL 172 CLIMDATA 173 RIVSTAGE 174
                 HYDROBUL 175 PRECDATA 176 SNOWBULL 177 NOTPUB 178
                 CWB 179 MONTHREV 180 STATEPUB 181 LCD 182 BQ 183
                 SGPD 184 WWR 185 MYB 186
                 OBSNAME $ 188-233 NUMOBS 235-236;
      RUN;

Stated in tabular form, the contents of the station history file include the following:

Variable   Variable       Variable   Starting   Ending
           type           width      column     column

Identification record:

X          Alphanumeric      1         45         45
STANUM     Numeric           6          1          6
STATE      Character         2          8          9
STATUS     Character         1         10         10
DIVISION   Numeric           2         11         12
CURRNAME   Alphanumeric     30         14         43
COUNTY     Alphanumeric     16         45         60
XREF       Alphanumeric     25         62         86
BLANK      Character       150         87        236

Data record:

STANUM2    Numeric           6          1          6
MOBEG      Numeric           2          8          9
DAYBEG     Numeric           2         11         12
YRBEG      Numeric           4         14         17
MOEND      Numeric           2         19         20
DAYEND     Numeric           2         22         23
YREND      Numeric           4         25         28
SUSPLAT    Numeric           1         30         30
SUSPLONG   Numeric           1         31         31
SUSPLOC    Numeric           1         32         32
SUSPELEV   Numeric           1         33         33
SUSPPO     Numeric           1         34         34
SUSPNAME   Numeric           1         35         35
SUSPQUAL   Numeric           1         36         36
SUSPINST   Numeric           1         37         37
SUSPTIME   Numeric           1         38         38
SUSPHTS    Numeric           1         39         39
SUSPPUBS   Numeric           1         40         40
SUSPBEG    Numeric           1         41         41
SUSPEND    Numeric           1         42         42
SUSPOBS    Numeric           1         43         43
SUSPOTHR   Numeric           1         44         44
LATNORTH   Alphanumeric      6         46         51
LONGWEST   Alphanumeric      7         53         59
DISTANCE   Numeric           3         61         63
DISTUNIT   Character         1         64         64
DIRECT     Alphanumeric      3         65         67
ELEV       Numeric           5         69         73
DISTPO     Numeric           4         75         78
POUNIT     Character         1         79         79
DIRECTPO   Alphanumeric      3         80         82
NAME       Character        28         84        111
QUALIF     Alphanumeric     10        113        122
ADDINST    Numeric           1        124        124
COTTON     Numeric           1        125        125
DBULB      Numeric           1        126        126
EVAPSTA    Numeric           1        127        127
FISHPORT   Numeric           1        128        128
HYGRO      Numeric           1        129        129
MINTHERM   Numeric           1        130        130
MAXTHERM   Numeric           1        131        131
NORIVGAG   Numeric           1        132        132
RAINGAGE   Numeric           1        133        133
SHELTER    Numeric           1        134        134
RECRIVER   Numeric           1        135        135
RECRAIN    Numeric           1        136        136
SNOW       Numeric           1        137        137
STORAGE    Numeric           1        138        138
STDRAIN    Numeric           1        139        139
STDSHELT   Numeric           1        140        140
THERMOGR   Numeric           1        141        141
DIGTHERM   Numeric           1        142        142
TIPBUCK    Numeric           1        143        143
OTHEVAP    Numeric           1        144        144
MAXMIN     Numeric           1        145        145
TELSY      Numeric           1        146        146
HYGRO      Numeric           1        147        147
HY6        Numeric           1        148        148
HY8        Numeric           1        149        149
SFP        Numeric           1        150        150
SRRNG      Numeric           1        151        151
SSG        Numeric           1        152        152
SSRG       Numeric           1        153        153
STB        Numeric           1        154        154
AMOS       Numeric           1        155        155
AUTOB      Numeric           1        156        156
PSY        Numeric           1        157        157
TIMEOBS    Alphanumeric      4        161        164
PCPHT      Alphanumeric      2        166        167
PCTHT      Alphanumeric      2        168        169
BULLETW    Numeric           1        171        171
COMBBUL    Numeric           1        172        172
CLIMDATA   Numeric           1        173        173
RIVSTAGE   Numeric           1        174        174
HYDROBUL   Numeric           1        175        175
PRECDATA   Numeric           1        176        176
SNOWBULL   Numeric           1        177        177
NOTPUB     Numeric           1        178        178
CWB        Numeric           1        179        179
MONTHREV   Numeric           1        180        180
STATEPUB   Numeric           1        181        181
LCD        Numeric           1        182        182
BQ         Numeric           1        183        183
SGPD       Numeric           1        184        184
WWR        Numeric           1        185        185
MYB        Numeric           1        186        186
OBSNAME    Alphanumeric     46        188        233
NUMOBS     Numeric           2        235        236

Where:

X is a dummy variable used in the above SAS program to differentiate header records from data records;
STANUM is the station identification number, composed of
the two-digit state code followed by the four-digit Cooperative Network Index;
STATE is the two-letter state abbreviation;
STATUS is a single character indicating if the station is open (" " - blank) or closed ("*");
DIVISION is the station division number;
CURRNAME is the most current station name;
COUNTY is the county in which the station is currently located;
XREF is a station cross-reference, representing the cooperative network index of the station or the county name that the current station moved to or from;
STANUM2 is the station identification number, composed of the two-digit state code followed by the four-digit Cooperative Network Index;
MOBEG is the month the data record started (missing values are represented by 99);
DAYBEG is the day the data record started (missing values are represented by 99);
YRBEG is the year the data record started;
MOEND is the month the data record ended (missing values are represented by 99);
DAYEND is the day the data record ended (missing values are represented by 99); and
YREND is the year the data record ended (missing values are represented by 9999).

The next 15 variables represent suspect fields in the station history file. The values for these variables will be either 0 or 1. Values of 1 represent fields flagged as suspect by the pre-key editor.

 1. SUSPLAT   Latitude
 2. SUSPLONG  Longitude
 3. SUSPLOC   Previous location
 4. SUSPELEV  Elevation
 5. SUSPPO    Post office location
 6. SUSPNAME  Station name
 7. SUSPQUAL  Qualifier
 8. SUSPINST  Instruments
 9. SUSPTIME  Observation time
10. SUSPHTS   Instrument heights
11. SUSPPUBS  Publications
12. SUSPBEG   Beginning date
13. SUSPEND   Ending date
14. SUSPOBS   Observer
15. SUSPOTHR  Other observers

LATNORTH is the current station latitude expressed in degrees and minutes north;
LONGWEST is the current station longitude expressed in degrees and minutes west;
DISTANCE is the distance, in tenths of miles, from the previous station location (e.g., 015 = 1.5 miles), with unknown distances represented by "999";
DISTUNIT is the unit for DISTANCE; "B" for blocks, " " for tenths of miles;
DIRECT is the direction (16 point) of a station move from the previous location. The location of the temperature instrument defines the official station location. Values may be blank, character, or numeric. Unknown direction is represented by "999". Some examples of DISTANCE and DIRECT combinations are:

    999 999 = first record of new station or distance and direction unknown;
    015 NW  = station moved 1.5 miles NW from previous location;
    000 000 = no change in station (or instrument) location;
    000 ESE = moved
    000 999 = moved
    902 ESE = temperature instrument moved 0.2 miles ESE and precipitation instrument either did not move or was moved to a location different than that of the temperature instrument;
    800 000 = precipitation instrument moved instrument did not move; and
    999 NW  = distance unknown, direction NW;

ELEV is the ground elevation at the station, expressed in whole feet above or below mean sea level;
DISTPO is the distance, in tenths of miles, from the nearest post office (e.g., 015 = 1.5 miles), with unknown distances represented by "999";
POUNIT is the unit of distance from the post office; a blank " " indicates tenths of miles, "B" indicates blocks;
DIRECTPO is the direction on a 16-point compass from the nearest post office. Values may be either blank, character, or numeric. Unknown directions are represented by "999". Some examples of DISTPO and DIRECTPO combinations are:

    999 999 = distance and direction unknown;
    015 NW  = 1.5 miles NW of post office;
    000 NW  =
    000 999 =
    000 000 = at the post office.
NAME is the full station name; and
QUALIF is a qualifier or description that is added to the proper name of the station (e.g., Charleston 2WNW).

The next 34 variables represent the following instruments and classifications. If an instrument was used at a particular station or if a particular classification is appropriate for that station, the variable will have a value of 1; if it was not used, the variable will have a value of 0.

 1. ADDINST   Additional instrument (wind, pressure, etc.)
 2. COTTON    Cotton region shelter (official, CRS)
 3. DBULB     Dry bulb thermometer
 4. EVAPSTA   Class "A" evaporation station
 5. FISHPORT  Fischer-Porter gage
 6. HYGRO     Hygrothermograph
 7. MINTHERM  Minimum thermometer
 8. MAXTHERM  Maximum thermometer
 9. NORIVGAG  Nonrecording river gage
10. RAINGAGE  Nonstandard rain gage
11. SHELTER   Nonstandard shelter
12. RECRIVER  Recording river gage
13. RECRAIN   Recording rain gage
14. SNOW      Snow density gage
15. STORAGE   Storage gage
16. STDRAIN   Standard rain gage (SRG)
17. STDSHELT  Standard shelter (official)
18. THERMOGR  Thermograph
19. DIGTHERM  Digital thermometer
20. TIPBUCK   Tipping bucket gage
21. OTHEVAP   Other than class "A" evaporation station
22. MAXMIN    Max/min temperature system
23. TELSY     Telemetry system
24. HYGRO     Hygrothermometer (type unknown)
25. HY6       Hygrothermometer - H06x series
26. HY8       Hygrothermometer - H08x series
27. SFP       Shielded Fischer-Porter gage
28. SRRNG     Shielded recording rain gage
29. SSG       Shielded storage gage
30. SSRG      Shielded standard rain gage
31. STB       Shielded tipping bucket
32. AMOS      Automated Meteorological Observing System
33. AUTOB     Automated observing station
34. PSY       Psychrometer (official, AK only)

TIMEOBS are the observation times (2 characters each) for precipitation and temperature, respectively, if both times are known. Values may be either numeric (rounded to the nearest whole hour), character, or alphanumeric. Codes which relate to one or both of the times may also be present.
Possible values and their meanings include the following:

    0719 = precipitation amount read at 0700 LST (local standard time), temperatures read at 1900 LST;
    SRSS = precipitation amount read at sunrise, temperatures read at sunset;
    SS99 = precipitation amount read at sunset, time of temperature observations either unknown or no temperature data was available for that period of the record;
    06HR = station observed 6 hours per day (not to be confused with a 6-hourly synoptic observing schedule). How these observations were used to produce daily precipitation amount and maximum/minimum temperatures is unclear;
    9079 = ambiguous form; station records only gave one observation time (0700 LST), but it is unknown if this time applies to both precipitation and temperature;
    TRID = tri-daily temperature observations (TAVG = [7AM + 2PM + (2 x 9PM)]/4), but time of observation for precipitation amount is unknown; and
    RSSS = precipitation amounts read on a rotating schedule: at sunrise (SR) during crop season, i.e., April/May-October/November, but at sunset (SS) otherwise (temperatures read at sunset);

PCPHT is the height of the precipitation instrument above ground level. Values may be numeric or character, with numeric values expressed to the nearest whole foot; and
PCTHT is the height of the temperature instrument above ground level. Values may be numeric or character, with numeric values expressed to the nearest whole foot. Potential values for both PCPHT and PCTHT include the following:

    01-97 = actual height;
    98    = 98 feet;
    99    = missing; and
    RF    = roof, actual height above ground level unknown.

The next 16 variables represent the following forms of publications. If the data from a particular station appeared in a publication, the variable will have a value of 1; if not, the variable will have a value of 0. The variables and their corresponding forms of publications are as follows:

 1. BULLETW   Bulletin W
 2. COMBBUL   Combined Bulletin
 3. CLIMDATA  Climatological Data
 4. RIVSTAGE  Daily River Stages
 5. HYDROBUL  Hydrologic Bulletin
 6. PRECDATA  Published as hourly precipitation data
 7. SNOWBULL  Snow Bulletin
 8. NOTPUB    Not published
 9. CWB       Report to the Chief of the U.S. Weather Bureau
10. MONTHREV  Monthly Weather Review
11. STATEPUB  Published in state publications
12. LCD       Local Climatological Data
13. BQ        Bulletin Q, 1870-1903
14. SGPD      Storage Gage Precipitation Data, Western United States
15. WWR       Weekly Weather Review
16. MYB       U.S. Meteorological Yearbook

OBSNAME is the observer's name (may include more than one name per record); and
NUMOBS is the number of observers participating during the time of record for an agency.

HCN/D Data Files

The 48 HCN/D data files (one for each state of the contiguous United States) contain daily maximum and minimum temperatures (°F), precipitation amounts (hundredths of inches), snowfall amounts (tenths of inches), snow depths (whole inches), and data flags from the 1062 HCN/D stations. The files are sorted by six-digit station number (the two-digit state code followed by the four-digit Cooperative Network Index), year, and month. Each record holds one month of one data type for one station: station number, data type, data units, year, month, number of days in the month, and 31 daily data values with their respective flags.

The files may be read using the following FORTRAN format:

      INTEGER YEAR,MON,DAYS,VALUE(31)
      CHARACTER*1 SF(31),DMF(31),DQF(31)
      CHARACTER*4 DATTYP
      CHARACTER*6 STAID
      CHARACTER*2 UNITS
    1 CONTINUE
      READ(5,100,END=99) STAID,DATTYP,UNITS,YEAR,MON,
     +     DAYS,(SF(I),VALUE(I),DMF(I),DQF(I),I=1,31)
  100 FORMAT(A6,1X,A4,A2,I4,I2,1X,I2,31(1X,A1,I4,2A1))
C     Process the record here, then loop back for the next one.
      GO TO 1
   99 CONTINUE
      END

or by using the SAS format:

    DATA HCND;
      ARRAY DAY {31} $ DAY1-DAY31;
      INFILE IN LRECL=270;
      INPUT STAID $ 1-6 DATTYP $ 8-11 UNITS $ 12-13 YEAR 14-17
            MON 18-19 DAYS 21-22 @23 (DAY1-DAY31) ($CHAR8.);

Stated in tabular form (using variable names from the FORTRAN format), the contents of a record in an HCN/D data file include the following.
Variable      Variable type   Width   Starting column   Ending column
STAID         Character           6                 1               6
DATTYP        Character           4                 8              11
UNITS         Character           2                12              13
YEAR          Numeric             4                14              17
MON           Numeric             2                18              19
DAYS          Numeric             2                21              22
SF(1)         Alphanumeric        1                24              24
VALUE(1)      Numeric             4                25              28
DMF(1)        Alphanumeric        1                29              29
DQF(1)        Alphanumeric        1                30              30
SF(2-31)      Alphanumeric        1                 *               *
VALUE(2-31)   Numeric             4                 *               *
DMF(2-31)     Alphanumeric        1                 *               *
DQF(2-31)     Alphanumeric        1                 *               *

*May be obtained using:

    COL(N) = COL(1) + (N * 8) - 8,

where COL(N) is the starting/ending column for SF(N), VALUE(N), DMF(N), or DQF(N); COL(1) is the starting/ending column for SF(1), VALUE(1), DMF(1), or DQF(1); and N is the day of the month (2-31).

Where:

STAID is the station identification number, composed of the two-digit state code followed by the four-digit Cooperative Network Index (defined as character to preserve leading zeros upon output);
DATTYP is the data type (TMAX = maximum temperature, TMIN = minimum temperature, PRCP = precipitation amount, SNOW = snowfall amount, and SNWD = snow depth). Some stations do not always have records for all five data types in a given month;
YEAR is the year of the data;
MON is the month of the data;
DAYS is the number of days in the month;
SF(1-31) are the source flags for the daily data values;
VALUE(1-31) are daily data values, with temperature in whole degrees Fahrenheit, precipitation amount in hundredths of inches, snowfall amount in tenths of inches, and snow depth in whole inches;
DMF(1-31) are the data measurement flags for the daily data values; and
DQF(1-31) are the data quality flags for the daily data values.

Flag codes for the HCN/D data

SF is a code indicating the source of the daily data value. The codes and their meanings are as follows:

    0 = NCDC Tape Deck 3200, Summary of the Day Element Digital File;
    3 = Manuscript-Original Records, NCDC;
    4 = Climatological Data (CD) (monthly NCDC publication);
    5 = Climate Record Book; as described within: History of Climatological Records Books, U.S.
Department of Commerce, Weather Bureau, U.S. Government Printing Office (1960);
    Blank = manually estimated (see DQF flag) or missing data value.

DMF is the data measurement flag, which describes how the daily value was measured. The codes and their meanings are as follows:

    A = accumulated amount since last measurement;
    B = accumulated amount includes estimated values (since last measurement);
    E = estimated value (see DQF flag for the particular estimation method);
    J = value has been manually validated;
    S = data value is included in a subsequent value, with the current data value being set to "0" or "-999";
    T = trace of precipitation, snowfall, or snow depth (data value set to "0" for a trace);
    ( = Expert System edited value; not validated;
    ) = Expert System approved edited value; and
    Blank = valid original data (no flag needed) or missing data value.

Please note: other values occasionally appear as data measurement flags for which documentation is not currently available, e.g., "C" and "s".

DQF is the data quality flag. In January 1982, NCDC instituted a greatly enhanced computer algorithm for automated validation of digital data archives. The system checks the internal consistency of a station's data and compares each station's observations to prescribed climatological limits and observations from surrounding stations. Numeric DQF codes apply only to NCDC's digital data, i.e., where the source flag (SF) is equal to "0" for a particular value. Alphabetic codes describe the particular manual or automated NCDC procedure employed to correct or estimate a data value.
The codes and their meanings are as follows:

    0 = valid data element;
    1 = valid data element (from an "unknown" source, in the case of pre-1982 data);
    3 = invalid data-no edited data value available;
    4 = validity unknown-automated quality control procedures have not been applied;
    5 = original non-numeric data value has been replaced by its deciphered numeric value;
    A = substituted temperature from time of observation for TMAX or TMIN;
    B = time-shifted value;
    C = precipitation estimated from snowfall;
    D = transposed digits;
    E = changed units;
    F = adjusted TMAX or TMIN by a multiple of plus or minus 10 deg;
    G = changed algebraic sign;
    H = moved decimal point;
    I = rescaling other than that of flags "F", "G", or "H";
    J = subjectively derived value;
    K = extracted from an accumulated value;
    L = switched TMAX and TMIN;
    M = switched temperature from time of observation with TMAX or TMIN;
    N = substituted the mean of values taken from the three nearest cooperative weather stations;
    O = snow and precipitation columns were switched in station's report;
    P = added snowfall to snow depth;
    Q = switched snowfall and snow depth;
    R = precipitation amount was not reported, "0" has been inserted;
    S = manually edited value (derived using one of the procedures described by data quality flags A-R);
    T = data value failed internal consistency check;
    U = failed areal consistency check (beginning October 1992); and
    Blank = valid data value (with source flag other than "0") or missing data value.

6. How to Obtain the Database and Documentation

The HCN/D database is available free of charge from CDIAC. The data and a plain text version of the documentation are available from CDIAC's anonymous FTP (file transfer protocol) area via the Internet. Please note: your computer needs FTP software (this is built into most modern operating systems). Commands used to obtain the database are shown below. For additional information, contact CDIAC.
    ftp cdiac.esd.ornl.gov or ftp 128.219.24.36
    (When the system asks you to log in, enter "anonymous".)
    (When the system asks for your password, enter your e-mail address.)
    Change the directory to pub/ndp070 (i.e., "cd pub/ndp070")
    Retrieve the file you want (e.g., "get invent.txt")

The data and an HTML version of the documentation may also be obtained from CDIAC's web site at http://cdiac.esd.ornl.gov/.

For non-Internet data acquisitions (e.g., 8mm tape, CD-ROM, etc.), users should contact CDIAC directly.

Address:
    Carbon Dioxide Information Analysis Center
    Oak Ridge National Laboratory
    P.O. Box 2008
    Oak Ridge, Tennessee 37831-6335, U.S.A.

Telephone: (865) 574-3645 (Voice)
           (865) 574-2232 (Fax)

Email: cdiac@ornl.gov

7. References

Baker, D. G. 1975. Effect of observation time on mean temperature estimation. J. Appl. Meteor. 14:471-76.

Easterling, D. R., T. R. Karl, E. H. Mason, P. Y. Hughes, and D. P. Bowman. 1996. United States Historical Climatology Network (U.S. HCN) Monthly Temperature and Precipitation Data. ORNL/CDIAC-87, NDP-019/R3. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee. 280 pp.

Hughes, P. Y., E. H. Mason, T. R. Karl, and W. A. Brower. 1992. United States Historical Climatology Network Daily Temperature and Precipitation Data. ORNL/CDIAC-50, NDP-042. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee. 140 pp.

IPCC. 2001. Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change [Houghton, J. T., Y. Ding, D. J. Griggs, M. Noguer, P. J. van der Linden, X. Dai, K. Maskell, and C. A. Johnson (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA. 881 pp.

Jones, P. D. 1994. Northern Hemisphere surface air temperature variations: a reanalysis and an update to 1993. J. Climate 7:2548-2568.

Jones, P. D., S. C. B. Raper, R. S. Bradley, H. F.
Diaz, P. M. Kelly, and T. M. L. Wigley. 1986. Northern Hemisphere surface air temperature variations 1851-1984. J. Clim. Appl. Meteor. 25:161-79.

Jones, P. D., T. J. Osborn, and K. R. Briffa. 1997. Estimating sampling errors in large-scale temperature averages. J. Climate 10:1794-1802.

Karl, T. R., G. Kukla, and J. Gavin. 1986. Relationship between decreased temperature range and precipitation trends in the United States and Canada, 1941-80. J. Clim. Appl. Meteor. 25:1878-86.

Karl, T. R., and C. N. Williams, Jr. 1987. An approach to adjusting climatological time series for discontinuous inhomogeneities. J. Clim. Appl. Meteor. 26:1744-63.

Karl, T. R., C. N. Williams, Jr., and F. T. Quinlan. 1990. United States Historical Climatology Network (HCN) Serial Temperature and Precipitation Data. ORNL/CDIAC-30, NDP-019/R1. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee.

Mitchell, J. M., Jr. 1958. Effects of changing observation time on mean temperature. Bull. Amer. Meteor. Soc. 39:83-89.

Peterson, T. C., and R. S. Vose. 1997. An Overview of the Global Historical Climatology Network Temperature Database. Bull. Amer. Meteor. Soc. 78:2837-49.

Quayle, R. G., D. R. Easterling, T. R. Karl, and P. J. Hughes. 1991. Effects of recent thermometer changes in the cooperative station network. Bull. Amer. Meteor. Soc. 72:1718-23.

Quinlan, F. T., T. R. Karl, and C. N. Williams, Jr. 1987. United States Historical Climatology Network (HCN) serial temperature and precipitation data. NDP-019. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee.

Schaal, L. A., and R. F. Dale. 1977. Time of observation temperature bias and "climatic change". J. Appl. Meteor. 16:215-22.

Vinnikov, K. Ya., P. Ya. Groisman, and K. M. Lugina. 1990. Empirical data on contemporary global climate changes (temperature and precipitation). J. Clim. 3:662-67.

Williams, C. N., R. S. Vose, D. R. Easterling, and M. J. Menne, 2004.
United States Historical Climatology Network Daily Temperature, Precipitation, and Snow Data. ORNL/CDIAC-118, NDP-070. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee.

APPENDIX B

STATE NUMBERS AND ABBREVIATIONS USED FOR THE 48 STATES IN THE HCN/D DATABASE

01  AL  Alabama
02  AZ  Arizona
03  AR  Arkansas
04  CA  California
05  CO  Colorado
06  CT  Connecticut
07  DE  Delaware
08  FL  Florida
09  GA  Georgia
10  ID  Idaho
11  IL  Illinois
12  IN  Indiana
13  IA  Iowa
14  KS  Kansas
15  KY  Kentucky
16  LA  Louisiana
17  ME  Maine
18  MD  Maryland
19  MA  Massachusetts
20  MI  Michigan
21  MN  Minnesota
22  MS  Mississippi
23  MO  Missouri
24  MT  Montana
25  NE  Nebraska
26  NV  Nevada
27  NH  New Hampshire
28  NJ  New Jersey
29  NM  New Mexico
30  NY  New York
31  NC  North Carolina
32  ND  North Dakota
33  OH  Ohio
34  OK  Oklahoma
35  OR  Oregon
36  PA  Pennsylvania
37  RI  Rhode Island
38  SC  South Carolina
39  SD  South Dakota
40  TN  Tennessee
41  TX  Texas
42  UT  Utah
43  VT  Vermont
44  VA  Virginia
45  WA  Washington
46  WV  West Virginia
47  WI  Wisconsin
48  WY  Wyoming

CITE AS: Williams, C. N., R. S. Vose, D. R. Easterling, and M. J. Menne, 2006. United States Historical Climatology Network Daily Temperature, Precipitation, and Snow Data. ORNL/CDIAC-118, NDP-070. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee.
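
SUPPLEMENTARY NOTE: READING AN HCN/D DATA RECORD IN PYTHON

The fixed-width record layout given under "HCN/D Data Files" (header fields in columns 1-22, followed by 31 eight-character day groups located with COL(N) = COL(1) + (N * 8) - 8) can also be read outside FORTRAN or SAS. The following is a minimal Python sketch of that layout, not part of the distributed retrieval programs. The names DailyValue, MonthRecord, day_columns, parse_record, and build_example_record are hypothetical; the example record is synthetic (the station number and the units code "F " are placeholders, not values taken from the database); and treating a blank VALUE field as None and padding short lines to 270 characters are assumptions made here for robustness.

```python
from typing import List, NamedTuple, Optional, Tuple


class DailyValue(NamedTuple):
    day: int
    source_flag: str        # SF
    value: Optional[int]    # VALUE; None if the field is blank (assumption)
    measurement_flag: str   # DMF
    quality_flag: str       # DQF


class MonthRecord(NamedTuple):
    staid: str              # kept as text to preserve leading zeros
    dattyp: str             # TMAX, TMIN, PRCP, SNOW, or SNWD
    units: str
    year: int
    month: int
    days: int
    values: List[DailyValue]


def day_columns(n: int) -> Tuple[int, int, int, int]:
    """Return the 1-based starting columns of SF(n), VALUE(n), DMF(n), DQF(n)."""
    offset = (n * 8) - 8    # COL(N) = COL(1) + (N * 8) - 8
    return 24 + offset, 25 + offset, 29 + offset, 30 + offset


def parse_record(line: str) -> MonthRecord:
    """Split one 270-character monthly record into its documented fields."""
    line = line.rstrip("\r\n").ljust(270)   # pad short lines (assumption)
    days = int(line[20:22])                 # DAYS, columns 21-22
    values = []
    for n in range(1, days + 1):            # only the days in this month
        sf, val, dmf, dqf = day_columns(n)
        text = line[val - 1:val + 3].strip()    # VALUE is 4 columns wide
        values.append(DailyValue(n, line[sf - 1],
                                 int(text) if text else None,
                                 line[dmf - 1], line[dqf - 1]))
    return MonthRecord(line[0:6], line[7:11], line[11:13],
                       int(line[13:17]), int(line[17:19]), days, values)


def build_example_record() -> str:
    """Build a synthetic TMAX record (not real station data) for illustration."""
    header = "012345" + " " + "TMAX" + "F " + "2001" + " 1" + " " + "31"
    # Each day group: blank, SF "0", 4-column value, blank DMF, DQF "0".
    groups = "".join(" 0%4d 0" % (40 + n) for n in range(1, 32))
    return header + groups


if __name__ == "__main__":
    record = parse_record(build_example_record())
    print(record.staid, record.dattyp, record.year, record.month, record.days)
    print(record.values[0])
```

Slicing by column position, rather than splitting on whitespace, mirrors the FORTRAN FORMAT statement and keeps blank flags distinguishable from absent ones. Only the first DAYS day groups are returned, so the unused positions at the end of short months are ignored.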