TITLE OF THE DATA SET

United States Historical Climatology Network Daily Temperature and Precipitation Data

DATA CONTRIBUTORS

P. Y. Hughes, E. H. Mason, T. R. Karl, and W. A. Brower
National Climatic Data Center
National Oceanic and Atmospheric Administration
Asheville, North Carolina 28801

SOURCE AND SCOPE OF THE DATA

The 138 station HCN/D data base contains station histories, daily maximum and minimum temperatures, and daily precipitation amounts that were compiled by NCDC after being extracted from digital and non-digital data sets archived at NCDC. These data sets come from a variety of sources, including climatological publications, universities, federal agencies, individuals, and data archives.

The HCN/D stations were selected from the HCN so as to provide a reasonably homogeneous spatial distribution of stations within the contiguous United States after consideration of the following.

1. The degree to which each station maintained a constant observation time for maximum and minimum temperatures. This criterion was given the greatest weight, excursions from a station's predominant observing time of no more than 4 years being desired. (For 24 (17%) of the 138 stations this condition is not met; 15 of the 24 stations changed their observing time either from PM to midnight (MD) or MD to PM, and 9 stations changed either from AM to PM or PM to AM.)

2. At least 95% of a station's pre-1951 data should be contained in NCDC digital daily archives.

3. A station's potential for heat island bias over time should be low.

4. Quality assessments based upon the decile ranking assigned by Karl et al. (1990) to the stations' monthly maximum/minimum temperature data for the quality characteristics numbered 5-8 in Appendix A of the documentation that accompanies this data base. A station should rank within the lower 9 deciles for each characteristic.
DATA FORMAT

The information in this subdirectory (NDP042) is arranged in fourteen files, containing the following:

   > this documentation file (NDP042.TXT)
   > a FORTRAN IV Input/Output (I/O) routine for the HCN/D station inventory file (RETRIEVL.FO1)
   > a FORTRAN IV I/O routine for the HCN/D station history file (RETRIEVL.FO2)
   > a FORTRAN IV I/O routine for the HCN/D data files (RETRIEVL.FO3)
   > a SAS* I/O routine for the HCN/D station inventory file (RETRIEVL.SA1)
   > a SAS I/O routine for the HCN/D station history file (RETRIEVL.SA2)
   > a SAS I/O routine for the HCN/D data files (RETRIEVL.SA3)
   > the HCN/D station inventory file (STATION.INV)
   > the HCN/D station history file (STATION.HIS)
   > five files containing HCN/D temperature and precipitation data (HCNDALID.DAT, HCNDILMI.DAT, HCNDMNNY.DAT, HCNDNCTN.DAT, and HCNDTXWY.DAT)

The format and contents of each data file are described in the following.

STATION INVENTORY FILE FOR THE HCN/D DATA SET

The station inventory file for the HCN/D data set (STATION.INV) is sorted by two-digit state code and four-digit Cooperative Network Index, with one record per station containing state code, Cooperative Network Index, state abbreviation, station name, beginning month and year of data, time of observation, latitude, and longitude. The file may be read using the following FORTRAN format:

      INTEGER BYEAR,LATDEG,LATMIN,LONDEG,LONMIN
      CHARACTER*2 STCODE,STATE,BMON
      CHARACTER*4 CNI
      CHARACTER*23 STNAME
      CHARACTER*5 TOBS
      READ(5,100,END=99)STCODE,CNI,STATE,STNAME,BMON,BYEAR,TOBS,
     +   LATDEG,LATMIN,LONDEG,LONMIN
  100 FORMAT(1X,A2,2X,A4,2X,A2,2X,A23,2X,A2,2X,I4,2X,A5,
     +       2X,I2,2X,I2,2X,I3,2X,I2)

or by using the SAS format:

   DATA INVENT;
   LENGTH STNAME $ 23;
   INFILE IN;
   INPUT STCODE $ 2-3 CNI $ 6-9 STATE $ 12-13 STNAME 16-38
         BMON $ 41-42 BYEAR $ 45-48 TOBS $ 51-55 LATDEG 58-59
         LATMIN 62-63 LONDEG 66-68 LONMIN 71-72;

*SAS is a registered trademark of SAS Institute, Inc., Cary, North Carolina 27511-8000.
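For readers without FORTRAN or SAS, the same fixed-column read can be sketched in Python. This is a minimal illustration rather than one of the distributed routines: the names Station and parse_inventory_record are hypothetical, the column positions are taken from the formats above, and the conversion to decimal degrees (with west longitude made negative) is an added convenience that assumes the degree and minute fields are always filled.

```python
from typing import NamedTuple


class Station(NamedTuple):
    """One STATION.INV record (field names are illustrative)."""
    stcode: str       # two-digit state code, kept as text to preserve leading zeros
    cni: str          # four-digit Cooperative Network Index, kept as text
    state: str        # two-letter state abbreviation
    stname: str       # station name
    bmon: str         # beginning month of the temperature record
    byear: int        # beginning year of the temperature record
    tobs: str         # time-of-observation code (AM, PM, MD, or a combination)
    latitude: float   # decimal degrees north (derived, not stored in the file)
    longitude: float  # decimal degrees, west negative (assumed sign convention)


def parse_inventory_record(line: str) -> Station:
    """Slice one inventory record using the documented 1-based columns."""
    line = line.rstrip("\n").ljust(72)  # pad short lines so slices stay in range
    lat = int(line[57:59]) + int(line[61:63]) / 60.0      # LATDEG 58-59, LATMIN 62-63
    lon = -(int(line[65:68]) + int(line[70:72]) / 60.0)   # LONDEG 66-68, LONMIN 71-72
    return Station(
        stcode=line[1:3],               # columns 2-3
        cni=line[5:9],                  # columns 6-9
        state=line[11:13],              # columns 12-13
        stname=line[15:38].rstrip(),    # columns 16-38
        bmon=line[40:42],               # columns 41-42
        byear=int(line[44:48]),         # columns 45-48
        tobs=line[50:55].strip(),       # columns 51-55
        latitude=lat,
        longitude=lon,
    )
```

A whole file could then be read with a loop such as `[parse_inventory_record(l) for l in open("STATION.INV")]`, assuming the file is plain text with one record per line.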
Stated in tabular form, the contents of the station inventory file include the following.

              Variable       Variable   Starting   Ending
   Variable   type           width      column     column

   STCODE     Character          2          2          3
   CNI        Character          4          6          9
   STATE      Character          2         12         13
   STNAME     Character         23         16         38
   BMON       Character          2         41         42
   BYEAR      Numeric            4         45         48
   TOBS       Character          5         51         55
   LATDEG     Numeric            2         58         59
   LATMIN     Numeric            2         62         63
   LONDEG     Numeric            3         66         68
   LONMIN     Numeric            2         71         72

where

STCODE is the two-digit state code (01-48), defined as character to allow for preserving leading zeros upon output;

CNI is the four-digit Cooperative Network Index, defined as character above to allow for preserving leading zeros upon output;

STATE is the two-letter state abbreviation;

STNAME is the station name;

BMON is the beginning month of the daily maximum/minimum temperature record for a station;

BYEAR is the beginning year of the daily maximum/minimum temperature record for a station. Precipitation data may begin in a different year;

TOBS is the time of observation, i.e., the predominant time at which temperature readings have historically been taken at the site: morning (AM), evening (PM), or midnight (MD). Combinations of these codes indicate sites at which the excursion from a constant TOBS exceeded the 4-year limit imposed by the selection criteria;

LATDEG is the degrees (north) portion of the station's latitude;

LATMIN is the minutes portion of the station's latitude;

LONDEG is the degrees (west) portion of the station's longitude; and

LONMIN is the minutes portion of the station's longitude.

STATION HISTORY FILE

The station history file (STATION.HIS) provides valuable information concerning each station in the HCN/D. This file documents station moves and instrument changes, lists station observers and observation times, and identifies suspect fields.
The file may be read using the following FORTRAN format:

      DIMENSION DATA(54)
      READ(5,100,END=99) (DATA(I),I=1,54)
  100 FORMAT(54A4)

or using the SAS format:

   DATA HISTORY (DROP=X);
   RETAIN STANUM STATE DIVISION CURRNAME COUNTY XREF;
   INFILE IN MISSOVER LS=216;
   INPUT @45 X $1. @;
   IF X NE ' ' THEN DO;
      INPUT STANUM 1-6 STATE $ 8-9 DIVISION 11-12 CURRNAME $ 14-43
            COUNTY $ 45-60 XREF $ 62-86;
   END;
   ELSE
      INPUT STANUM2 1-6 MOBEG 8-9 DAYBEG 11-12 YRBEG 14-17
            MOEND 19-20 DAYEND 22-23 YREND 25-28
            SUSPLAT 30 SUSPLONG 31 SUSPLOC 32 SUSPELEV 33 SUSPPO 34
            SUSPNAME 35 SUSPQUAL 36 SUSPINST 37 SUSPTIME 38 SUSPHTS 39
            SUSPPUBS 40 SUSPBEG 41 SUSPEND 42 SUSPOBS 43 SUSPOTHR 44
            LATNORTH $ 46-51 LONGWEST $ 53-59 DISTANCE 61-63
            DIRECT $ 65-67 ELEV 69-73 DISTPO 75-77 DIRECTPO $ 79-81
            NAME $ 83-110 QUALIF $ 112-121
            ADDINST 123 COTTON 124 DBULB 125 EVAPSTA 126 FISHPORT 127
            HYGRO 128 MINTHERM 129 MAXTHERM 130 NORIVGAG 131
            RAINGAGE 132 SHELTER 133 RECRIVER 134 RECRAIN 135 SNOW 136
            STORAGE 137 STDRAIN 138 STDSHELT 139 THERMOGR 140
            DIGTHERM 141 TIPBUCK 142 OTHEVAP 143 MAXMIN 144
            TIMEOBS $ 146-149 PCPHT $ 151-152 PCTHT $ 154-155
            BULLETW 157 COMBBUL 158 CLIMDATA 159 RIVSTAGE 160
            HYDROBUL 161 PRECDATA 162 SNOWBULL 163 NOTPUB 164 CWB 165
            MONTHREV 166 STATEPUB 167 LCD 168 BQ 169 SGPD 170 WWR 171
            MYB 172 OBSNAME $ 174-213 NUMOBS 215-216;

Stated in tabular form, the contents of the station history file include the following.
              Variable       Variable   Starting   Ending
   Variable   type           width      column     column

   X          Alphanumeric       1         45         45
   STANUM     Numeric            6          1          6
   STATE      Character          2          8          9
   DIVISION   Numeric            2         11         12
   CURRNAME   Alphanumeric      30         14         43
   COUNTY     Alphanumeric      16         45         60
   XREF       Alphanumeric      25         62         86
   STANUM2    Numeric            6          1          6
   MOBEG      Numeric            2          8          9
   DAYBEG     Numeric            2         11         12
   YRBEG      Numeric            4         14         17
   MOEND      Numeric            2         19         20
   DAYEND     Numeric            2         22         23
   YREND      Numeric            4         25         28
   SUSPLAT    Numeric            1         30         30
   SUSPLONG   Numeric            1         31         31
   SUSPLOC    Numeric            1         32         32
   SUSPELEV   Numeric            1         33         33
   SUSPPO     Numeric            1         34         34
   SUSPNAME   Numeric            1         35         35
   SUSPQUAL   Numeric            1         36         36
   SUSPINST   Numeric            1         37         37
   SUSPTIME   Numeric            1         38         38
   SUSPHTS    Numeric            1         39         39
   SUSPPUBS   Numeric            1         40         40
   SUSPBEG    Numeric            1         41         41
   SUSPEND    Numeric            1         42         42
   SUSPOBS    Numeric            1         43         43
   SUSPOTHR   Numeric            1         44         44
   LATNORTH   Alphanumeric       6         46         51
   LONGWEST   Alphanumeric       7         53         59
   DISTANCE   Numeric            3         61         63
   DIRECT     Alphanumeric       3         65         67
   ELEV       Numeric            5         69         73
   DISTPO     Numeric            3         75         77
   DIRECTPO   Alphanumeric       3         79         81
   NAME       Character         28         83        110
   QUALIF     Alphanumeric      10        112        121
   ADDINST    Numeric            1        123        123
   COTTON     Numeric            1        124        124
   DBULB      Numeric            1        125        125
   EVAPSTA    Numeric            1        126        126
   FISHPORT   Numeric            1        127        127
   HYGRO      Numeric            1        128        128
   MINTHERM   Numeric            1        129        129
   MAXTHERM   Numeric            1        130        130
   NORIVGAG   Numeric            1        131        131
   RAINGAGE   Numeric            1        132        132
   SHELTER    Numeric            1        133        133
   RECRIVER   Numeric            1        134        134
   RECRAIN    Numeric            1        135        135
   SNOW       Numeric            1        136        136
   STORAGE    Numeric            1        137        137
   STDRAIN    Numeric            1        138        138
   STDSHELT   Numeric            1        139        139
   THERMOGR   Numeric            1        140        140
   DIGTHERM   Numeric            1        141        141
   TIPBUCK    Numeric            1        142        142
   OTHEVAP    Numeric            1        143        143
   MAXMIN     Numeric            1        144        144
   TIMEOBS    Alphanumeric       4        146        149
   PCPHT      Alphanumeric       2        151        152
   PCTHT      Alphanumeric       2        154        155
   BULLETW    Numeric            1        157        157
   COMBBUL    Numeric            1        158        158
   CLIMDATA   Numeric            1        159        159
   RIVSTAGE   Numeric            1        160        160
   HYDROBUL   Numeric            1        161        161
   PRECDATA   Numeric            1        162        162
   SNOWBULL   Numeric            1        163        163
   NOTPUB     Numeric            1        164        164
   CWB        Numeric            1        165        165
   MONTHREV   Numeric            1        166        166
   STATEPUB   Numeric            1        167        167
   LCD        Numeric            1        168        168
   BQ         Numeric            1        169        169
   SGPD       Numeric            1        170        170
   WWR        Numeric            1        171        171
   MYB        Numeric            1        172        172
   OBSNAME    Alphanumeric      40        174        213
   NUMOBS     Numeric            2        215        216

where

X is a dummy variable used in the above SAS program to differentiate header records from data records;

STANUM is the station identification number, composed of the two-digit state code followed by the four-digit Cooperative Network Index;

STATE is the two-letter state abbreviation;

DIVISION is the station division number;

CURRNAME is the most current station name;

COUNTY is the county in which the station is currently located;

XREF is a station cross-reference, representing the cooperative network index of the station or the county name that the current station moved to or from;

STANUM2 is the station identification number, composed of the two-digit state code followed by the four-digit Cooperative Network Index;

MOBEG is the month the data record started (missing values are represented by 99);

DAYBEG is the day the data record started (missing values are represented by 99);

YRBEG is the year the data record started;

MOEND is the month the data record ended (missing values are represented by 99);

DAYEND is the day the data record ended (missing values are represented by 99); and

YREND is the year the data record ended (missing values are represented by 9999).

The next 15 variables represent suspect fields in the station history file. The values for these variables will be either 0 or 1. Values of 1 represent fields flagged as suspect by the pre-key editor.

    1. SUSPLAT    Latitude
    2. SUSPLONG   Longitude
    3. SUSPLOC    Previous location
    4. SUSPELEV   Elevation
    5. SUSPPO     Post office location
    6. SUSPNAME   Station name
    7. SUSPQUAL   Qualifier
    8. SUSPINST   Instruments
    9. SUSPTIME   Observation time
   10. SUSPHTS    Instrument heights
   11. SUSPPUBS   Publications
   12. SUSPBEG    Beginning date
   13. SUSPEND    Ending date
   14. SUSPOBS    Observer
   15. SUSPOTHR   Other observers

LATNORTH is the current station latitude expressed in degrees and minutes north;

LONGWEST is the current station longitude expressed in degrees and minutes west;

DISTANCE is the distance, in tenths of miles, from the previous station location (e.g., 015 = 1.5 miles), with unknown distances represented by 999;

DIRECT is the direction (16 point) of a station move from the previous location. The location of the temperature instrument defines the official station location. Values may be blank, character, or numeric. Unknown direction is represented by 999. Some examples of DISTANCE and DIRECT combinations are:

   999 999 = first record of new station or distance and direction unknown;
   015 NW  = station moved 1.5 miles NW from previous location;
   000 000 = no change in station (or instrument) location;
   000 ESE = moved <0.1 mile east-southeast (ESE) from previous location;
   000 999 = moved <0.1 mile, direction unknown;
   902 ESE = temperature instrument moved 0.2 miles ESE and precipitation instrument either did not move or was moved to a location different than that of the temperature instrument;
   800 000 = precipitation instrument moved <0.1 mile, but the temperature instrument did not move; and
   999 NW  = distance unknown, direction NW;

ELEV is the ground elevation at the station, expressed in whole feet above or below mean sea level;

DISTPO is the distance, in tenths of miles, from the nearest post office (e.g., 015 = 1.5 miles), with unknown distances represented by 999;

DIRECTPO is the direction on a 16-point compass from the nearest post office. Values may be either blank, character, or numeric. Unknown directions are represented by 999. Some examples of DISTPO and DIRECTPO combinations are:

   999 999 = distance and direction unknown;
   015 NW  = 1.5 miles NW of post office;
   000 NW  = <0.1 mile NW from post office;
   000 999 = <0.1 mile from post office, direction unknown; and
   000 000 = at the post office.
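The DISTPO and DIRECTPO combinations listed above can be turned into readable text with a few lines of code. The Python sketch below is illustrative only: the function name describe_post_office_offset is hypothetical, treating a blank field the same as 999 is an assumption, and the sketch deliberately does not attempt the DISTANCE and DIRECT pair, whose 8xx and 9xx instrument-move codes need separate handling.

```python
def describe_post_office_offset(distpo: str, directpo: str) -> str:
    """Describe a DISTPO/DIRECTPO pair following the documented examples.

    distpo   -- three-character distance code in tenths of miles
    directpo -- three-character 16-point compass code
    """
    dist = distpo.strip()
    direc = directpo.strip()
    # Assumption: blank fields are handled like the documented 999 code.
    dist_unknown = dist in ("", "999")
    dir_unknown = direc in ("", "999")

    if dist_unknown and dir_unknown:
        return "distance and direction unknown"
    if dist == "000" and direc == "000":
        return "at the post office"
    if dist_unknown:
        return f"distance unknown, direction {direc}"

    # 000 means the offset is under a tenth of a mile; otherwise scale tenths to miles.
    miles = "<0.1 mile" if dist == "000" else f"{int(dist) / 10:.1f} miles"
    if dir_unknown:
        return f"{miles} from post office, direction unknown"
    return f"{miles} {direc} of post office"
```

For example, describe_post_office_offset("015", "NW") yields "1.5 miles NW of post office", matching the second entry in the list above.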
NAME is the full station name; and

QUALIF is a qualifier or description that is added to the proper name of the station (e.g., Charleston 2WNW).

The next 22 variables represent the following instruments and classifications. If an instrument was used at a particular station or if a particular classification is appropriate for that station, the variable will have a value of 1; if it was not used, the variable will have a value of 0.

    1. ADDINST    Additional instrument (wind, pressure, etc.)
    2. COTTON     Cotton region shelter (official, CRS)
    3. DBULB      Dry bulb thermometer
    4. EVAPSTA    Class "A" evaporation station
    5. FISHPORT   Fisher-Porter gage
    6. HYGRO      Hygrothermograph
    7. MINTHERM   Minimum thermometer
    8. MAXTHERM   Maximum thermometer
    9. NORIVGAG   Nonrecording river gage
   10. RAINGAGE   Nonstandard rain gage
   11. SHELTER    Nonstandard shelter
   12. RECRIVER   Recording river gage
   13. RECRAIN    Recording rain gage
   14. SNOW       Snow density gage
   15. STORAGE    Storage gage
   16. STDRAIN    Standard rain gage (SRG)
   17. STDSHELT   Standard shelter (official)
   18. THERMOGR   Thermograph
   19. DIGTHERM   Digital thermometer
   20. TIPBUCK    Tipping bucket gage
   21. OTHEVAP    Other than class "A" evaporation station
   22. MAXMIN     Max/min temperature system

TIMEOBS contains the observation times (2 characters each) for precipitation and temperature, respectively, if both times are known. Values may be either numeric (rounded to the nearest whole hour), character, or alphanumeric. Codes which relate to one or both of the times may also be present. Possible values and their meanings include the following:

   0719 = precipitation amount read at 0700 LST (local standard time), temperatures read at 1900 LST;
   SRSS = precipitation amount read at sunrise, temperatures read at sunset;
   SS99 = precipitation amount read at sunset, time of temperature observations either unknown or no temperature data were available for that period of the record;
   06HR = station observed 6 hours per day (not to be confused with a 6-hourly synoptic observing schedule). How these observations were used to produce precipitation amounts and maximum/minimum temperatures is unclear;
   9079 = ambiguous form; station records only gave one observation time (0700 LST), but it is unknown if this time applies to both precipitation and temperature;
   TRID = Tri-daily temperature observations (TAVG = [7AM + 2PM + (2 x 9PM)]/4), but time of observation for precipitation amount is unknown; and
   RSSS = Precipitation amounts read on a rotating schedule (SR during crop season, i.e., April/May-October/November, but SS otherwise), temperatures read at sunset;

PCPHT is the height of the precipitation instrument above ground level. Values may be numeric or character, with numeric values expressed to the nearest whole foot; and

PCTHT is the height of the temperature instrument above ground level. Values may be numeric or character, with numeric values expressed to the nearest whole foot. Potential values for both PCPHT and PCTHT include the following:

   01-97 = actual height;
   98    = >98 feet;
   99    = missing; and
   RF    = roof, actual height above ground level unknown.

The next 16 variables represent the following forms of publications. If the data from a particular station appeared in a publication, the variable will have a value of 1; if not, the variable will have a value of 0. The variables and their corresponding forms of publications are as follows:

    1. BULLETW    Bulletin W
    2. COMBBUL    Combined Bulletin
    3. CLIMDATA   Climatological Data
    4. RIVSTAGE   Daily River Stages
    5. HYDROBUL   Hydrologic Bulletin
    6. PRECDATA   published as hourly precipitation data
    7. SNOWBULL   Snow Bulletin
    8. NOTPUB     not published
    9. CWB        Report to the chief of the U.S. Weather Bureau
   10. MONTHREV   Monthly Weather Review
   11. STATEPUB   published in state publications
   12. LCD        Local Climatological Data
   13. BQ         Bulletin Q, 1870-1903
   14. SGPD       Storage Gage Precipitation Data, Western United States
   15. WWR        Weekly Weather Review
   16. MYB        U.S. Meteorological Yearbook

OBSNAME is the observer's name (may include more than one name per record);

NUMOBS is the number of observers participating during the time of record for an agency.

HCN/D DATA FILES

The HCN/D data files (HCNDALID.DAT, HCNDILMI.DAT, HCNDMNNY.DAT, HCNDNCTN.DAT, and HCNDTXWY.DAT) contain daily maximum and minimum temperatures (degrees F), precipitation amounts (hundredths of inches), and data flags from the 138 HCN/D stations. The files are sorted by six-digit station number (the two-digit state code followed by the four-digit Cooperative Network Index), year, and month, with one record per month containing station number, data type, year, month, number of days in the month, and 31 daily data values with their respective flags. The data are divided among the five files according to state code (see Sect. 11 of the accompanying documentation for an exact breakdown). The files may be read using the following FORTRAN format:

      INTEGER YEAR,MON,DAYS,VALUE(31)
      CHARACTER*1 SF(31),DMF(31),DQF(31)
      CHARACTER*2 STCODE
      CHARACTER*4 DATTYP
      CHARACTER*6 STAID
      NREC=0
    1 CONTINUE
      READ(5,100,END=99) STAID,DATTYP,YEAR,MON,DAYS,
     +   (SF(I),VALUE(I),DMF(I),DQF(I),I=1,31)
  100 FORMAT(1X,A6,1X,A4,1X,I4,I2,1X,I2,31(1X,A1,I4,2A1))

or by using the SAS format:

   DATA HCND;
   ARRAY DAY {31} $ DAY1-DAY31;
   INFILE IN LRECL=270;
   INPUT @2 STAID $ 2-7 DATTYP $ 9-12 YEAR 14-17 MON 18-19 DAYS 21-22
         @23 (DAY1-DAY31) ($CHAR8.);

(The respective flag and data values contained in each of the 31 elements of the array DAY in the SAS format may be extracted using the SAS code contained in RETRIEVL.SA3 in this subdirectory and also printed in Sect. 14.)

Stated in tabular form (using variable names from the FORTRAN format), the contents of an HCN/D data file include the following.
                 Variable       Variable   Starting   Ending
   Variable      type           width      column     column

   STAID         Character          6          2          7
   DATTYP        Character          4          9         12
   YEAR          Numeric            4         14         17
   MON           Numeric            2         18         19
   DAYS          Numeric            2         21         22
   SF(1)         Alphanumeric       1         24         24
   VALUE(1)      Numeric            4         25         28
   DMF(1)        Alphanumeric       1         29         29
   DQF(1)        Alphanumeric       1         30         30
   SF(2-31)      Alphanumeric       1          *          *
   VALUE(2-31)   Numeric            4          *          *
   DMF(2-31)     Alphanumeric       1          *          *
   DQF(2-31)     Alphanumeric       1          *          *

*May be obtained using:

   COL(N) = COL(1) + (N * 8) - 8,

where COL(N) is the starting/ending column for SF(N), VALUE(N), DMF(N), or DQF(N); COL(1) is the starting/ending column for SF(1), VALUE(1), DMF(1), or DQF(1); and N is the day of the month (2-31).

where

STAID is the station identification number, composed of the two-digit state code followed by the four-digit Cooperative Network Index (defined as character to preserve leading zeros upon output);

DATTYP is the data type (TMAX = maximum temperature, TMIN = minimum temperature, and PRCP = precipitation amount). Some stations do not always have records for all three data types in a given month;

YEAR is the year of the data;

MON is the month of the data;

DAYS is the number of days in the month;

SF(1-31) are the source flags for the daily data values;

VALUE(1-31) are daily data values, with temperatures in whole degrees Fahrenheit and precipitation amounts in hundredths of inches;

DMF(1-31) are the data measurement flags for the daily data values; and

DQF(1-31) are the data quality flags for the daily data values.

Flag codes for the HCN/D data

SF is a code indicating the source of the daily data value. The codes and their meanings are as follows:

   0     = NCDC Tape Deck 3200, Summary of the Day Element Digital File;
   3     = Manuscript - Original Records, NCDC;
   4     = Climatological Data (CD) (monthly NCDC publication);
   5     = Climate Record Book; as described within: History of Climatological Records Books, U.S. Department of Commerce, Weather Bureau, U.S. Government Printing Office (1960);
   Blank = manually estimated (see DQF flag) or missing data value.

DMF is the data measurement flag, which describes how the daily value was measured. The codes and their meanings are as follows:

   A     = amount of accumulated precipitation since last measurement;
   B     = amount of accumulated precipitation since last measurement (includes estimated values);
   E     = manual or automated estimated value (see DQF flag for the particular estimation procedure);
   I     = value determined by spatial interpolation using data from surrounding HCN stations;
   S     = data value is included in a subsequent value;
   T     = Trace of precipitation (data value should equal 0 for a trace); and
   Blank = valid original data (no flag needed) or missing data value.

DQF is the data quality flag. In January 1982, NCDC instituted a greatly enhanced computer algorithm for automated validation of digital data archives. The system checks the internal consistency of a station's data and compares each station's observations to prescribed climatological limits and observations from surrounding stations. Numeric DQF codes apply only to NCDC's digital data, i.e., where the source flag (SF) is equal to "0" for a particular value. Alphabetic codes describe the particular manual or automated NCDC procedure employed to correct or estimate a data value.
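The record layout and the column rule COL(N) = COL(1) + (N * 8) - 8 described above can be exercised with a short Python sketch. It is an illustration under stated assumptions, not a replacement for RETRIEVL.FO3 or RETRIEVL.SA3: the names DailyValue, MonthlyRecord, start_column, and parse_data_record are hypothetical, a blank VALUE field is returned as None, and no other missing-value convention is assumed because none is stated in this section.

```python
from typing import List, NamedTuple, Optional


class DailyValue(NamedTuple):
    day: int
    value: Optional[int]  # whole deg F, or hundredths of inches for PRCP; None if blank
    sf: str               # source flag
    dmf: str              # data measurement flag
    dqf: str              # data quality flag


class MonthlyRecord(NamedTuple):
    staid: str
    dattyp: str
    year: int
    mon: int
    days: int
    daily: List[DailyValue]


def start_column(first_col: int, n: int) -> int:
    """COL(N) = COL(1) + (N * 8) - 8, using 1-based columns."""
    return first_col + n * 8 - 8


def parse_data_record(line: str) -> MonthlyRecord:
    """Slice one monthly record; only days 1..DAYS are returned."""
    line = line.rstrip("\n").ljust(270)   # records are 270 characters wide
    days = int(line[20:22])               # DAYS, columns 21-22
    daily = []
    for n in range(1, days + 1):
        i = start_column(24, n) - 1       # SF(1) starts in column 24; convert to 0-based
        raw = line[i + 1:i + 5]           # VALUE occupies the next four columns
        daily.append(DailyValue(
            day=n,
            value=int(raw) if raw.strip() else None,
            sf=line[i],
            dmf=line[i + 5],
            dqf=line[i + 6],
        ))
    return MonthlyRecord(
        staid=line[1:7],                  # columns 2-7, kept as text
        dattyp=line[8:12],                # columns 9-12
        year=int(line[13:17]),            # columns 14-17
        mon=int(line[17:19]),             # columns 18-19
        days=days,
        daily=daily,
    )
```

Keeping the three flags alongside each value, rather than discarding them, lets later code apply the SF, DMF, and DQF meanings when deciding which values to use.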
The DQF codes and their meanings are as follows:

   0     = valid data;
   1     = valid data (Pre-1982 quality control methods were employed, with only a gross check of the magnitude of the value.);
   3     = invalid data - no edited data value available;
   4     = validity unknown - automated quality control procedures have not been applied;
   A     = substituted temperature from time of observation for TMAX or TMIN;
   B     = time-shifted value;
   F     = adjusted TMAX or TMIN by a multiple of plus or minus 10 degrees;
   L     = switched TMAX and TMIN;
   M     = switched temperature from time of observation with TMAX or TMIN;
   N     = substituted the mean of values taken from the three nearest cooperative weather stations;
   O     = snow and precipitation columns were switched in station's report;
   R     = precipitation amount was not reported, "0" has been inserted;
   S     = manually edited value (derived using one of the procedures described by data quality flags A-R);
   T     = data value failed internal consistency check; and
   Blank = valid data value with source flag other than "0" or missing data value.

REFERENCE

Karl, T. R., C. N. Williams, Jr., and F. T. Quinlan. 1990. United States Historical Climatology Network (HCN) serial temperature and precipitation data. ORNL/CDIAC-30, NDP-019/R1. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tenn.