Introduction to data sources and availability

The data used in this package were originally compiled and processed by the United States Geological Survey (USGS). The fertilizer data cover farm and nonfarm applications from 1945 through 2012. USGS based its estimates on state- or county-level commercial fertilizer sales data from the Association of American Plant Food Control Officials (AAPFCO). State estimates were then allocated to the county level, using fertilizer expenditure from the Census of Agriculture as county weights for farm fertilizer and effective population density as county weights for nonfarm fertilizer. The data sources and further information are listed in Table 1.

| Dataset name | Temporal coverage | Source | Website | Comments |
|---|---|---|---|---|
| Fertilizer data before 1985 | 1945 - 1985 | USGS | Link | Farm data only. |
| Fertilizer data after 1986 | 1986 - 2012 | USGS | Link | Published in 2017. |
| County background data | 2010 | US Census | Link | Assumes county descriptors do not change. |
| Manure data before 1997 | 1982 - 1997 | USGS | Link | Manure data for farms, every five years. |
| Manure data in 2002 | 2002 | USGS | Link | Published in 2013. |
| Manure data in 2007 and 2012 | 2007 & 2012 | USGS | Link | Published in 2017. |
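
The county allocation described above amounts to splitting a state total in proportion to a per-county weight. The following is a minimal sketch of that idea, not the USGS procedure itself: the function name, the county names, and all numbers are hypothetical and chosen only for illustration.

```python
def allocate_state_total(state_total, county_weights):
    """Split a state total across counties in proportion to their weights.

    county_weights maps a county name to a non-negative weight, e.g.
    fertilizer expenditure (farm) or effective population density (nonfarm).
    """
    total_weight = sum(county_weights.values())
    if total_weight <= 0:
        raise ValueError("county weights must sum to a positive number")
    return {
        county: state_total * weight / total_weight
        for county, weight in county_weights.items()
    }


# Hypothetical weights and state total, for illustration only.
farm_expenditure = {"County A": 200.0, "County B": 300.0, "County C": 500.0}
county_farm_fertilizer = allocate_state_total(1000.0, farm_expenditure)
```

Each county's share equals its weight divided by the sum of all weights in the state, so the county values add back up to the state total.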

Data cleaning and processing

Because the county-level fertilizer data were processed at different times and by different researchers, their formats are inconsistent. To save users the time and effort of working with a complicated dataset, the author cleaned the data into tidy data, following these rules from Hadley Wickham:

    1. Each variable must have its own column.
    2. Each observation must have its own row.
    3. Each value must have its own cell.

Fig. 1 shows the rules visually.

Fig. 1 Following three rules makes a dataset tidy: variables are in columns, observations are in rows, and values are in cells.

(The description of tidy data was adapted from R for Data Science.)
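
As an illustration of the three rules, the sketch below reshapes a small "wide" table (one column per year) into tidy rows with one observation each. It is an assumed example, not the package's cleaning code: the column names `fips`, `year`, and `quantity` and all values are hypothetical.

```python
# Hypothetical wide table: one row per county, one column per year.
wide_rows = [
    {"fips": "01001", "1987": 1580, "1992": 1370},
    {"fips": "01003", "1987": 6524, "1992": 5490},
]


def to_tidy(rows, id_key="fips", value_name="quantity"):
    """Turn each (county, year) cell into its own row."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on every tidy row
            tidy.append({id_key: row[id_key], "year": int(key), value_name: value})
    return tidy


tidy_rows = to_tidy(wide_rows)
```

After reshaping, year becomes a variable in its own column rather than being spread across column headers, and each county-year combination sits in its own row.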

Data cleaning

Future development plan

Planned additions to the dataset include:

  • Add the missing data for 1986.
  • Develop a package to retrieve, analyze and visualize the fertilizer data in watersheds.