Generate a reproducible map of county-level fertilizer estimation data in the U.S.A. using R

Introduction and motivation

Nutrient input to agricultural watersheds is a widely studied topic among researchers, engineers, and stakeholders. Researchers at the United States Geological Survey (USGS) spent a considerable amount of time and effort generating a dataset of estimated nutrient inputs from synthetic fertilizer and manure. Based on the dataset published by USGS, the author developed an R package, ggfertilizer, to retrieve, summarize, and visualize fertilizer data for the contiguous U.S.A.

In this post, the author briefly introduces the basics of ggfertilizer and provides a clear workflow for generating a reproducible fertilizer usage map. The post targets users with entry-level knowledge of R and related packages. For advanced R users, a more detailed description is available on the package website.

Why reproducible?

Generally, published research should be reproducible by peers who have the same prerequisites. However, a 2016 report in Nature, "1,500 scientists lift the lid on reproducibility", found that:

More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own experiments.

This so-called reproducibility crisis has raised great concern among researchers, funding agencies, stakeholders, and the public. To increase the credibility and value of research output, it is important to publish it together with the data, results, visualizations, and relevant code.

How to generate a reproducible map?

Prerequisite

All the materials in this post were generated using R, an open-source software environment. Users can easily download and install the R version that matches their operating system via this link.

Some packages are also required to reproduce this post. If you have not installed them, please run the following code.

install.packages("ggplot2")
install.packages("usfertilizer")
install.packages("ggsn")

# Check whether devtools is installed; install it if not.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install ggfertilizer from the author's GitHub repo.
devtools::install_github("wenlong-liu/ggfertilizer")
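
The installation steps above can also be wrapped in a small check that reports only the packages that are still missing. This is a minimal sketch in base R; the helper name missing_packages() is hypothetical and is not part of ggfertilizer.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

cran_pkgs <- c("ggplot2", "usfertilizer", "ggsn", "devtools")
to_install <- missing_packages(cran_pkgs)

# Install only what is missing (left commented so the sketch has no side effects).
# if (length(to_install) > 0) install.packages(to_install)
to_install
```

An empty result means every listed package is already available, so the installation step can be skipped.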

After installing all the packages, we need to load them into the R session to run the following code.

library(ggfertilizer)
library(ggplot2)
library(ggsn)
# Import the pre-packaged dataset.
data("us_fertilizer_county")

Finalize parameters

As mentioned before, the author has already wrapped and released the fertilizer data for the contiguous United States at the county level from 1945 to 2012. Additional details on data sources, compilation, and coverage are available via usfertilizer. First, let us look at the structure of the dataset.

str(us_fertilizer_county)
## Classes 'tbl_df', 'tbl' and 'data.frame':    625580 obs. of  12 variables:
##  $ FIPS      : chr  "01001" "01003" "01005" "01007" ...
##  $ State     : chr  "AL" "AL" "AL" "AL" ...
##  $ County    : chr  "Autauga" "Baldwin" "Barbour" "Bibb" ...
##  $ ALAND     : num  1.54e+09 4.12e+09 2.29e+09 1.61e+09 1.67e+09 ...
##  $ AWATER    : num  2.58e+07 1.13e+09 5.09e+07 9.29e+06 1.52e+07 ...
##  $ INTPTLAT  : num  32.5 30.7 31.9 33 34 ...
##  $ INTPTLONG : num  -86.6 -87.7 -85.4 -87.1 -86.6 ...
##  $ Quantity  : num  1580225 6524369 2412372 304592 1825118 ...
##  $ Year      : chr  "1987" "1987" "1987" "1987" ...
##  $ Nutrient  : chr  "N" "N" "N" "N" ...
##  $ Farm.Type : chr  "farm" "farm" "farm" "farm" ...
##  $ Input.Type: chr  "Fertilizer" "Fertilizer" "Fertilizer" "Fertilizer" ...

The full dataset contains 625,580 observations (rows) and 12 variables (columns). With the R package ggfertilizer, users only need to specify the parameters of interest, including year, nutrient, farm type, and input source. For instance, the author settles on the following parameters for plotting a fertilizer map, which is generated in the following sections.

Year <-  2001
Nutrient <- "N"
Input_Type <- "fertilizer" # nutrient comes from synthetic fertilizer.
Farm_Type <- "farm" # nutrient applied to farms.
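
To make the role of these parameters concrete, the sketch below filters a tiny hand-made data frame that mimics a few columns of us_fertilizer_county. The rows and quantities are invented placeholders, not USGS estimates, and the filtering logic is only an assumption about what such a selection involves, not the internals of ggfertilizer. Note that Year is stored as character in the dataset; the case-insensitive comparison of the input type is a choice made for this sketch.

```r
# Toy data mimicking a few columns of us_fertilizer_county.
# All values are invented placeholders, not USGS estimates.
toy <- data.frame(
  FIPS       = c("01001", "01001", "01003", "01003"),
  Year       = c("2001", "2002", "2001", "2001"),
  Nutrient   = c("N", "N", "N", "P"),
  Farm.Type  = c("farm", "farm", "farm", "farm"),
  Input.Type = c("Fertilizer", "Fertilizer", "Fertilizer", "Manure"),
  Quantity   = c(100, 110, 200, 50),
  stringsAsFactors = FALSE
)

Year <- 2001
Nutrient <- "N"
Input_Type <- "fertilizer"
Farm_Type <- "farm"

# Year is character in the data, so convert the numeric parameter first.
# tolower() makes the input-type comparison case-insensitive (sketch assumption).
selected <- toy[
  toy$Year == as.character(Year) &
    toy$Nutrient == Nutrient &
    tolower(toy$Input.Type) == tolower(Input_Type) &
    toy$Farm.Type == Farm_Type,
]
selected
```

Only the rows that match all four parameters remain, which is the subset a map for one year, nutrient, farm type, and input source would be drawn from.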

Draw a base map

The ggfertilizer package includes a function, map_us_fertilizer(), to draw maps easily. The next step is to feed the finalized parameters into this plotting function.

# draw the map
us_plot <- map_us_fertilizer(data = us_fertilizer_county, Year = Year, Nutrient = Nutrient,
                             Farm_Type = Farm_Type, Input_Type = Input_Type, 
                             add_north = TRUE) # add_north will be used in further sections.
us_plot

Add title

The map is actually a ggplot2 object, so users can modify most of its components using ggplot2 grammar. For example, we can add a title composed from the input parameters.

map_title <- paste(Nutrient, " from ", Input_Type, " input to ", Farm_Type,
                   " in the year of ", Year, " \nat a county level", sep = "")
# add the title.
us_plot <- us_plot +
      ggtitle(map_title)
us_plot
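
A similar title can also be assembled with a format string, which keeps the wording and the parameters visually separate. This is a minimal sketch in base R, with the parameter values repeated so that the snippet runs on its own.

```r
Year <- 2001
Nutrient <- "N"
Input_Type <- "fertilizer"
Farm_Type <- "farm"

# %s works for both character and numeric values.
map_title <- sprintf("%s from %s input to %s in the year of %s\nat a county level",
                     Nutrient, Input_Type, Farm_Type, Year)
cat(map_title)
```

The resulting string can be passed to ggtitle() in the same way as the one built with paste().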

Add north and scale bars

According to practical guidelines for Geographic Information Systems (GIS), a map without a north symbol and a scale bar is not a map. Therefore, we add them to the current map.

# add north symbol and scale bar.
us_plot <- us_plot +
  north(us_plot$states_shape, scale = 0.15, anchor = c(x = -68, y = 50) ) +
  scalebar(us_plot$states_shape, dist = 500, dd2km = TRUE, model = 'WGS84', st.size = 2)

us_plot
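
The scale bar above is drawn on coordinates given in decimal degrees, while its length (dist = 500) is expressed in kilometres, so a conversion between the two is needed. To illustrate why that conversion depends on latitude, here is a minimal sketch of a great-circle (haversine) distance on a sphere. It is an illustration only and not the computation that ggsn performs; the function name haversine_km() and the 6371 km mean Earth radius are assumptions of this sketch.

```r
# Great-circle distance in km between two points given in decimal degrees.
# Assumes a spherical Earth with mean radius 6371 km (sketch assumption).
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# One degree of longitude spans fewer kilometres at higher latitudes.
haversine_km(-100, 30, -99, 30)
haversine_km(-100, 45, -99, 45)
```

Because one degree of longitude shrinks toward the poles, a scale bar on an unprojected map is only approximate away from the latitude at which it is drawn.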

Save a map

After adding all the components, we can save the map for later use. The map can be saved in different formats, such as jpg, pdf, svg, or png. In this post, the author saves it as a jpg picture.

ggsave(filename = "us_fertilizer_map_2001.jpg", plot = us_plot, width = 6, height = 4, scale = 1.5, units = "in")
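
If the same figure is needed in more than one format, the call to ggsave() can be placed in a loop. The sketch below uses a stand-in scatter plot of the built-in mtcars dataset instead of the fertilizer map, and writes to a temporary directory; the file name example_map is a placeholder chosen for this illustration.

```r
library(ggplot2)

# Stand-in plot built from a built-in dataset, so the snippet runs on its own.
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Write the same plot to several formats; ggsave() picks the device
# from the file extension.
out_dir <- tempdir()
formats <- c("pdf", "png")
files <- file.path(out_dir, paste0("example_map.", formats))
for (f in files) {
  ggsave(filename = f, plot = p, width = 6, height = 4, units = "in")
}
file.exists(files)
```

Passing the plot explicitly through the plot argument avoids relying on whichever plot happened to be displayed last.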

Summary

In this post, the author briefly showed how to generate a reproducible fertilizer map using R. All the code and related materials are available via the author's GitHub repo. If you have any questions, please feel free to leave a comment or open an issue on GitHub.

As ggfertilizer is still under heavy development, the author is still testing existing functions and adding more features. In the future, ggfertilizer will be submitted to CRAN so that R users can install and work with the package more conveniently.

R session

sessionInfo()
## R version 3.5.0 (2018-04-23)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS  10.14.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] bindrcpp_0.2.2     maps_3.3.0         ggsn_0.4.0        
## [4] ggplot2_3.1.0      ggfertilizer_0.0.4 usfertilizer_0.1.5
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0        pillar_1.3.1      compiler_3.5.0   
##  [4] plyr_1.8.4        bindr_0.1.1       viridis_0.5.1    
##  [7] tools_3.5.0       digest_0.6.18     lattice_0.20-35  
## [10] viridisLite_0.3.0 evaluate_0.10.1   tibble_2.0.1     
## [13] gtable_0.2.0      png_0.1-7         pkgconfig_2.0.2  
## [16] rlang_0.3.1       mapproj_1.2.6     yaml_2.1.19      
## [19] blogdown_0.6      xfun_0.2          gridExtra_2.3    
## [22] withr_2.1.2       dplyr_0.7.8       stringr_1.3.1    
## [25] knitr_1.20        rprojroot_1.3-2   grid_3.5.0       
## [28] tidyselect_0.2.5  glue_1.3.0        R6_2.3.0         
## [31] foreign_0.8-70    rmarkdown_1.10    bookdown_0.7     
## [34] sp_1.3-1          purrr_0.3.0       magrittr_1.5     
## [37] maptools_0.9-2    backports_1.1.2   scales_1.0.0     
## [40] htmltools_0.3.6   assertthat_0.2.0  colorspace_1.3-2 
## [43] labeling_0.3      stringi_1.2.4     lazyeval_0.2.1   
## [46] munsell_0.5.0     crayon_1.3.4
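
To keep a record like the one above next to the saved map, the session information can be written to a text file. This is a minimal sketch in base R; the file name session_info.txt and the use of a temporary directory are choices made for this illustration.

```r
# Capture the printed session information and write it to a text file.
session_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), session_file)

# Read the first line back to confirm the file was written.
readLines(session_file, n = 1)
```

Storing this file with the code and figures lets readers compare their own R and package versions against the ones used for the post.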
