---
title: "Functions and Data Sources that Depend on SoilWeb"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{SoilWeb Functions and Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  eval = !as.logical(Sys.getenv("R_SOILDB_SKIP_LONG_EXAMPLES", unset = "TRUE")),
  message = FALSE,
  warning = FALSE,
  fig.height = 7,
  fig.width = 7,
  fig.retina = 2
)
```

```{r get-data, echo=FALSE}
# get source data for explanations
library(soilDB)
library(aqp)

x <- fetchOSD(c('miami', 'pierre', 'tristan', 'lucy', 'musick'), extended = TRUE)
```

# Summary

The soilDB package relies on a number of data sources curated as part of the [SoilWeb](https://casoilresource.lawr.ucdavis.edu/soilweb-apps) ecosystem of web applications, APIs, web coverage services (WCS), and internally-used lookup tables. These data are always updated shortly after the beginning of each fiscal year (October 1st) as part of the SSURGO refresh cycle. Within-cycle updates are typically performed quarterly.

Many of these curated data sources and their evolution through time have been documented in:

* O'Geen, A., Walkinshaw, M. and Beaudette, D.E. (2017), SoilWeb: A Multifaceted Interface to Soil Survey Information. Soil Sci. Soc. Am. J., 81: 853-862. https://doi.org/10.2136/sssaj2016.11.0386n
* Beaudette, D.E. and O'Geen, A.T. (2010), An iPhone Application for On-Demand Access to Digital Soil Survey Information. Soil Sci. Soc. Am. J., 74: 1682-1684. https://doi.org/10.2136/sssaj2010.0144N
* Beaudette, D.E. and O’Geen, A.T. (2009), Soil-Web: An online soil survey for California, Arizona, and Nevada, Comp. & Geo. Sci., 35:2119-2128, https://doi.org/10.1016/j.cageo.2008.10.016

Note that this vignette as published on CRAN or from a local R package installation will not contain output from inline code sections.
See the Articles tab within the [soilDB package website](https://ncss-tech.github.io/soilDB/) for a more comprehensive version of this document.

## Primary Data Sources

* [SSURGO](https://www.nrcs.usda.gov/resources/data-and-reports/soil-survey-geographic-database-ssurgo), the detailed soil survey of the United States
* [STATSGO2](https://www.nrcs.usda.gov/resources/data-and-reports/description-of-statsgo2-database), the generalized soil survey of the United States
* Official Series Descriptions (OSD) via [SoilKnowledgeBase](https://github.com/ncss-tech/SoilKnowledgeBase/)
* Soil Classification (SC) database (no public version)
* [USDA Ag. Handbook 296 -- Major Land Resource Area (MLRA)](https://www.nrcs.usda.gov/resources/data-and-reports/major-land-resource-area-mlra)
* Climate ([PRISM](https://prism.oregonstate.edu/)) data and derivatives
* Digital elevation model (DEM) derivatives
* [NCSS Lab Data Mart](https://ncsslabdatamart.sc.egov.usda.gov/) (LDM) snapshot
* NRCS [Rapid Carbon Assessment](https://www.nrcs.usda.gov/resources/data-and-reports/rapid-carbon-assessment-raca) (RaCA)

## `soilDB` Functions

The following functions interact with SoilWeb APIs or Web Coverage Service (WCS) to request and download tabular or spatial data. Some functions return `SoilProfileCollection` objects, as defined in the [aqp](https://ncss-tech.github.io/aqp/) package.

Tabular data and/or `SoilProfileCollection`:

* `fetchOSD()`: get soil morphology and many soil series level summaries for one or more soil series names
* `OSDquery()`: search the OSD records using the PostgreSQL full-text query syntax
* `siblings()`: get names and basic information about soil series that co-occur within soil map units containing a given series

Spatial data:

* `seriesExtent()`: get vector or raster representations of where soil series have been used in SSURGO
* `taxaExtent()`: get raster representations of where taxa and formative elements (Soil Taxonomy) have been used in SSURGO

WCS:

* `mukey.wcs()`: get raster data describing map unit keys, from a variety of sources, by bounding-box
* `ISSR800.wcs()`: get raster data describing generalized patterns in soil property and condition, by bounding-box
* `soilColor.wcs()`: get raster soil color maps by bounding-box

The `fetchOSD()` function, when called with `extended = TRUE`, will include the last updated date for curated sources.

```{r, eval=FALSE}
# typical invocation
library(soilDB)
library(aqp)

x <- fetchOSD(c('lucy'), extended = TRUE)

# access metadata
x$soilweb.metadata
```

An abbreviated example of these metadata will look like this:

|product                       |last_update |
|:-----------------------------|:-----------|
|ISSR800                       |2024-10-30  |
|KSSL snapshot                 |2025-01-21  |
|MLRA membership               |2025-10-07  |
|OSD fulltext                  |2026-01-02  |
|OSD morphology                |2026-01-02  |
|SC database                   |2026-01-02  |
|series climate summary        |2025-10-08  |
|series-ESID cross-tabulation  |2025-10-07  |
|SSURGO geomorphology          |2025-10-06  |
|SSURGO geomorphon             |2026-01-11  |
|SSURGO NCCPI Stats            |2025-10-19  |
|SSURGO parent material        |2025-10-06  |
|series extent                 |2025-10-04  |
|taxa extent                   |2025-10-21  |

Details about these data sources are provided below.

### Functions with Questionable Future

Some functions represent temporary solutions to data delivery problems that made sense in the past.
`fetchKSSL()` relies on an older snapshot of the NCSS LDM (2020), `fetchRaCA()` relies on a pre-release of the current RaCA database, and `SoilWeb_spatial_query()` has been superseded by soilDB functions which interface with Soil Data Access (see [`SDA_query()`](https://ncss-tech.github.io/soilDB/reference/SDA_query.html) and [`SDA_spatialQuery()`](https://ncss-tech.github.io/soilDB/reference/SDA_spatialQuery.html)).

* `fetchKSSL()`: get soil characterization and (limited) morphology data (soil color, structure, redoximorphic features)
* `fetchRaCA()`: get records associated with the RaCA collection
* `SoilWeb_spatial_query()`: simple interface to spatial intersection between SSURGO polygons and a user-supplied bounding-box

There are currently no replacements for the full functionality provided by `fetchKSSL()` and `fetchRaCA()`. See [`fetchLDM()`](https://ncss-tech.github.io/soilDB/reference/fetchLDM.html) for a modern approach to downloading NCSS LDM snapshot data.

## Related Web Applications

The [`OSDquery()`](https://ncss-tech.github.io/soilDB/reference/OSDquery.html) function provides a convenient interface to the API behind the SoilWeb OSD Search applications:

* https://soilmap4-1.lawr.ucdavis.edu/osd-search/index.php
* https://casoilresource.lawr.ucdavis.edu/osd-search/search-entire-osd.php

The web-based version of this search is limited to 100 records, but `OSDquery()` has no record limit.

The [`seriesExtent()`](https://ncss-tech.github.io/soilDB/reference/seriesExtent.html) function provides an interface to the data behind the [SoilWeb Series Extent Explorer](https://casoilresource.lawr.ucdavis.edu/see/) (SEE) application. The SEE website includes a short description of how the source data were created.

The [`taxaExtent()`](https://ncss-tech.github.io/soilDB/reference/taxaExtent.html) function provides an interface to the data behind the [SoilWeb Soil Taxonomy Explorer](https://casoilresource.lawr.ucdavis.edu/ste/) (STE) application.
The STE website includes a short description of how the source data were created.

## Relevant Tutorials

* [Querying Soil Series Data](https://ncss-tech.github.io/AQP/soilDB/soil-series-query-functions.html)
* [Soil Series Co-Occurrence Data](https://ncss-tech.github.io/AQP/soilDB/siblings.html)
* [Competing Soil Series](https://ncss-tech.github.io/AQP/soilDB/competing-series.html)
* [Exploring Geomorphic Summaries](https://ncss-tech.github.io/AQP/soilDB/exploring-geomorph-summary.html)
* [What does a subgroup look like?](https://ncss-tech.github.io/AQP/soilDB/subgroup-series.html)
* [Map Unit Key Web Coverage Service](https://ncss-tech.github.io/AQP/soilDB/WCS-demonstration-01.html)
* [Investigating Soil Series Extent](https://ncss-tech.github.io/AQP/soilDB/series-extent.html)
* [Geographic Extent of Soil Taxa](https://ncss-tech.github.io/AQP/soilDB/taxa-extent.html)

# SoilWeb Curated Data Sources

## SC and OSD Derivatives

Competing (same family classification) soil series information is derived from the Soil Classification database. Geographically associated soils are derived from the OSD records. These data are available via `fetchOSD()`.

Competing soil series.

```{r eval=FALSE}
# series names listed in "competing" have the same family classification as "series"
head(x$competing)
```

```{r echo=FALSE}
knitr::kable(head(x$competing))
```

Geographically associated series. Series in the same region may have several geographically associated series in common, and can be modeled using [directed graphs](https://en.wikipedia.org/wiki/Directed_graph).

```{r eval=FALSE}
# series names listed in "gas" are geographically associated with "series"
head(x$geog_assoc_soils)
```

```{r echo=FALSE}
# series names listed in "gas" are geographically associated with "series"
knitr::kable(head(x$geog_assoc_soils))
```

## SSURGO Derivatives

SSURGO components are often named for soil series.
Soil series summaries derived from SSURGO are coordinated using a normalized form of component name and soil series:

* names are converted to upper case
* component names are stripped of modifiers such as "variant" and "family"

See the [Soil Survey Manual](https://www.nrcs.usda.gov/resources/guides-and-instructions/soil-survey-manual) for more complete definitions of "map unit", "component", and "soil series".

The [Querying Soil Series Data](https://ncss-tech.github.io/AQP/soilDB/soil-series-query-functions.html) tutorial contains additional, relevant examples.

### MLRA

Derived from the spatial intersection between MLRA and SSURGO polygons, with area computed on the ellipsoid from geographic coordinates. Membership values are area proportions by soil series.

MLRA "membership" for the [LUCY](https://casoilresource.lawr.ucdavis.edu/sde/?series=lucy#osd) soil series.

```{r eval=FALSE}
.mlra <- x$mlra[x$mlra$series == 'LUCY', ]
.mlra[order(.mlra$membership, decreasing = TRUE), ]
```

```{r echo=FALSE}
.mlra <- x$mlra[x$mlra$series == 'LUCY', ]
knitr::kable(.mlra[order(.mlra$membership, decreasing = TRUE), ], digits = 2, row.names = FALSE)
```

A soil series can occur in more than one MLRA; therefore, MLRA membership can be modeled using [directed graphs](https://en.wikipedia.org/wiki/Directed_graph).

### Geomorphic Summaries

Geomorphic summaries are computed from SSURGO component "geomorphology" tables. These represent a cross-tabulation of soil series name x geomorphic position in several landform and surface shape description systems. These systems are defined in the [Field Book for Describing and Sampling Soils](https://www.nrcs.usda.gov/resources/guides-and-instructions/field-book-for-describing-and-sampling-soils). There are several associated functions in the [sharpshootR](https://github.com/ncss-tech/sharpshootR/) package for visualizing these summaries (e.g. [`vizHillslopePosition()`](http://ncss-tech.github.io/sharpshootR/reference/vizHillslopePosition.html)).

* hillslope position (2D)
* geomorphic component: hills (3D)
* geomorphic component: mountains (3D)
* geomorphic component: terrace (3D)
* geomorphic component: flats (3D)
* surface curvature across-slope
* surface curvature down-slope

Note that the `n` column in each table is the number of component geomorphic data records associated with each soil series. It is possible for a single component to have multiple geomorphic positions defined.

```{r eval=FALSE}
# hillslope position
head(x$hillpos)

# geomorphic component: hills
head(x$geomcomp)

# geomorphic component: mountains
head(x$mtnpos)

# geomorphic component: terraces
head(x$terrace)

# geomorphic component: flats
head(x$flats)

# surface curvature across slope
head(x$shape_across)

# surface curvature down slope
head(x$shape_down)
```

```{r echo=FALSE}
knitr::kable(head(x$hillpos), digits = 2, caption = 'hillslope position')
knitr::kable(head(x$geomcomp), digits = 2, caption = 'geomorphic component: hills')
knitr::kable(head(x$mtnpos), digits = 2, caption = 'geomorphic component: mountains')
knitr::kable(head(x$terrace), digits = 2, caption = 'geomorphic component: terraces')
knitr::kable(head(x$flats), digits = 2, caption = 'geomorphic component: flats')
knitr::kable(head(x$shape_across), digits = 2, caption = 'surface curvature across-slope')
knitr::kable(head(x$shape_down), digits = 2, caption = 'surface curvature down-slope')
```

### Siblings

SoilWeb defines the term "siblings" as those components or soil series that co-occur within map units. The `siblings()` function returns siblings for a single soil series, with a tabulation of how many times each sibling shares a common map unit.

Siblings of the PIERRE soil series, limited to just major components. The `n` column describes how many map units are shared between a sibling and the PIERRE series.

```{r, eval=FALSE}
sib <- siblings('PIERRE', only.major = TRUE)
head(sib$sib)
```

```{r, echo=FALSE}
sib <- siblings('PIERRE', only.major = TRUE)
knitr::kable(sib$sib)
```

### Other

Percentiles of the [National Commodity Crop Productivity Index](https://www.nrcs.usda.gov/sites/default/files/2023-01/NCCPI-User-Guide.pdf) are computed from SSURGO component records, by soil series name. These include both irrigated and non-irrigated versions of the NCCPI.

```{r eval=FALSE}
head(x$NCCPI)
```

```{r echo=FALSE}
knitr::kable(head(x$NCCPI), digits = 2)
```

Ecological classification membership is computed from map unit polygon area and component percentages.

```{r eval=FALSE}
head(x$ecoclassid)
```

```{r echo=FALSE}
knitr::kable(head(x$ecoclassid), digits = 2)
```

### Parent Material Summaries

Parent material kind and origin, tabulated by soil series name. The `n` column is the number of component parent material records associated with a specific parent material kind or origin. The `total` column is the total number of component parent material records by soil series. The final column, `P`, is the associated proportion.

```{r eval=FALSE}
head(x$pmkind)
head(x$pmorigin)
```

```{r echo=FALSE}
knitr::kable(head(x$pmkind), digits = 2)
knitr::kable(head(x$pmorigin), digits = 2)
```

## Series and Taxa Extent

Soil series extent vector data are available for all areas where SSURGO has been completed. Vector extents are described as polygons which generalize the underlying SSURGO map unit polygons (EPSG:4326).

Soil series extent raster data are available as 800m grids (EPSG:5070) within CONUS only.

Taxonomic classes and formative elements are available as 800m grids (EPSG:5070) within CONUS only.

Grids are

## Web Coverage Services

Web Coverage Services (WCS) provided by SoilWeb are still an experimental interface to snapshots of authoritative soil survey and derived data sources. The update cycle is slower than the typical SSURGO refresh.
All WCS requests return data as compressed GeoTIFF (EPSG:5070).

## Climate Data and Derivatives

These maps are derived from the daily, 800m resolution, PRISM data spanning 1981--2010.

* Mean annual air temperature (deg. C), derived from daily minimum and maximum temperatures.
* Mean accumulated annual precipitation (mm), derived from daily totals.
* Mean monthly temperature (deg. C), derived from daily minimum and maximum temperatures.
* Mean accumulated monthly precipitation (mm), derived from daily totals.
* Estimated monthly potential evapotranspiration, [Thornthwaite, 1948](https://en.wikipedia.org/wiki/Potential_evapotranspiration#Thornthwaite_equation_(1948))

Percentiles of each variable are computed by soil series, from a sampling of one point per SSURGO map unit polygon.

An example of annual and monthly climate percentiles.

```{r eval=FALSE}
head(x$climate.annual)
head(x$climate.monthly)
```

```{r echo=FALSE}
knitr::kable(head(x$climate.annual), digits = 0)
knitr::kable(head(x$climate.monthly), digits = 0)
```

### Frost-Free Period

Number of days in the 50%, 80%, and 90% probability frost-free period, derived from daily minimum temperatures greater than 0 degrees C. These maps are based on 50/80/90 percent probability estimates for the last spring frost and first fall frost (day of year). See the related [algorithm documentation](http://ncss-tech.github.io/AQP/sharpshootR/FFD-estimates.html) for details.

Values have been cross-checked with 300+ weather stations in CA.

Model: `ols(formula = ffd.50 ~ prism_ffd, data = z)`

|          | Model Likelihood Ratio Test | Discrimination Indexes |
|:---------|:----------------------------|:-----------------------|
| Obs 328  | LR χ² 526.00                | R² 0.799               |
| σ 42.2446 | d.f. 1                     | R² adj 0.798           |
| d.f. 326 | Pr(>χ²) 0.0000              | g 95.935               |

Residuals:

| Min      | 1Q      | Median | 3Q     | Max     |
|---------:|--------:|-------:|-------:|--------:|
| -278.344 | -16.875 | 2.436  | 14.323 | 274.604 |

Coefficients:

|           | β       | S.E.   | t     | Pr(>\|t\|) |
|:----------|--------:|-------:|------:|-----------:|
| Intercept | 15.1397 | 5.3455 | 2.83  | 0.0049     |
| prism_ffd | 0.9407  | 0.0261 | 35.98 | <0.0001    |
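
As a rough illustration of how to read the fitted coefficients, the sketch below applies the intercept and slope from the table above to a few PRISM-derived frost-free period values. It assumes that `ffd.50` is the station-derived 50% probability frost-free period and `prism_ffd` is the corresponding PRISM-derived estimate. The input values are hypothetical and chosen only for the arithmetic.

```r
# coefficients copied from the regression summary above
intercept <- 15.1397
slope <- 0.9407

# hypothetical PRISM-derived frost-free period estimates (days),
# used only for illustration
prism_ffd <- c(100, 200, 300)

# fitted values from the linear model: ffd.50 = intercept + slope * prism_ffd
predicted_ffd <- intercept + slope * prism_ffd

# round to one decimal place for display
print(round(predicted_ffd, 1))
```

With a slope near 1 and a modest intercept, the fitted values stay close to the PRISM-derived inputs over this range. The residual standard error (σ of about 42 days) indicates that individual stations can still differ considerably from the fitted line.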