Predictive Soil Spectroscopy

Authors and affiliations

José Lucas Safanelli, Woodwell Climate Research Center

Robert Minarik, OpenGeoHub Foundation

Tomislav Hengl, OpenGeoHub Foundation

Jonathan Sanderman, Woodwell Climate Research Center

Published

June 17, 2026

Welcome!

Welcome to this training guide on Predictive Soil Spectroscopy! This material was originally developed for an in-person workshop in 2023 and has since been used in multiple training events and reused by many practitioners worldwide. It is now maintained as a living document, open for anyone to access and build upon.

Training workshops:
- Predictive Soil Spectroscopy (pre-conference workshop) at the ASA-CSSA-SSSA International Annual Meeting 2023 (St. Louis, MO, USA).
- Introduction to Soil Spectroscopy Modeling at FUNCEME, Fortaleza, CE, Brazil, 2024.
- Predictive Soil Spectroscopy (a session of the full-day pre-conference workshop “Soil Spectroscopy: The nuts and bolts of soil spectroscopy science and applications”) at the 2026 CSSS-CSA Meeting (University of Guelph, ON, Canada).


Soil spectroscopy, specifically Diffuse Reflectance Spectroscopy, is rapidly becoming a routine tool for soil analysis in academia and in industry.

One of the most popular uses of soil spectroscopy is the rapid and low-cost estimation of particle size distribution, carbon fractions, and clay minerals.

This guide touches on the basics of soil spectroscopy development including project design, considerations for building a spectral library, working with large and public spectral libraries, and predictive modeling.

Most of the learning will focus on using the free and open source R programming language.

This material was updated with R version 4.5.3, and it is recommended to use RStudio as the graphical user interface. Package versions are managed with renv for reproducibility. To restore the exact package environment used in this guide, run renv::restore() after cloning or downloading the repository.
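As a minimal sketch of the restore step described above, assuming the repository has already been cloned or downloaded and that R is started in its root folder (where renv expects to find the renv.lock lockfile):

```r
# Install renv itself if it is missing (assumes a CRAN mirror is reachable)
if (!requireNamespace("renv", quietly = TRUE)) {
  install.packages("renv")
}

# Reinstall the package versions recorded in the project's lockfile
renv::restore()

# Optional: report whether the project library and the lockfile are in sync
renv::status()
```

Opening the project in RStudio normally sets the working directory to the project root, which is where these calls are meant to be run.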

Prerequisites

This training is mostly focused on the use of tidy programming principles with pipe operators, leveraging the R packages from the tidyverse like dplyr, tidyr, and ggplot2.
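To give a feel for this style, here is a small sketch of a piped workflow. The `spectra` table, its column names, and its values are made up purely for illustration and do not come from this guide's datasets. The sketch uses the native pipe `|>` (available since R 4.1), while the chapters themselves may use the magrittr pipe `%>%` instead.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: two samples with reflectance at three wavelengths
# (values are invented for illustration only)
spectra <- tibble(
  sample_id = c("A", "B"),
  `1000` = c(0.21, 0.30),
  `1500` = c(0.25, 0.34),
  `2000` = c(0.23, 0.31)
)

# Reshape from wide (one column per wavelength) to long format,
# then convert the wavelength labels to numbers
spectra_long <- spectra |>
  pivot_longer(
    cols = -sample_id,
    names_to = "wavelength_nm",
    values_to = "reflectance"
  ) |>
  mutate(wavelength_nm = as.numeric(wavelength_nm))

# Plot one line per sample
ggplot(spectra_long, aes(x = wavelength_nm, y = reflectance, colour = sample_id)) +
  geom_line() +
  labs(x = "Wavelength (nm)", y = "Reflectance", colour = "Sample")
```

Each step takes the result of the previous one as its first argument, so the code reads top to bottom in the order the operations are applied.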

For the machine learning framework, this guide uses the tidymodels ecosystem, which provides a consistent and user-friendly interface for model building and evaluation.

We have also included a chemometrics chapter that introduces some common tools and algorithms for working with spectral data. This was made possible by the excellent mdatools package.

We recommend checking this online material periodically, as it may evolve over time and incorporate new methods.

If you are interested in getting started with R using tidy packages and principles, we strongly recommend the R for Data Science book:

  • For installing R and RStudio, it is recommended to check the Prerequisites page.
  • Learning how to set up a basic project in RStudio is neatly described in Workflow: projects.
  • We are going to have several demonstrations of data import and wrangling by piped operations, and plot visualizations with ggplot.

Other spectral operations, such as importing raw spectral files, preprocessing, compression, and modeling, can be done with dedicated packages, e.g., asdreader, opusreader2, prospectr, resemble, and tidymodels.

Disclaimer

Woodwell Climate Research Center, OpenGeoHub Foundation and its suppliers and licensors hereby disclaim all warranties of any kind, express or implied, including, without limitation, the warranties of merchantability, fitness for a particular purpose and non-infringement. Neither Woodwell Climate Research Center, OpenGeoHub Foundation nor its suppliers and licensors, makes any warranty that the Website will be error free or that access thereto will be continuous or uninterrupted. You understand that you download from, or otherwise obtain content or services through, the Website at your own discretion and risk.

If you notice an error or outdated information, please submit a correction/pull request or open an issue.

License

This website/book and the attached software are free to use and are licensed under the MIT License. The OSSL training data and models, if not otherwise indicated, are available under the Creative Commons Attribution 4.0 International (CC-BY) license, the CC-BY-SA license, and/or the Open Data Commons Open Database License (ODbL) v1.0.

Acknowledgments

Soil Spectroscopy for Global Good (SS4GG) is an initiative organized by Woodwell Climate Research Center and OpenGeoHub Foundation. The development of this training material was originally supported by the USDA National Institute of Food and Agriculture award #2020-67021-32467 and is currently maintained with institutional support. We are grateful to all participants of the workshops held since 2023 for their feedback and contributions.

Citing

José Lucas Safanelli, Robert Minarik, Jonathan Sanderman, and Tomislav Hengl. Predictive Soil Spectroscopy. 2025. https://doi.org/10.5281/zenodo.16890797