1 About

This document is UNDER CONSTRUCTION.

“Man’s most human characteristic is not his ability to learn, which he shares with many other species, but his ability to teach and store what others have developed and taught him.” Margaret Mead, Culture and Commitment: The New Relationships Between the Generations in the 1970s

1.1 Soil Spectroscopy for Global Good

SoilSpec4GG is a USDA-funded Food and Agriculture Cyberinformatics Tools Coordinated Innovation Network project (NIFA Award #2020-67021-32467). It brings together soil scientists, spectroscopists, informaticians, data scientists and software engineers to overcome some of the current bottlenecks preventing wider and more efficient use of soil spectroscopy. A series of working groups will be formed to address topics including calibration transfer, model choice, outreach & demonstration, and use of spectroscopy to inform global carbon cycle modeling. For more info refer to: https://soilspectroscopy.org/.

R tutorials and software developed to implement the OSSL are available via: https://github.com/soilspectroscopy.

The Soil Spectroscopy for Global Good project works with other global initiatives including the FAO Global Soil Partnership and the IEEE P4005 Standards and Protocols for Soil Spectroscopy Working Group.

1.2 What is soil spectroscopy?

Soil spectroscopy is the measurement of light absorption when light in the visible, near infrared or mid infrared (Vis–NIR–MIR) regions of the electromagnetic spectrum is applied to a soil surface. The proportion of the incident radiation reflected by soil is sensed through Vis–NIR–MIR reflectance spectroscopy. These characteristic spectra (see Fig. below) can then be used to estimate numerous soil attributes including: minerals, organic compounds and water.

Figure 1.1: Schematic explanation of the soil spectroscopy. For more info see: https://soilspectroscopy.org/.

Figure 1.2: Example of spectral signatures for large number of VisNIR scans (KSSL).
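
The reflectance spectra described in this section are often converted to apparent absorbance before modeling, using the common convention A = log10(1 / R). The following is a minimal sketch of that conversion in base R; the function name `reflectance_to_absorbance` and the reflectance values are hypothetical and are not taken from the OSSL.

```r
# Convert reflectance (a proportion in (0, 1]) to apparent absorbance,
# using the common convention A = log10(1 / R).
reflectance_to_absorbance <- function(reflectance) {
  stopifnot(
    is.numeric(reflectance),
    all(reflectance > 0 & reflectance <= 1, na.rm = TRUE)
  )
  log10(1 / reflectance)
}

# Hypothetical reflectance values at a few wavelengths (not real OSSL data)
wavelength_nm <- c(500, 1000, 1500, 2000, 2500)
reflectance   <- c(0.10, 0.25, 0.40, 0.50, 1.00)
absorbance    <- reflectance_to_absorbance(reflectance)

print(data.frame(wavelength_nm, reflectance, absorbance = round(absorbance, 3)))
```

A reflectance of 1 (all incident light reflected) maps to an absorbance of 0, and lower reflectance gives higher absorbance.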

1.3 Open Soil Spectral Library


Open Soil Spectral Library (OSSL) is a suite of datasets, web-services, software and tutorials. It includes (see also https://github.com/soilspectroscopy):

The OSSL-DB has been prepared following the OSSL schema which is available at:

As a rule of thumb, we recommend that all contributors use the following general OSSL schema to organize Soil Observations & Measurements, with four main tables and with metadata and legends organized in additional tables:

Figure 1.3: Recommended OSSL database schema.
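
As a rough illustration of how four main tables might be combined for analysis, the sketch below joins miniature tables on a shared sample identifier. The table names (`site`, `lab`, `visnir`, `mir`), the column names and the key `sample_id` are all assumptions made for illustration; the actual OSSL schema defines its own names.

```r
# Hypothetical miniature tables; names, columns and values are illustrative only
site <- data.frame(
  sample_id    = c("s1", "s2"),
  longitude_dd = c(36.82, -93.10),
  latitude_dd  = c(-1.29, 44.95)
)
lab <- data.frame(
  sample_id          = c("s1", "s2"),
  ph_h2o             = c(6.1, 7.4),
  organic_carbon_pct = c(1.8, 2.6)
)
visnir <- data.frame(
  sample_id    = c("s1", "s2"),
  visnir_500nm = c(0.12, 0.15),
  visnir_502nm = c(0.13, 0.16)
)
# Only sample s1 has a MIR scan in this toy example
mir <- data.frame(
  sample_id = "s1",
  mir_600cm = 1.42,
  mir_602cm = 1.40
)

# Left-join all tables on the shared key, keeping every site record
combined <- Reduce(
  function(x, y) merge(x, y, by = "sample_id", all.x = TRUE),
  list(site, lab, visnir, mir)
)
print(dim(combined))
```

Samples without a scan in one of the spectral tables keep their site and laboratory values and receive `NA` in the missing spectral columns.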

To access a global compilation of legacy soil point data sets, refer to the https://github.com/OpenGeoHub/SoilSamples repository. To access and use Soil Spectroscopy tools, also refer to https://soilspectroscopy.org/.

Figure 1.4: Up-to-date distribution of points with VisNIR scans.

Figure 1.5: Up-to-date distribution of points with MIR scans.

1.4 OSSL mongoDB

MongoDB is an open-source NoSQL database that is fast, scalable and extendable, with affordable cloud solutions such as MongoDB Atlas and similar. TensorFlow and other cutting-edge ML algorithms can be integrated and served through a GUI.

To access the OSSL DB, it is best to use MongoDB either through a graphical user interface such as Robo 3T or via R. The following parameters (database credentials) allow read-only access to the DB:

  • Name: soilspec4gg
  • Address: api.soilspectroscopy.org
  • Database: soilspec4gg
  • Username: soilspec4gg
  • Password: soilspec4gg

Figure 1.6: Accessing the OSSL DB using MongoDB GUI (https://robomongo.org/download).

First, we need to specify the parameters:

# Read-only database credentials
soilspec4gg.db <- list(
  host = 'api.soilspectroscopy.org',
  name = 'soilspec4gg',
  user = 'soilspec4gg',
  pw = 'soilspec4gg'
)
# Build the connection string: mongodb://user:password@host/database?ssl=true
soilspec4gg.db$url <- paste0(
  'mongodb://', soilspec4gg.db$user, ':',
  soilspec4gg.db$pw, '@',
  soilspec4gg.db$host, '/',
  soilspec4gg.db$name, '?ssl=true'
)

Next, we can initiate connection:

## [1] "Creating the access for mongodb collections."
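
A minimal sketch of what initiating such a connection might look like with the mongolite package is given below. It assumes mongolite is installed and that the server is reachable; the helper name `open_ossl_collections` and the default collection names are hypothetical and are not confirmed by this document.

```r
# Hypothetical helper: open one mongolite handle per collection.
# The default collection names are assumptions, not documented values.
open_ossl_collections <- function(url, db,
                                  collections = c("soilsite", "soillab", "mir", "visnir")) {
  if (!requireNamespace("mongolite", quietly = TRUE)) {
    stop("Package 'mongolite' is required: install.packages('mongolite')")
  }
  handles <- lapply(collections, function(collection_name) {
    mongolite::mongo(collection = collection_name, db = db, url = url)
  })
  names(handles) <- collections
  handles
}

# Usage (requires network access), with the parameters specified earlier:
# ossl <- open_ossl_collections(soilspec4gg.db$url, soilspec4gg.db$name)
# ossl$soilsite$count()   # number of documents in the assumed collection
```

Keeping one handle per collection in a named list makes it easy to pass the whole set to query functions.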

and now we can query and load data directly into R, for example to get a sample from AfSIS1:

id = "icr006475" 
soilspec.sample = soilspec4gg.samplesById(id)
## [1] "Accessing mongodb collections."
 Found 4 records...
 Imported 4 records. Simplifying into dataframe...
 Found 4 records...
 Imported 4 records. Simplifying into dataframe...
 Found 1 records...
 Imported 1 records. Simplifying into dataframe...
 Imported 0 records. Simplifying into dataframe...
## [1]   16 1758


1.5 OSSL API

An OSSL API (Application Programming Interface) is also available and can be used to construct requests to fetch data and models and to generate predictions. The outputs of predictions can be obtained as JSON or CSV files, making the system fully interoperable. The OSSL API is currently based on the plumber package and is provided for testing purposes only. Users can calibrate a maximum of 20 rows per request, but this limit will be gradually extended.
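
Because of the 20-row limit per request, a larger table of spectra has to be sent in several requests. The sketch below shows one way to split a data frame into batches of at most 20 rows in base R. The function name `split_into_batches` and the example data are hypothetical, and no API endpoint is called here.

```r
# Split a data frame into consecutive batches of at most `max_rows` rows
split_into_batches <- function(spectra, max_rows = 20) {
  stopifnot(is.data.frame(spectra), max_rows >= 1)
  if (nrow(spectra) == 0) {
    return(list())
  }
  batch_id <- ceiling(seq_len(nrow(spectra)) / max_rows)
  split(spectra, batch_id)
}

# Hypothetical table with 45 scans (values are placeholders, not real spectra)
spectra <- data.frame(
  id = sprintf("scan%02d", 1:45),
  absorbance = seq(1.00, 1.44, length.out = 45)
)

batches <- split_into_batches(spectra)
print(sapply(batches, nrow))
```

Each element of `batches` can then be written to CSV or converted to JSON and submitted as a separate request.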

Figure 1.7: OSSL API is available for testing.

1.6 Target variables of interest

Soil spectral scans, through a calibration procedure, are used to determine various soil variables. GLOSOLAN’s Standard Operating Procedures (SOPs) list four groups of soil variables of interest to international soil spectroscopy projects:

Soil chemical variables:

  • pH,
  • Carbon,
  • Phosphorus,
  • Potassium,
  • Nitrogen,
  • Exchangeable cations and CEC,
  • Extractable microelements,
  • Trace and major element analyses,
  • Gypsum,
  • Electrical conductivity and total soluble salt content,
  • Soluble sulfate and chloride analysis,
  • Special analysis for peats, mineral and organic soils, agriculture and forest,

Soil physical variables:

  • Bulk density,
  • Coarse fragments,
  • Particle-size distribution,
  • Water retention curve,
  • Porosity,
  • Hydraulic conductivity function,
  • Aggregate stability,
  • Moisture content,

Soil biological variables:

  • Microbial biomass,
  • Soil respiration,
  • Enzyme activity,
  • Microbial identification,

Soil contaminants:

  • Heavy metal elements: As, Hg, Cu, Cd, Pb and similar,
  • Other soil pollutants,

This list is constantly updated. In the OSSL we focus on soil variables for which there are enough global calibration measurements to fit reasonable models. Currently, the largest component of the OSSL is the USDA’s KSSL data, which lists about 60 variables for which there is enough data to fit calibration models.
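
As an illustration of the selection criterion above, the sketch below counts non-missing measurements per variable and keeps only those with enough data. The function name `variables_with_enough_data`, the threshold and the toy table are hypothetical; the document does not state which threshold the OSSL uses.

```r
# Return the names of columns with at least `min_n` non-missing measurements
variables_with_enough_data <- function(lab, min_n) {
  stopifnot(is.data.frame(lab), min_n >= 0)
  n_measured <- colSums(!is.na(lab))
  names(n_measured)[n_measured >= min_n]
}

# Hypothetical laboratory table (placeholder values, not real OSSL data)
lab <- data.frame(
  ph = c(6.1, NA, 7.0, 5.5),
  oc = c(1.2, 2.0, NA, NA),
  ec = c(NA, NA, NA, 0.3)
)

# With an illustrative threshold of 2 measurements, `ec` is dropped
selected <- variables_with_enough_data(lab, min_n = 2)
print(selected)
```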

1.7 Contributing data

We encourage public and private entities to help this project and share SSL data. The following four modes of data sharing are especially encouraged:

  1. Open your data by releasing it under Creative Commons (CC-BY, CC-BY-SA) or the Open Data Commons Open Database License (ODbL). This data can then be directly imported into the OSSL.
  2. Donate a small part (e.g. 5%) of your data (released under CC-BY, CC-BY-SA and/or ODbL). This data can then be directly imported into the OSSL.
  3. Allow the SoilSpectroscopy.org project direct access to your data so that we can run data mining and then release ONLY the results of data mining under an Open Data license.
  4. Use OSSL data to produce new derivative products, then share them through your own infrastructure OR contact us for hosting support.

We can sign professional Data Sharing Agreements with data producers that specify in detail how the data will be used. Our primary interest is in enabling research, sharing and use of models (calibration and prediction), and collaboration of groups across borders.

We take special care that your data is secured, encrypted where necessary, and kept safely, closely following our privacy policy and terms of use.

1.8 Contributing documentation

Please feel free to contribute technical documentation. See the GitHub repository for more detailed instructions.

Information outdated or missing? Please open an issue or, better, correct the text and then make a pull request.

1.9 Contributors

If you have contributed, please also add your name and Twitter, ORCID or blog link below:

Jonathan Sanderman, Tomislav Hengl, Katherine Todd-Brown, Leandro L. Parente, Wanderson de Sousa Mendes

1.10 Disclaimer

Whilst utmost care has been taken by the Soil Spectroscopy project and data authors while collecting and compiling the data, the data is provided “as is”. Woodwell Climate Research Center, University of Florida, OpenGeoHub foundation and its suppliers and licensors hereby disclaim all warranties of any kind, express or implied, including, without limitation, the warranties of merchantability, fitness for a particular purpose and non-infringement. Neither Woodwell Climate Research Center, University of Florida, OpenGeoHub foundation nor its suppliers and licensors, makes any warranty that the Website will be error free or that access thereto will be continuous or uninterrupted. You understand that you download from, or otherwise obtain content or services through, the Website at your own discretion and risk.

In no event shall the data authors, the Soil Spectroscopy project, or relevant funding agencies be liable for any actual, incidental or consequential damages arising from use of the data. By using the Soil Spectroscopy project data, the user expressly acknowledges that the Data may contain some nonconformities, defects, or errors. No warranty is given that the data will meet the user’s needs or expectations or that all nonconformities, defects, or errors can or will be corrected. The user should always verify actual data; therefore the user bears all responsibility in determining whether the data is fit for the user’s intended use.

This document is under construction. If you notice an error or outdated information, please submit a correction / pull request or open an issue.

This is a community project. No profits are being made from building and serving the Open Soil Spectral Library. If you would like to become a sponsor of the project, please contact us via: https://soilspectroscopy.org/contact/.

1.11 Licence

This website/book and the attached software are free to use and are licensed under the MIT License. The OSSL training data and models, if not otherwise indicated, are available under the Creative Commons Attribution 4.0 International (CC-BY) and/or CC-BY-SA license, or the Open Data Commons Open Database License (ODbL) v1.0.

1.12 Suggested literature

Some other connected publications and initiatives describing collation, import and use of soil spectroscopy data:

1.13 Acknowledgments

The Open Soil Spectral Library was made possible by the kind contributions of public and private organizations. Listed based on the date of import:

We are grateful to Wanderson de Sousa Mendes (Leibniz Centre for Agricultural Landscape Research (ZALF)) for help with initial screening of the data for the development of the R code for processing soil spectroscopy data.

For more advanced uses of the soil spectral libraries, we advise contacting the original data producers, especially to get help with using, extending and improving the original SSL data.

We are also grateful to USDA National Institute of Food and Agriculture #2020-67021-32467 for providing funding for this project.


Aitkenhead, Matt J, and Helaina IJ Black. 2018. “Exploring the impact of different input data types on soil variable estimation using the ICRAF-ISRIC global soil spectral database.” Applied Spectroscopy 72 (2): 188–98. https://doi.org/10.1177/0003702817739013.
Ayres, Edward. 2019. “Quantitative Guidelines for Establishing and Operating Soil Archives.” Soil Science Society of America Journal 83 (4): 973–81. https://doi.org/10.2136/sssaj2019.02.0050.
Garrity, Dennis, and Prem Bindraban. 2004. A Globally Distributed Soil Spectral Library Visible Near Infrared Diffuse Reflectance Spectra. Nairobi, Kenya: ICRAF (World Agroforestry Centre) / ISRIC (World Soil Information) Spectral Library. https://doi.org/10.34725/DVN/MFHA9C.
Orgiazzi, Alberto, Cristiano Ballabio, Panagiotis Panagos, Arwyn Jones, and Oihane Fernández-Ugalde. 2018. “LUCAS Soil, the largest expandable soil dataset for Europe: a review.” European Journal of Soil Science 69 (1): 140–53. https://doi.org/10.1111/ejss.12499.
Sanderman, Jonathan, Kathleen Savage, and Shree RS Dangal. 2020. “Mid-infrared spectroscopy for prediction of soil health indicators in the United States.” Soil Science Society of America Journal 84 (1): 251–61. https://doi.org/10.1002/saj2.20009.
Summerauer, L., P. Baumann, L. Ramirez-Lopez, M. Barthel, M. Bauters, B. Bukombe, M. Reichenbach, et al. 2021. “The Central African Soil Spectral Library: A New Soil Infrared Repository and a Geographical Prediction Analysis.” SOIL 7 (2): 693–715. https://doi.org/10.5194/soil-7-693-2021.
Vagen, Tor-Gunnar, Leigh Ann Winowiecki, Luseged Desta, Ebagnerin Jerome Tondoh, Elvis Weullow, Keith Shepherd, and Andrew Sila. 2020. Mid-Infrared Spectra (MIRS) from ICRAF Soil and Plant Spectroscopy Laboratory: Africa Soil Information Service (AfSIS) Phase I 2009-2013. World Agroforestry - Research Data Repository. https://doi.org/10.34725/DVN/QXCWP1.
Wijewardane, Nuwan K, Yufeng Ge, Skye Wills, and Zamir Libohova. 2018. “Predicting physical and chemical properties of US soils with a mid-infrared reflectance spectral library.” Soil Science Society of America Journal 82 (3): 722–31. https://doi.org/10.2136/sssaj2017.10.0361.