Open Soil Spectral Library
22 December, 2021
This document is UNDER CONSTRUCTION.
“Man’s most human characteristic is not his ability to learn, which he shares with many other species, but his ability to teach and store what others have developed and taught him.” Margaret Mead, Culture and Commitment: The New Relationships Between the Generations in the 1970s
1.1 Soil Spectroscopy for Global Good
SoilSpec4GG is a USDA-funded Food and Agriculture Cyberinformatics Tools Coordinated Innovation Network project (NIFA Award #2020-67021-32467). It brings together soil scientists, spectroscopists, informaticians, data scientists and software engineers to overcome some of the current bottlenecks preventing wider and more efficient use of soil spectroscopy. A series of working groups will be formed to address topics including calibration transfer, model choice, outreach & demonstration, and use of spectroscopy to inform global carbon cycle modeling. For more information, refer to: https://soilspectroscopy.org/.
R tutorials and software developed to implement the OSSL are available via: https://github.com/soilspectroscopy.
The Soil Spectroscopy for Global Good project works with other global initiatives, including the FAO Global Soil Partnership and the IEEE P4005 Standards and Protocols for Soil Spectroscopy Working Group.
1.2 What is soil spectroscopy?
Soil spectroscopy is the measurement of light absorption when light in the visible, near infrared or mid infrared (Vis–NIR–MIR) regions of the electromagnetic spectrum is applied to a soil surface. The proportion of the incident radiation reflected by soil is sensed through Vis–NIR–MIR reflectance spectroscopy. These characteristic spectra (see Fig. below) can then be used to estimate numerous soil attributes including: minerals, organic compounds and water.
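Reflectance and absorbance are two views of the same measurement. As a minimal sketch, assuming reflectance is stored as a fraction between 0 and 1, the conventional transformation to apparent absorbance is A = log10(1/R). The wavenumbers and reflectance values below are made up for illustration and are not OSSL data; the function name is also only illustrative:

```r
# Convert reflectance (fraction in (0, 1]) to apparent absorbance: A = log10(1 / R)
reflectance_to_absorbance <- function(reflectance) {
  stopifnot(is.numeric(reflectance), all(reflectance > 0 & reflectance <= 1))
  log10(1 / reflectance)
}

# Made-up example values (not taken from the OSSL)
scan <- data.frame(
  wavenumber = c(4000, 3000, 2000, 1000),
  reflectance = c(1, 0.5, 0.1, 0.01)
)
scan$absorbance <- reflectance_to_absorbance(scan$reflectance)
print(scan)
```

Lower reflectance maps to higher absorbance, so absorption features appear as peaks rather than dips after the transformation.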
1.3 Open Soil Spectral Library
Open Soil Spectral Library (OSSL) is a suite of datasets, web-services, software and tutorials. It includes (see also https://github.com/soilspectroscopy):
- A soil spectral DB (mongoDB),
- API calibration service available from https://api.soilspectroscopy.org,
- Front-end solutions: OSSL Engine and Explorer,
- An R package ossl with all functionality used by the API,
- Registry of global and local calibration models (https://github.com/soilspectroscopy/),
- Tutorials included in this book,
The OSSL-DB has been prepared following the OSSL schema which is available at:
As a general rule of thumb, we recommend that all contributors use the following general OSSL scheme to organize Soil Observations & Measurements, with four main tables and with metadata and legends organized in other tables:
To access a global compilation of soil legacy point data sets, refer to the https://github.com/OpenGeoHub/SoilSamples repository. To access and use Soil Spectroscopy tools, also refer to https://soilspectroscopy.org/.
1.4 OSSL mongoDB
MongoDB is an open-source NoSQL database that is fast, scalable and extendable (cloud solutions such as MongoDB Atlas and similar are available at affordable cost). TensorFlow and other cutting-edge ML algorithms can be integrated and served through a GUI.
The OSSL DB is best accessed through MongoDB, either via a graphical user interface such as Robo 3T or from R using the mongolite package. The following parameters (database credentials) allow read-only access to the DB:
- Name: soilspec4gg
- Address: api.soilspectroscopy.org
- Database: soilspec4gg
- Username: soilspec4gg
- Password: soilspec4gg
First, we need to specify the parameters:
library(mongolite)
library(jsonify)
source("R/ossl_functions.R")
# Read-only credentials for the OSSL database
soilspec4gg.db = list(
  host = 'api.soilspectroscopy.org',
  name = 'soilspec4gg',
  user = 'soilspec4gg',
  pw = 'soilspec4gg'
)
# Assemble the MongoDB connection string
soilspec4gg.db$url <- paste0(
  'mongodb://', soilspec4gg.db$user, ':',
  soilspec4gg.db$pw, '@',
  soilspec4gg.db$host, '/',
  soilspec4gg.db$name, '?ssl=true'
)
Next, we can initiate the connection:
## [1] "Creating the access for mongodb collections."
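The helper that produces this message is defined in R/ossl_functions.R and is not reproduced here. As a rough sketch of what opening a single collection with mongolite directly could look like, the code below builds the same connection string and wraps mongolite::mongo(). The function names build_mongo_url and open_collection, and the collection name "soillab", are hypothetical and used only for illustration; the actual collection names should be checked against the OSSL schema.

```r
# Read-only credentials as listed above
soilspec4gg.db <- list(
  host = "api.soilspectroscopy.org",
  name = "soilspec4gg",
  user = "soilspec4gg",
  pw   = "soilspec4gg"
)

# Assemble the MongoDB connection string from the credentials
build_mongo_url <- function(db) {
  paste0("mongodb://", db$user, ":", db$pw, "@", db$host, "/", db$name, "?ssl=true")
}

# Open one collection; requires the mongolite package and network access
open_collection <- function(collection, db) {
  if (!requireNamespace("mongolite", quietly = TRUE)) {
    stop("Package 'mongolite' is required for this sketch.")
  }
  mongolite::mongo(collection = collection, db = db$name, url = build_mongo_url(db))
}

# Not run here because it needs a live connection.
# "soillab" is a hypothetical collection name:
# con <- open_collection("soillab", soilspec4gg.db)
# con$count()
```

Keeping the URL builder separate from the connection step makes it possible to check the connection string without touching the network.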
Now we can query and load data directly into R, for example to get a sample from AfSIS1:
id = "icr006475"
soilspec.sample = soilspec4gg.samplesById(id)
## [1] "Accessing mongodb collections."
## Found 4 records... Imported 4 records. Simplifying into dataframe...
## Found 4 records... Imported 4 records. Simplifying into dataframe...
## Found 1 records... Imported 1 records. Simplifying into dataframe...
## Imported 0 records. Simplifying into dataframe...
## [1] 16 1758
1.5 OSSL API
The OSSL API (Application Programming Interface) is also available and can be used to construct requests to fetch data and models and to generate predictions. The outputs of predictions can be obtained as JSON or CSV files, making the system fully interoperable. The OSSL API is at the moment based on the plumber package and is provided for testing purposes only. Users can calibrate a maximum of 20 rows per request, but this limit will be gradually extended.
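The API endpoints themselves are not described in this section, so the following sketch only illustrates the client-side consequence of the 20-row limit: a larger table of spectra has to be split into batches before each batch is sent in its own request. The function name split_into_batches and the example table are assumptions made for illustration, not part of the OSSL tooling.

```r
# Split a data frame into consecutive batches of at most `max_rows` rows
split_into_batches <- function(df, max_rows = 20) {
  stopifnot(is.data.frame(df), max_rows >= 1)
  if (nrow(df) == 0) return(list())
  batch_id <- ceiling(seq_len(nrow(df)) / max_rows)
  unname(split(df, batch_id))
}

# Made-up table standing in for 45 scans (one row per scan)
spectra <- data.frame(
  id = sprintf("scan%02d", 1:45),
  value = (1:45) / 10
)
batches <- split_into_batches(spectra)

# Number of rows in each batch
sapply(batches, nrow)
```

Each element of batches can then be serialized (for example to JSON or CSV) and submitted separately, and the results combined afterwards.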
1.6 Target variables of interest
Soil spectral scans are used, through a calibration procedure, to determine various soil variables. GLOSOLAN’s Standard Operating Procedures (SOPs) list four groups of soil variables of interest to international soil spectroscopy projects:
Soil chemical variables:
- Exchangeable cations and CEC,
- Extractable microelements,
- Trace and major element analyses,
- Electrical conductivity and total soluble salt content,
- Soluble sulfate and chloride analysis,
- Special analysis for peats, mineral and organic soils, agriculture and forest,
Soil physical variables:
- Bulk density,
- Coarse fragments,
- Particle-size distribution,
- Water retention curve,
- Hydraulic conductivity function,
- Aggregate stability,
- Moisture content,
Soil biological variables:
- Microbial biomass,
- Soil respiration,
- Enzyme activity,
- Microbial identification,
- Heavy metal elements: As, Hg, Cu, Cd, Pb and similar,
- Other soil pollutants,
This list is constantly updated. In the OSSL we focus on soil variables for which there are enough global calibration measurements to fit reasonable models. Currently, the largest component of the OSSL is the USDA’s KSSL data, which lists about 60 variables for which there is enough data to fit calibration models.
1.7 Contributing data
We encourage public and private entities to help this project and share SSL data. The following four modes of data sharing are especially encouraged:
- Open your data by releasing it under Creative Commons (CC-BY, CC-BY-SA) or Open Data Commons Open Database License (ODbL). This data can then be directly imported into the OSSL.
- Donate a small part (e.g. 5%) of your data (released under CC-BY, CC-BY-SA and/or ODbL). This data can then be directly imported into the OSSL.
- Allow the SoilSpectroscopy.org project direct access to your data so that we can run data mining and then release ONLY the results of data mining under an Open Data license.
- Use OSSL data to produce new derivative products, then share them through your own infrastructure OR contact us for hosting support.
We can sign professional Data Sharing Agreements with data producers that specify in detail how the data will be used. Our primary interest is in enabling research, sharing and use of models (calibration and prediction) and collaboration of groups across borders.
1.8 Contributing documentation
Please feel free to contribute technical documentation. See GitHub repository for more detailed instructions.
If you have contributed, also add your name and Twitter, ORCID or blog link below:
Whilst utmost care has been taken by the Soil Spectroscopy project and data authors while collecting and compiling the data, the data is provided “as is”. Woodwell Climate Research Center, University of Florida, OpenGeoHub foundation and its suppliers and licensors hereby disclaim all warranties of any kind, express or implied, including, without limitation, the warranties of merchantability, fitness for a particular purpose and non-infringement. Neither Woodwell Climate Research Center, University of Florida, OpenGeoHub foundation nor its suppliers and licensors, makes any warranty that the Website will be error free or that access thereto will be continuous or uninterrupted. You understand that you download from, or otherwise obtain content or services through, the Website at your own discretion and risk.
In no event shall the data authors, the Soil Spectroscopy project, or relevant funding agencies be liable for any actual, incidental or consequential damages arising from use of the data. By using the Soil Spectroscopy project data, the user expressly acknowledges that the Data may contain some nonconformities, defects, or errors. No warranty is given that the data will meet the user’s needs or expectations or that all nonconformities, defects, or errors can or will be corrected. The user should always verify actual data; therefore the user bears all responsibility in determining whether the data is fit for the user’s intended use.
This document is under construction. If you notice an error or outdated information, please submit a correction / pull request or open an issue.
This is a community project. No profits are being made from building and serving the Open Soil Spectral Library. If you would like to become a sponsor of the project, please contact us via: https://soilspectroscopy.org/contact/.
This website/book and the attached software are free to use and are licensed under the MIT License. The OSSL training data and models, if not otherwise indicated, are available under the Creative Commons Attribution 4.0 International (CC-BY) and/or CC-BY-SA license, or the Open Data Commons Open Database License (ODbL) v1.0.
1.12 Suggested literature
Some other connected publications and initiatives describing collation, import and use of soil spectroscopy data:
- Angelopoulou, T., Balafoutis, A., Zalidis, G., & Bochtis, D. (2020). From laboratory to proximal sensing spectroscopy for soil organic carbon estimation—a review. Sustainability, 12(2), 443. https://doi.org/10.3390/su12020443
- Ayres, E. (2019). Quantitative Guidelines for Establishing and Operating Soil Archives. Soil Science Society of America Journal, 83(4), 973-981. https://doi.org/10.2136/sssaj2019.02.0050
- Benedetti, F. and van Egmond, F. (2021). Global Soil Spectroscopy Assessment. Spectral soil data – Needs and capacities. Rome, FAO. https://doi.org/10.4060/cb6265en
- Dudek, M., Kabała, C., Łabaz, B., Mituła, P., Bednik, M., & Medyńska-Juraszek, A. (2021). Mid-Infrared Spectroscopy Supports Identification of the Origin of Organic Matter in Soils. Land, 10(2), 215. https://doi.org/10.3390/land10020215
- GLOSOLAN’s Standard Operating Procedures (SOPs);
- Nocita, M., Stevens, A., van Wesemael, B., Aitkenhead, M., Bachmann, M., Barthès, B., … & Wetterlind, J. (2015). Soil spectroscopy: An alternative to wet chemistry for soil monitoring. Advances in agronomy, 132, 139-159. https://doi.org/10.1016/bs.agron.2015.02.002
- Sanderman, J., Savage, K., Dangal, S. R., Duran, G., Rivard, C., Cavigelli, M. A., … & Stewart, C. (2021). Can Agricultural Management Induced Changes in Soil Organic Carbon Be Detected Using Mid-Infrared Spectroscopy?. Remote Sensing, 13(12), 2265. https://doi.org/10.3390/rs13122265
- Sanderman, J., Savage, K., & Dangal, S. R. (2020). Mid-infrared spectroscopy for prediction of soil health indicators in the United States. Soil Science Society of America Journal, 84(1), 251–261.
- Wijewardane, N. K., Ge, Y., Wills, S., & Libohova, Z. (2018). Predicting physical and chemical properties of US soils with a mid-infrared reflectance spectral library. Soil Science Society of America Journal, 82(3), 722–731. https://doi.org/10.2136/sssaj2017.10.0361
- Wadoux, A. M. J.-C., Malone, B., McBratney, A. B., Fajardo, M., & Minasny, B. (2021). Soil Spectral Inference with R: Analysing Digital Soil Spectra Using the R Programming Environment. Progress in Soil Science, Springer Nature, ISBN: 9783030648961, 274 pp.
The Open Soil Spectral Library was made possible by the kind contributions of public and private organizations, listed here by date of import:
- USDA-NRCS Kellogg Soil Survey Laboratory mid-infrared (MIR) spectral library (Wijewardane et al. 2018; Sanderman, Savage, and Dangal 2020) was used as the basis for this data set and corresponding services; we are especially grateful to Rich Ferguson & Scarlett Murphy (NRCS USDA) for their help with importing and using the KSSL Soil Spectral Library;
- ICRAF-ISRIC Soil VNIR Spectral Library (Garrity and Bindraban 2004; Aitkenhead and Black 2018) with 785 soil profiles (4,438 samples) selected from the Soil Information System (ISIS) of the International Soil Reference and Information Centre (ISRIC), https://doi.org/10.34725/DVN/MFHA9C;
- AfSIS-I Soil Spectral Library Mid-Infrared Spectra (MIRS) from ICRAF Soil and Plant Spectroscopy Laboratory Africa Soil Information Service (AfSIS) Phase I 2009-2013 (Vagen et al. 2020), a collaborative project funded by the Bill and Melinda Gates Foundation (BMGF). Partners included: CIAT-TSBF, ISRIC, CIESIN, The Earth Institute at Columbia University and World Agroforestry (ICRAF), https://doi.org/10.34725/DVN/QXCWP1;
- AfSIS-II Soil Spectral Library with Mid-Infrared Spectra (MIRS) covering the national Soil Information Systems: TanSIS (Tanzania), NiSIS (Nigeria) and GhanSIS (Ghana), available from https://doi.org/10.34725/DVN/XUDGJY, https://doi.org/10.34725/DVN/WLAKR2 and https://doi.org/10.34725/DVN/SPRSFN and hosted by the ICRAF Soil and Plant Spectroscopy Laboratory;
- LUCAS topsoil (VisNIR) Soil Spectral Library (Orgiazzi et al. 2018) was made available by the European Commission through the European Soil Data Centre managed by the Joint Research Centre (JRC), http://esdac.jrc.ec.europa.eu/; we have degraded location accuracy of points so that exact locations are about 1-km off;
- The Central African Soil Spectral Library, described in detail in Summerauer et al. (2021), contains a limited number of samples representing Central Africa, https://doi.org/10.5281/zenodo.4320395;
- The National Ecological Observatory Network (NEON) Soil Spectral Library is based on the NEON soil data (Ayres 2019), which were scanned by the Woodwell Climate Research Center and the USDA-NRCS Kellogg Soil Survey Laboratory; the NEON Megapit Soil Archive is a program sponsored by the National Science Foundation and operated under cooperative agreement by Battelle;
We are grateful to Wanderson de Sousa Mendes (Leibniz Centre for Agricultural Landscape Research (ZALF)) for help with initial screening of the data for the development of the R code for processing soil spectroscopy data.
For more advanced uses of the soil spectral libraries, we advise contacting the original data producers, especially to get help with using, extending and improving the original SSL data.
We are also grateful to USDA National Institute of Food and Agriculture #2020-67021-32467 for providing funding for this project.