Resource information
The aim of this study was to develop partial least squares (PLS) models that predict, from diffuse reflectance Fourier transform mid-infrared (MIR; 4000–500 cm⁻¹) spectra, the concentrations of 45 elements extracted from soils by the aqua regia (AR) method. A total of 4130 soils from the GEMAS European soil sampling program (geochemical mapping of agricultural soils and grazing land of Europe) were selected. From the full soil set, 1000 samples were randomly selected to develop the PLS models. Cross-validation was used for model training, and the remaining 3130 samples were used for model testing. According to the ratio of standard deviation to root mean square error (RPD) of the predictions, the elements were allocated to two main groups. Group 1 (successful calibrations, 30 elements) comprised the elements with RPD ≥ 1.5 (values in parentheses are the RPD followed by the coefficient of determination, R²): Ca (3.3, 0.91), Mg (2.5, 0.84), Al (2.4, 0.83), Fe (2.2, 0.79), Ga (2.1, 0.78), Co (2.1, 0.77), Ni (2.0, 0.77), Sc (2.1, 0.76), Ti (2.0, 0.75), Li (1.9, 0.73), Sr (1.9, 0.72), K (1.8, 0.70), Cr (1.8, 0.70), Th (1.8, 0.69), Be (1.7, 0.66), S (1.7, 0.66), B (1.6, 0.63), Rb (1.6, 0.62), V (1.6, 0.62), Y (1.6, 0.61), Zn (1.6, 0.60), Zr (1.6, 0.59), Nb (1.5, 0.58), Ce (1.5, 0.58), Cs (1.5, 0.58), Na (1.5, 0.57), In (1.5, 0.57), Bi (1.5, 0.56), Cu (1.5, 0.55), and Mn (1.5, 0.54). Group 2 comprised the 15 elements with RPD values lower than 1.5: As (1.4, 0.52), Ba (1.4, 0.52), La (1.4, 0.52), Tl (1.4, 0.51), P (1.4, 0.46), U (1.4, 0.45), Sb (1.3, 0.46), Mo (1.3, 0.43), Pb (1.3, 0.42), Se (1.3, 0.40), Cd (1.3, 0.40), Sn (1.3, 0.38), Hg (1.2, 0.33), Ag (1.2, 0.32), and W (1.1, 0.19). The success of the PLS models was found to depend on how strongly each element was related, directly or indirectly, to MIR-active soil components.
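The grouping criterion described above can be illustrated with a minimal sketch. It computes the RPD as the standard deviation of the reference values divided by the root mean square error of the predictions, and assigns an element to Group 1 or Group 2 using the 1.5 threshold. The function names, the use of the sample standard deviation, the R² formula (one minus the ratio of residual to total sum of squares), and the toy data are assumptions made for illustration; the study may have used different conventions.

```python
import math
import statistics

RPD_THRESHOLD = 1.5  # Group 1 if RPD >= 1.5, as stated in the abstract


def rmse(observed, predicted):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Ratio of standard deviation to RMSE.

    Assumption: the sample standard deviation (n - 1 denominator) of the
    reference values is used.
    """
    return statistics.stdev(observed) / rmse(observed, predicted)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def assign_group(rpd_value, threshold=RPD_THRESHOLD):
    """Return 1 for a successful calibration (RPD >= threshold), else 2."""
    return 1 if rpd_value >= threshold else 2


if __name__ == "__main__":
    # Toy values for illustration only; these are not GEMAS data.
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.5, 2.0, 3.0, 4.0, 4.5]
    value = rpd(reference, predicted)
    print(f"RPD = {value:.2f}, R2 = {r_squared(reference, predicted):.2f}, "
          f"group = {assign_group(value)}")
```

Applied to the reported values, `assign_group(3.3)` places Ca in Group 1 and `assign_group(1.4)` places As in Group 2.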