August 31, 2023
Journal Article

Practical Guide to Chemometric Analysis of Optical Spectroscopic Data

Abstract

The methodology and mathematical treatment of several classic multivariate methods for the analysis of spectroscopic data are demonstrated in a straightforward way that can serve as a basis for teaching an undergraduate introductory course on chemometric analysis. The multivariate techniques of classical least squares (CLS), principal component regression (PCR), and partial least squares (PLS), as well as the univariate Beer’s law method, are described and compared, building students’ understanding by starting with the univariate method and progressing step by step to the multivariate methods. Equations for producing regression vectors from training set spectral data are described, and their use is demonstrated for predicting constituent concentrations on a separate validation set of spectra. Extreme care is taken to ensure consistency in the variable formatting of data matrices, which provides a key foundation for understanding how spectral data are manipulated by these different mathematical approaches to building quantitative regression models. Each method is applied to a real-world data set, and the results are discussed to show students the types of information that can be gleaned from each method. A training set comprising 20 infrared absorbance spectra containing 3 constituents (benzene, polystyrene, and gasoline) of known composition is used to demonstrate the matrix operations for each regression method. A separate set of 12 real-world napalm samples (containing benzene, polystyrene, and gasoline) is used as a validation set to demonstrate the ability to apply the regression models to an unknown dataset. A toolbox (PNNL Chemometric Toolbox) written in MATLAB is supplied in the Supplemental Information file and can be used as a companion for understanding the development and deployment of the chemometric algorithms described in this paper. The datasets of the infrared spectra are also supplied, allowing users to build and inspect the chemometric models on their own. Finally, the Toolbox includes scripts to assist users in loading their own datasets into MATLAB and performing CLS, PCR, and PLS on their data.
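
As a rough illustration of the univariate starting point named in the abstract, the following minimal Python sketch fits Beer's law (A = k * c) at a single band by least squares through the origin, then inverts the fit to estimate the concentration of a new sample. It is not taken from the PNNL Chemometric Toolbox, which is written in MATLAB. The data values and the function names `fit_beers_law` and `predict_concentration` are hypothetical, chosen only for illustration.

```python
# Synthetic single-band calibration data (hypothetical, not from the paper).
concentrations = [0.1, 0.2, 0.3, 0.4, 0.5]          # known training concentrations
absorbances = [0.052, 0.098, 0.151, 0.203, 0.247]   # absorbance at one band


def fit_beers_law(conc, absorb):
    """Return the least-squares slope k for A = k * c (line through the origin)."""
    numerator = sum(c * a for c, a in zip(conc, absorb))
    denominator = sum(c * c for c in conc)
    return numerator / denominator


def predict_concentration(k, absorbance):
    """Invert Beer's law to estimate a concentration from a new absorbance."""
    return absorbance / k


# Calibrate on the training data, then predict for one new measurement.
k = fit_beers_law(concentrations, absorbances)
unknown_absorbance = 0.175  # hypothetical validation measurement
estimate = predict_concentration(k, unknown_absorbance)
print(f"slope k = {k:.4f}, estimated concentration = {estimate:.4f}")
```

The multivariate methods follow the same calibrate-then-predict pattern, with the single slope replaced by a regression vector computed from full spectra.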

Published: August 31, 2023

Citation

Lackey H.E., R.L. Sell, G.L. Nelson, T.A. Bryan, A.M. Lines, and S.A. Bryan. 2023. Practical Guide to Chemometric Analysis of Optical Spectroscopic Data. Journal of Chemical Education 100, no. 7:2608-2626. PNNL-SA-178491. doi:10.1021/acs.jchemed.2c01112

Research topics