A gradient descent boosting spectrum modeling method based on back interval partial least squares

Ren, D.; Qu, F.; Lv, K.; Zhang, Z.; Xu, Honglei; Wang, X.

doi:10.1016/j.neucom.2015.07.109

Access Status

Fulltext not available

Date

2015

Type

Journal Article

Citation

Ren, D., Qu, F., Lv, K., Zhang, Z., Xu, H. and Wang, X. 2015. A gradient descent boosting spectrum modeling method based on back interval partial least squares. Neurocomputing. 117: pp. 1038-1046.

Source Title

Neurocomputing

DOI

10.1016/j.neucom.2015.07.109

ISSN

0925-2312

Faculty

Faculty of Science and Engineering

School

Department of Mathematics and Statistics

URI

http://hdl.handle.net/20.500.11937/7025

Collection

Curtin Research Publications

Abstract

When boosting regression is applied to near-infrared spectroscopy, the full spectrum of the samples is generally used to perform partial least squares (PLS) modeling. However, the full spectrum contains a large amount of redundant information and noise. This not only increases the complexity of the model, but also reduces its predictive performance. In addition, the boosting method is sensitive to data noise. When the data are mixed with too much noise, the generalization performance of boosting decreases, and the prediction error and the variance of PLS become relatively large. To solve these problems, a gradient descent boosting ensemble method combined with backward interval PLS (GD-Boosting-BiPLS) is proposed in this paper. BiPLS is used to select the effective variables for the boosting base model, and each base model is trained sequentially by resampling. The spectral segmentation parameter of BiPLS and the iteration parameter of boosting are fused, and the weight of each base model is assigned by the gradient descent strategy. This leads to a new ensemble model (forward additive model) in the direction of reduced residuals. The final model is the ensemble model that obtains the minimum root mean square error of prediction (RMSEP). The proposed method is applied to the quantitative prediction of ethanol concentrations. Over iterations 1–50, the average correlation coefficients of the calibration and validation sets are 0.9628 and 0.9388, and the average RMSE of cross-validation and RMSEP are 0.0732 and 0.0675, respectively. The overall performance of the proposed GD-Boosting-BiPLS method is compared with those of various ensemble strategies and four state-of-the-art spectral modeling methods. The experimental results reveal that the proposed method has the best generalization performance and stability.
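The forward additive scheme outlined in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it (`boost`, `fit_base_model`, `synthetic_spectra`, and so on) is hypothetical. It makes several simplifying assumptions: the base learner is a one-variable least-squares fit on the mean absorbance of a single interval rather than PLS; interval selection is a greedy pick of the single best interval rather than backward interval elimination; and each base model's weight comes from an exact line search on the squared loss, which is only one possible reading of the "gradient descent strategy". The spectra are synthetic.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def interval_feature(spectrum, interval):
    """Mean absorbance over one spectral interval given as (start, stop)."""
    start, stop = interval
    return sum(spectrum[start:stop]) / (stop - start)


def fit_line(x, y):
    """Least-squares fit y ~ a + b*x (assumed stand-in for a PLS base model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def predict_base(model, X):
    """Predictions of one base model (interval, intercept, slope)."""
    interval, a, b = model
    return [a + b * interval_feature(row, interval) for row in X]


def fit_base_model(X, residuals, intervals, rng):
    """Fit on a bootstrap resample; keep the interval that best fits the residuals."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    best = None
    for interval in intervals:
        feats = [interval_feature(X[i], interval) for i in idx]
        a, b = fit_line(feats, [residuals[i] for i in idx])
        err = rmse(residuals, predict_base((interval, a, b), X))
        if best is None or err < best[0]:
            best = (err, (interval, a, b))
    return best[1]


def boost(X_cal, y_cal, X_val, y_val, n_intervals=4, n_iter=20, seed=0):
    """Return (best ensemble size, validation RMSE after each stage)."""
    rng = random.Random(seed)
    width = len(X_cal[0]) // n_intervals  # trailing points are ignored if uneven
    intervals = [(k * width, (k + 1) * width) for k in range(n_intervals)]
    base = sum(y_cal) / len(y_cal)  # stage 0: constant model
    f_cal, f_val = [base] * len(y_cal), [base] * len(y_val)
    history = [rmse(y_val, f_val)]
    for _ in range(n_iter):
        residuals = [t - f for t, f in zip(y_cal, f_cal)]
        model = fit_base_model(X_cal, residuals, intervals, rng)
        h_cal = predict_base(model, X_cal)
        hh = sum(h * h for h in h_cal)
        # Exact line search for squared loss: weight = <r, h> / <h, h>.
        weight = sum(r * h for r, h in zip(residuals, h_cal)) / hh if hh > 0 else 0.0
        f_cal = [f + weight * h for f, h in zip(f_cal, h_cal)]
        f_val = [f + weight * h for f, h in zip(f_val, predict_base(model, X_val))]
        history.append(rmse(y_val, f_val))
    # Keep the ensemble size with the lowest validation error (RMSEP).
    best_size = min(range(len(history)), key=history.__getitem__)
    return best_size, history


def synthetic_spectra(n, n_points, rng):
    """Toy data: one Gaussian band scales with concentration, plus noise."""
    X, y = [], []
    for _ in range(n):
        c = rng.random()
        X.append([c * math.exp(-((j - 15) / 4.0) ** 2) + rng.gauss(0, 0.02)
                  for j in range(n_points)])
        y.append(c)
    return X, y
```

Calling `boost` on a calibration set and a validation set produced by `synthetic_spectra` returns the number of base models at which the validation RMSE is lowest, mirroring the abstract's rule of keeping the ensemble with the minimum RMSEP. With the line-search weight, the calibration residual norm cannot increase from one stage to the next, which is the sense in which the ensemble moves "in the direction of reduced residuals" in this sketch.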