Write a PREreview

Raman Spectroscopy of Protein-Polysaccharide Conjugates: A Comparative Study of Tree-Based Ensemble Models

Published
Server
Preprints.org
DOI
10.20944/preprints202512.2649.v1

Proteins with additives, especially in small quantities, are of great interest as a subject of study. Machine learning approaches applied to Raman spectroscopy data could provide insight into the chemical structure of such mixtures or conjugates. Although a decision tree model can be powerful in solving either classification or regression tasks and can provide accessible predictions, it is prone to overfitting. Ensemble models that combine several decision trees could overcome this problem. Five different model types are discussed: RandomForest, GradientBoosting, AdaBoost, Voting, and Stacking. Raman spectroscopy data of whey protein isolate (5 wt. %) with different amounts of hyaluronic acid (0, 0.1, 0.25, and 0.5 wt. %) were used as datasets. Optimization established that ensembles of 200 decision trees with a maximum depth of four were optimal. AdaBoostClassifier was found to be the most efficient in finding differences between whey protein isolate and its conjugates with hyaluronic acid: 99.5% accuracy, 100% sensitivity, and 98.0% specificity. Stacking of RandomForest, GradientBoosting, and AdaBoost regressors with a RidgeCV final estimator was the most effective approach in the regression task (R² = 0.963). According to the feature importance plots, the Raman bands that were most influential in predicting the results were 1003 cm⁻¹ (phenylalanine, ring breathing), 1206 cm⁻¹ (C–C stretching), 1240 cm⁻¹ (amide III (β-sheet), N−H in-plane bend, C−N stretch), and 1399 cm⁻¹ (aspartic and glutamic acids, C=O stretch of COO−).
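For readers preparing a review, it may help to recall how the three classification figures quoted in the abstract relate to a confusion matrix. The sketch below is only an illustration: the counts are hypothetical (the abstract does not report the number of spectra or the confusion matrix), the function name `binary_metrics` is invented here, and treating the conjugate spectra as the positive class is an assumption. The counts were chosen so that the resulting percentages match the ones quoted above.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive-class samples classified correctly/incorrectly.
    tn/fp: negative-class samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,        # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),        # true positive rate (recall)
        "specificity": tn / (tn + fp),        # true negative rate
    }


# Hypothetical counts, NOT taken from the preprint:
# 150 conjugate spectra (assumed positive class), all detected,
# and 50 pure-protein spectra, one of which is misclassified.
counts = {"tp": 150, "fn": 0, "tn": 49, "fp": 1}
metrics = binary_metrics(**counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, a single misclassified negative sample is enough to give 98.0% specificity alongside 100% sensitivity, which is one reason a reviewer might ask how many spectra were in each class and how the train/test split was made.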

You can write a PREreview of Raman Spectroscopy of Protein-Polysaccharide Conjugates: A Comparative Study of Tree-Based Ensemble Models. A PREreview is a review of a preprint and can range from a few sentences to an extensive report, similar to a peer-review report organized by a journal.

Before you start

We will ask you to log in with your ORCID iD. If you don't have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from others with the same or a similar name.

Start now