Machine Learning Algorithms for Soil Pollution Source Detection: A Systematic Review
- Posted
- Server: Preprints.org
- DOI: 10.20944/preprints202509.0279.v1
Plants, humans, and microorganisms all depend on uncontaminated soil. Polluted soil poses a serious risk to the environment, especially to agriculture, which sustains human livelihoods, and to human health worldwide. Tracing the source of soil contamination and determining whether contamination is present are central to environmental management and remediation. This systematic review examines the application of machine learning (ML) algorithms to pollution source identification and soil pollution categorization. Guided by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) methodology, the study synthesizes peer-reviewed papers published in the past ten years and indexed in databases such as IEEE Xplore, ScienceDirect, SpringerLink, and Scopus. The review covers prominent ML algorithms, including Support Vector Machines (SVM), Random Forest (RF), Decision Trees (DT), k-Nearest Neighbors (k-NN), Neural Networks (NN), and ensemble methods, and considers their efficacy, accuracy, data requirements, and interpretability. It also identifies the types of input data commonly employed (e.g., geospatial, physicochemical, remote sensing) and the most common feature engineering and model optimization methods. The trends indicate a growing shift towards hybrid and deep learning approaches, despite ongoing issues with model generalizability, data availability, and deployment in field conditions. The review concludes by discussing current research gaps and suggesting future directions for robust, interpretable, and scalable ML-based soil pollution source detection systems.
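To make the idea of ML-based source identification concrete, the following is a minimal sketch of one of the algorithms named above, k-Nearest Neighbors, applied to physicochemical inputs. It is not taken from any reviewed study: the feature set (Pb, Cd, Zn concentrations), the two source labels, all numeric values, and the helper names `standardize`, `scale`, and `knn_predict` are assumptions introduced purely for illustration.

```python
import math
from collections import Counter

# Hypothetical training samples: (Pb, Cd, Zn) concentrations in mg/kg with an
# assumed pollution-source label. All values are invented for illustration.
TRAINING = [
    ((120.0, 2.1, 310.0), "industrial"),
    ((135.0, 2.6, 290.0), "industrial"),
    ((110.0, 1.9, 330.0), "industrial"),
    ((25.0, 0.9, 80.0), "agricultural"),
    ((30.0, 1.1, 95.0), "agricultural"),
    ((22.0, 0.8, 70.0), "agricultural"),
]


def standardize(rows):
    """Return per-feature means and (population) standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Z-score one sample so no single feature dominates the distance."""
    return tuple((x - m) / s for x, m, s in zip(row, means, stds))


def knn_predict(sample, training=TRAINING, k=3):
    """Label a sample by majority vote among its k nearest training samples."""
    means, stds = standardize([features for features, _ in training])
    target = scale(sample, means, stds)
    neighbors = sorted(
        training,
        key=lambda item: math.dist(scale(item[0], means, stds), target),
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]
```

Calling `knn_predict((125.0, 2.3, 300.0))` on this toy data assigns the sample to the "industrial" group, because its three nearest neighbors all carry that label. The z-score scaling step stands in for the feature engineering mentioned above; a real study would need field measurements, validation on held-out sites, and a tuned choice of `k`.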