Comparative Analysis of Machine Learning Algorithms for Kanji Character Recognition Using HOG Features
- Published
- Server: Preprints.org
- DOI: 10.20944/preprints202507.2038.v1
Japanese Kanji characters are difficult to recognize automatically because of their visual complexity and the large number of classes. This study compares four classical machine learning algorithms, namely Decision Tree, Random Forest, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM), on Kanji character classification, using Histogram of Oriented Gradients (HOG) as the common feature extraction technique for all models. Experiments were conducted on a curated subset of the ETL9G dataset consisting of 30 randomly selected Kanji characters, with a total of 5,700 grayscale images. Each model was trained and evaluated with K-fold cross-validation and assessed on accuracy, precision, recall, F1-score, and regression error metrics (R², MSE, MAE, RMSE). The SVM with a linear kernel performed best, with an accuracy of 97.43%, high stability across folds, and the lowest prediction error. Although KNN had the fastest training time, SVM gave more reliable and consistent predictions. These findings indicate that, despite the increasing popularity of deep learning approaches, classical algorithms such as SVM remain highly competitive when combined with effective feature representations such as HOG. Future research could explore hybrid approaches that combine classical models with deep learning-based feature extractors to improve scalability and generalization on more complex datasets.
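To make the pipeline described in the abstract concrete, the following is a minimal NumPy-only sketch of HOG-style feature extraction followed by K-fold evaluation of a KNN classifier. It is not the authors' implementation. The HOG here is simplified (per-cell orientation histograms with one global L2 normalization, no overlapping blocks), KNN stands in for the four compared classifiers, and synthetic stripe images stand in for the ETL9G subset. All function names and parameter values (`cell=8`, `bins=9`, `k=3`, `n_folds=5`, 32×32 images) are assumptions chosen for illustration.

```python
import numpy as np


def hog_features(image, cell=8, bins=9):
    """Simplified HOG: magnitude-weighted histograms of unsigned gradient
    orientation per cell, then one global L2 normalization (assumption:
    no overlapping block normalization as in full HOG)."""
    image = image.astype(np.float64)
    gy, gx = np.gradient(image)                      # row and column gradients
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    # Clamp guards against a rounding result of exactly 180.0.
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    vec = hist.ravel()
    return vec / (np.linalg.norm(vec) + 1e-12)


def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    dist = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])


def k_fold_accuracy(x, y, n_folds=5, k=3, seed=0):
    """Shuffle once, split into n_folds parts, hold each part out in turn."""
    order = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(order, n_folds)
    scores = []
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        pred = knn_predict(x[train_idx], y[train_idx], x[test_idx], k)
        scores.append(float(np.mean(pred == y[test_idx])))
    return scores


def make_stripes(kind, rng, size=32, period=8, noise=0.1):
    """Hypothetical stand-in data: horizontal, vertical, or diagonal stripes."""
    rows, cols = np.mgrid[0:size, 0:size]
    phase = {0: rows, 1: cols, 2: rows + cols}[int(kind)]
    clean = np.sin(2 * np.pi * phase / period)
    return clean + noise * rng.standard_normal((size, size))


rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)                 # 3 classes, 20 images each
images = [make_stripes(kind, rng) for kind in labels]
features = np.stack([hog_features(img) for img in images])
scores = k_fold_accuracy(features, labels)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

The same feature matrix could be passed to any of the other classifiers instead of `knn_predict`. In practice a library implementation would likely be preferable to this sketch, for example scikit-learn's `SVC(kernel="linear")` for the linear-kernel SVM that the study reports as the strongest model.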