Lung Disease Detection Using Multimodal Machine Learning with Structured Clinical Data and Medical Images

Server: Preprints.org
DOI: 10.20944/preprints202603.2177.v1

Lung disease is a major global health challenge that causes millions of deaths annually. Early diagnosis is crucial for effective treatment, preventing mortality, and reducing long-term morbidity. Most existing diagnostic research relies on unimodal medical image data, which often provides limited information. To incorporate additional clinical information into the diagnosis, multimodal strategies are increasingly being explored. Alongside physical examination, medical images and clinical data are the key information physicians use to diagnose lung disease. In this work, we propose a comprehensive multimodal machine learning framework for lung disease detection that integrates structured clinical data with medical imaging modalities, specifically chest X-rays and computed tomography (CT) scans. The methodology includes robust data preprocessing, feature extraction using VGG16 for images and multiple techniques (mutual information, principal component analysis, and random forest) for clinical data, followed by fusion and classification using both classical machine learning and deep learning models. We introduce and evaluate a newly collected lung disease dataset comprising over 27,635 records that combine imaging and clinical data from Ethiopian hospitals. Experiments show that unimodal chest X-ray image-based detection achieves 95.28% accuracy, while multimodal detection based on chest X-rays and clinical data achieves 98.88% accuracy. Similar results are obtained for the computed tomography experiments, with 97.62% accuracy for unimodal and 98.91% for multimodal detection. This study demonstrates the critical importance of multimodal data fusion in developing more accurate and clinically viable diagnostic systems for lung diseases.
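The fusion step described in the abstract can be illustrated with a small sketch. It rests on several assumptions that the abstract does not confirm: fusion is taken to be feature-level concatenation, short toy vectors stand in for VGG16 image embeddings and for the selected clinical features, and a nearest-centroid rule stands in for the classical and deep learning classifiers. All function names (`zscore_columns`, `fuse`, `fit_centroids`, `predict`, `accuracy`) are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def fuse(image_feats, clinical_feats):
    """Assumed fusion scheme: concatenate the two per-patient feature vectors."""
    return [list(img) + list(clin) for img, clin in zip(image_feats, clinical_feats)]


def fit_centroids(X, y):
    """Stand-in classifier: store one mean vector per class label."""
    centroids = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, X):
    """Assign each sample to the class with the closest centroid."""
    return [min(centroids, key=lambda c: math.dist(x, centroids[c])) for x in X]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Toy data only: two "image" features and one "clinical" feature per patient.
    image_feats = zscore_columns([[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]])
    clinical_feats = zscore_columns([[37.0], [36.8], [39.5], [39.2]])
    labels = [0, 0, 1, 1]

    fused = fuse(image_feats, clinical_feats)
    model = fit_centroids(fused, labels)
    print("fused vector length:", len(fused[0]))
    print("training accuracy:", accuracy(labels, predict(model, fused)))
```

Standardizing each modality before concatenation keeps features on very different scales from dominating the distance computation. In a real pipeline the toy vectors would be replaced by actual image embeddings and selected clinical features, and evaluation would use held-out data rather than the training set.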
