Breast, Lung and Liver Cancer Classification from Structured and Unstructured Data



Document title: Breast, Lung and Liver Cancer Classification from Structured and Unstructured Data
Journal: Computación y sistemas
Database:
System number: 000560644
ISSN: 1405-5546
Authors: 1
1
1
Institutions: 1 Universidad Autónoma Metropolitana, Departamento de Sistemas, Azcapotzalco, Ciudad de México, México
Year:
Period: Jan-Mar
Volume: 26
Issue: 1
Pages: 233-243
Country: Mexico
Language: English
Document type: Article
Abstract (English): Cancer is a worldwide public health problem. Machine learning and deep learning techniques hold great promise in healthcare because they can analyze Electronic Health Records (EHR), which contain large collections of structured and unstructured data. Most research, however, has relied on structured data only, even though valuable information is also found in physicians' plain-text notes. This paper therefore proposes an approach to classify breast, liver, and lung cancer from structured and unstructured data obtained from the MIMIC-II clinical database, using machine learning and deep learning techniques. In particular, the Paragraph Vector algorithm is used as a deep learning approach to text representation. The goal of this work is to help physicians diagnose cancer early. The proposed approach was tested on a balanced dataset of breast, liver, and lung cancer patient records. Both the structured and the unstructured data are pre-processed, and the result is used as the input variables for three machine learning models: Support Vector Machines (SVM), Multilayer Perceptron (MLP), and AdaBoost-SAMME. Scoring metrics for these models are then calculated under different training data configurations in order to choose the best-performing model for classification. Results show that the best-performing model was the MLP, which achieved 89% precision using unstructured data.
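The model-selection step described in the abstract (score each classifier, then keep the best one) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the record does not say how precision was averaged across the three classes, so macro-averaging is assumed here, and the labels and predictions are invented toy values, not data or results from the paper. The function names `per_class_precision`, `macro_precision`, and `best_model` are hypothetical.

```python
from collections import Counter

CLASSES = ("breast", "liver", "lung")


def per_class_precision(y_true, y_pred, classes=CLASSES):
    """Return {class: TP / (TP + FP)}; a class that is never predicted scores 0.0."""
    true_pos = Counter()
    predicted = Counter()
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_pos[guess] += 1
    return {
        c: (true_pos[c] / predicted[c] if predicted[c] else 0.0)
        for c in classes
    }


def macro_precision(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of the per-class precisions (assumed averaging scheme)."""
    scores = per_class_precision(y_true, y_pred, classes)
    return sum(scores.values()) / len(classes)


def best_model(y_true, predictions_by_model):
    """Return the name of the model whose predictions have the highest macro precision."""
    return max(
        predictions_by_model,
        key=lambda name: macro_precision(y_true, predictions_by_model[name]),
    )


# Toy, balanced labels and invented predictions (NOT taken from the paper).
y_true = ["breast", "breast", "liver", "liver", "lung", "lung"]
toy_predictions = {
    "SVM": ["breast", "liver", "liver", "liver", "lung", "breast"],
    "MLP": ["breast", "breast", "liver", "lung", "lung", "lung"],
    "AdaBoost-SAMME": ["breast", "lung", "lung", "liver", "lung", "lung"],
}

if __name__ == "__main__":
    for name, preds in toy_predictions.items():
        print(name, round(macro_precision(y_true, preds), 3))
    print("selected:", best_model(y_true, toy_predictions))
```

In practice the predictions would come from trained classifiers evaluated on held-out patient records, and the comparison would be repeated for each training data configuration (structured only, unstructured only, or both).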
Disciplines: Computer science
Keywords: Artificial intelligence