Journal: | Computación y sistemas |
Database: | |
System number: | 000560487 |
ISSN: | 1405-5546 |
Authors: | Huetle Figueroa, Juan (1); Pérez Téllez, Fernando (1); Pinto, David (2) |
Institutions: | (1) Institute of Technology Tallaght, Dublin, Ireland; (2) Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Puebla, Mexico |
Year: | 2020 |
Period: | Apr-Jun |
Volume: | 24 |
Issue: | 2 |
Pages: | 651-668 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Abstract (English): | Key terminology is very important for scientific work, especially in the field of Natural Language Processing. However, there is no optimal way to extract all key terminology reliably, so it is important to develop automatic methods for extracting key terms. This paper presents a way to obtain key terminology based on labels that were assigned manually by an expert in the area. We then obtained POS (part-of-speech) tags for each label and derived from them patterns of key terminology, which were subsequently used as filters. Experiment 1 compared the manually obtained labels with the labels obtained by the proposed approach, using 60% of the corpus for training and 40% for testing. The patterns were evaluated with three measures: precision, recall, and F-measure. Experiment 2 used three measures for ranking N-grams (sequences of terms): pointwise mutual information, likelihood ratio, and chi-square. To obtain the best N-grams, experiment 3 combined intersections of the previous measures with filtering of N-grams by POS patterns. The results were compared with the manually labeled set using the evaluation measures, giving good recall and acceptable precision and F-measure. In experiment 4, the POS patterns were tested on a much larger corpus from a different domain, obtaining slightly higher results. |
Disciplines: | Computer science |
Keywords (translated from Spanish): | Artificial intelligence |
Keywords (English): | Collocations, N-grams, POS, Keyword extraction, Artificial intelligence |
Full text: | Full text (View HTML); Full text (View PDF) |
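
The abstract describes filtering N-grams by POS patterns, ranking them with association measures such as pointwise mutual information, and scoring the result with precision, recall, and F-measure. The following is a minimal sketch of that general idea for bigrams only, not the authors' implementation. The toy tagged corpus, the two POS patterns, the gold set, and all function names are assumptions made for illustration; the record does not give the paper's actual patterns or data, and only PMI is shown out of the three ranking measures.

```python
import math
from collections import Counter

# Toy hand-tagged corpus (hypothetical data, not taken from the paper).
TAGGED = [
    ("natural", "JJ"), ("language", "NN"), ("processing", "NN"),
    ("is", "VBZ"), ("a", "DT"), ("field", "NN"), ("of", "IN"),
    ("computer", "NN"), ("science", "NN"), ("and", "CC"),
    ("natural", "JJ"), ("language", "NN"), ("processing", "NN"),
    ("uses", "VBZ"), ("computer", "NN"), ("science", "NN"),
]

# Hypothetical POS patterns assumed to describe key terms.
PATTERNS = {("JJ", "NN"), ("NN", "NN")}


def pmi_scores(tagged):
    """Return pointwise mutual information (log base 2) for each bigram."""
    words = [w for w, _ in tagged]
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    n_uni = len(words)
    n_bi = len(words) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def filter_by_pos(tagged, patterns):
    """Keep only bigrams whose tag sequence matches one of the patterns."""
    kept = set()
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if (t1, t2) in patterns:
            kept.add((w1, w2))
    return kept


def precision_recall_f1(predicted, gold):
    """Compare a predicted set of terms against a manually labeled set."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    candidates = filter_by_pos(TAGGED, PATTERNS)
    scores = pmi_scores(TAGGED)
    # Rank the POS-filtered candidates by PMI, highest first.
    ranked = sorted(candidates, key=lambda bg: scores[bg], reverse=True)
    # Hypothetical gold labels standing in for the expert's manual labels.
    gold = {("natural", "language"), ("computer", "science"), ("key", "term")}
    print(ranked)
    print(precision_recall_f1(candidates, gold))
```

In practice a library such as NLTK provides association measures for collocations, including likelihood ratio and chi-square; the hand-written PMI here is only meant to keep the sketch self-contained.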