Journal: | Computación y sistemas |
Database: | |
System number: | 000560487 |
ISSN: | 1405-5546 |
Authors: | Huetle Figueroa, Juan (1); Pérez Téllez, Fernando (1); Pinto, David (2) |
Institutions: | (1) Institute of Technology Tallaght, Dublin, Ireland; (2) Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Puebla, Mexico |
Year: | 2020 |
Season: | Apr-Jun |
Volume: | 24 |
Number: | 2 |
Pages: | 651-668 |
Country: | Mexico |
Language: | English |
Document type: | Article |
English abstract: | Key terminology is important in scientific work, especially in the field of Natural Language Processing. However, there is no optimal way to extract all key terminology reliably, so it is important to develop automatic methods for extracting key terms. This paper presents a way to obtain key terminology based on labels assigned manually by a domain expert. We obtained POS (part-of-speech) tags for each label and derived from them patterns of key terminology, which were later used as filters. Experiment 1 compared the manually obtained labels with the labels obtained by the proposed approach, using 60% of the corpus for training and 40% for testing. The patterns were evaluated with three measures: precision, recall, and F-measure. Experiment 2 used three measures for ranking N-grams (sequences of terms): pointwise mutual information, likelihood ratio, and chi-square. To obtain the best N-grams, Experiment 3 applied intersections between these measures together with filtering of N-grams by POS patterns. The results were compared with the manually labeled set using the evaluation measures, giving good recall and acceptable precision and F-measure. In Experiment 4, the POS patterns were tested on a much larger corpus from a different domain, obtaining slightly higher results. |
Disciplines: | Computer science |
Keyword: | Artificial intelligence |
Keyword: | Collocations, N-grams, POS, Keyword extraction, Artificial intelligence |
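
The pipeline summarized in the abstract (rank N-grams with an association measure, filter them by POS patterns, then score the result against a manually labeled set) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, its hand-assigned tags, the two POS patterns, and the gold set are all hypothetical, and only pointwise mutual information is shown, so the likelihood-ratio and chi-square measures and the intersection step are left out.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (token, POS tag) pairs, tagged by hand for illustration.
TAGGED = [
    ("natural", "JJ"), ("language", "NN"), ("processing", "NN"),
    ("uses", "VBZ"), ("natural", "JJ"), ("language", "NN"),
    ("models", "NNS"), ("and", "CC"), ("the", "DT"), ("language", "NN"),
    ("processing", "NN"), ("pipeline", "NN"),
]

# Hypothetical POS patterns, standing in for patterns derived from expert labels.
PATTERNS = {("JJ", "NN"), ("NN", "NN")}

# Hypothetical manually labeled key terms (bigrams only).
GOLD = {("natural", "language"), ("language", "processing"), ("language", "models")}


def pmi_scores(tagged):
    """Pointwise mutual information, log2(p(x, y) / (p(x) * p(y))), per bigram."""
    words = [word for word, _ in tagged]
    unigram = Counter(words)
    pairs = Counter(zip(words, words[1:]))
    n_uni, n_bi = len(words), len(words) - 1
    return {
        (x, y): math.log2((count / n_bi) / ((unigram[x] / n_uni) * (unigram[y] / n_uni)))
        for (x, y), count in pairs.items()
    }


def filter_by_pos(tagged, patterns):
    """Keep the bigrams whose tag sequence matches one of the allowed patterns."""
    return {
        (w1, w2)
        for (w1, t1), (w2, t2) in zip(tagged, tagged[1:])
        if (t1, t2) in patterns
    }


def evaluate(predicted, gold):
    """Return precision, recall, and F-measure of predicted terms against gold."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


scores = pmi_scores(TAGGED)
candidates = filter_by_pos(TAGGED, PATTERNS)
ranked = sorted(candidates, key=lambda pair: scores[pair], reverse=True)
precision, recall, f_measure = evaluate(candidates, GOLD)

print(ranked)
print(precision, recall, f_measure)
```

In this sketch the POS filter is applied before ranking, so the association score only orders candidates that already match a pattern. PMI tends to favor rare pairs, which is one plausible reason to combine it with other measures, as the abstract describes.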