Journal: | Computación y Sistemas |
Database: | PERIÓDICA |
System number: | 000423275 |
ISSN: | 1405-5546 |
Authors: | Rojas Simón, Jonathan (1); Ledeneva, Yulia (1); García Hernández, René Arnulfo (1) |
Institutions: | (1) Universidad Autónoma del Estado de México, Unidad Académica Profesional Tianguistenco, Santiago Tianguistenco, Estado de México, México |
Year: | 2018 |
Period: | Jan-Mar |
Volume: | 22 |
Issue: | 1 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Approach: | Applied, descriptive |
English abstract: | In recent years, several Multi-Document Summarization (MDS) methods have been presented at the Document Understanding Conference (DUC) workshops. Since DUC01, approximately 268 state-of-the-art publications have presented methods that have enabled the continuous improvement of MDS; however, in most of these works the upper bounds were unknown. Recently, some works have focused on calculating the best sentence combinations for a set of documents, and in previous work we calculated the significance for the single-document summarization task on the DUC01 and DUC02 datasets. However, no significance analysis has been performed for the MDS task to rank the best multi-document summarization methods. In this paper, we describe a Genetic Algorithm-based method for calculating the best sentence combinations of the DUC01 and DUC02 datasets in MDS through a Meta-document representation. Moreover, we have calculated three heuristics mentioned in several state-of-the-art works in order to rank the most recent MDS methods through the calculation of upper and lower bounds |
Disciplines: | Library and information science, Computer science |
Keywords (Spanish): | Análisis y sistematización de la información, Procesamiento de lenguaje natural, Encabezamientos, Integración de documentos, Algoritmos genéticos, Significancia |
Keyword: | Information analysis, Natural language processing, Toplines, Document summarization, Genetic algorithms, Significance |