Journal: | Computación y sistemas |
Database: | PERIÓDICA |
System number: | 000411055 |
ISSN: | 1405-5546 |
Authors: | Sierra Martínez, Luz Marina (1); Cobos Lozada, Carlos Alberto (2); Corrales, Juan Carlos (1) |
Institutions: | (1) Universidad del Cauca, Grupo de Investigación en Ingeniería Telemática, Popayán, Cauca, Colombia; (2) Universidad del Cauca, Grupo de Investigación en Tecnología de la Información, Popayán, Cauca, Colombia |
Year: | 2016 |
Period: | Jul-Sep |
Volume: | 20 |
Issue: | 3 |
Pages: | 355-364 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Approach: | Experimental, applied |
Abstract (English): | In Colombia, the government regards ethnic and cultural diversity as a social right. This diversity finds expression, among other ways, in a large number of indigenous languages that have been kept alive for centuries. However, efforts to conserve and preserve these languages have generally fallen short. This is the case for Nasa Yuwe, the endangered language spoken by the Nasa, or Páez, indigenous community. In this situation, technology offers a strategic opportunity for the adaptation, ownership, and development of Nasa Yuwe within the social and cultural environment of the Nasa people. The technology includes computational techniques that support the exchange of information through information retrieval (IR) activities, which open new possibilities for the Nasa people to interact in Nasa Yuwe. It is therefore necessary to adapt the stages of the IR process to this language. This paper presents a process for adapting a tokenizer to texts written in Nasa Yuwe, using the precision-recall curve as the evaluation and comparison measure. The results cover the stages of adapting the standard tokenizer to produce the Nasa version, the Nasa tokenizer and its output on texts written in Nasa Yuwe, and a comparison of the baseline precision-recall curve with that of the Nasa tokenizer. |
Disciplines: | Computer science, Literature and linguistics |
Keywords (Spanish): | Procesamiento de datos, Lingüística aplicada, Lingüística computacional, Lenguas indígenas, Análisis de textos, Recuperación de información |
Keywords (English): | Computer science, Literature and linguistics, Data processing, Applied linguistics, Computational linguistics, Indigenous languages, Text analysis, Information retrieval |