Low Cost Construction of a Multilingual Lexicon from Bilingual Lists



Document title: Low Cost Construction of a Multilingual Lexicon from Bilingual Lists
Journal: Polibits
Database: PERIÓDICA
System number: 000359012
ISSN: 1870-9044
Authors: 1
1
1
Institutions: 1 Multimedia University, Faculty of Information Technology, Cyberjaya, Malaysia
Year:
Period: Jan-Jun
Issue: 43
Country: Mexico
Language: English
Document type: Article
Approach: Analytical, descriptive
English abstract: Manually constructing multilingual translation lexicons can be very costly, both in terms of time and human effort. Although there have been many efforts at (semi-)automatically merging bilingual machine-readable dictionaries to produce a multilingual lexicon, most of these approaches place quite specific requirements on the input bilingual resources. Unfortunately, not all bilingual dictionaries fulfil these criteria, especially in the case of under-resourced language pairs. We describe a low cost method for constructing a multilingual lexicon using only simple lists of bilingual translation mappings. The method is especially suitable for under-resourced language pairs, as such bilingual resources are often freely available and easily obtainable from the Internet, or digitised from simple, conventional paper-based dictionaries. The precision of random samples of the resultant multilingual lexicon is around 0.70–0.82, while coverage for each language, precision and recall can be controlled by varying threshold values. Given the very simple input resources, our results are encouraging, especially in incorporating under-resourced languages into multilingual lexical resources.
Disciplines: Computer science,
Literature and linguistics
Keywords: Computer science,
Literature and linguistics,
Data processing,
Applied linguistics,
Computational linguistics,
Lexical resources,
Multilingual lexicon