Low Cost Construction of a Multilingual Lexicon from Bilingual Lists



Document title: Low Cost Construction of a Multilingual Lexicon from Bilingual Lists
Journal: Polibits
Database: PERIÓDICA
System number: 000359012
ISSN: 1870-9044
Authors: 1
1
1
Institutions: 1Multimedia University, Faculty of Information Technology, Cyberjaya, Malaysia
Year:
Season: Jan-Jun
Number: 43
Country: Mexico
Language: English
Document type: Article
Approach: Analytical, descriptive
English abstract: Manually constructing multilingual translation lexicons can be very costly, both in terms of time and human effort. Although there have been many efforts at (semi-)automatically merging bilingual machine-readable dictionaries to produce a multilingual lexicon, most of these approaches place quite specific requirements on the input bilingual resources. Unfortunately, not all bilingual dictionaries fulfil these criteria, especially in the case of under-resourced language pairs. We describe a low cost method for constructing a multilingual lexicon using only simple lists of bilingual translation mappings. The method is especially suitable for under-resourced language pairs, as such bilingual resources are often freely available and easily obtainable from the Internet, or digitised from simple, conventional paper-based dictionaries. The precision of random samples of the resultant multilingual lexicon is around 0.70–0.82, while coverage for each language, precision and recall can be controlled by varying threshold values. Given the very simple input resources, our results are encouraging, especially in incorporating under-resourced languages into multilingual lexical resources.
Disciplines: Computer science,
Literature and linguistics
Keyword: Procesamiento de datos,
Lingüística aplicada,
Lingüística computacional,
Recursos léxicos,
Léxico multilingüe
Keyword: Computer science,
Literature and linguistics,
Data processing,
Applied linguistics,
Computational linguistics,
Lexical resources,
Multilingual lexicon