Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources



Document title: Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources
Journal: Polibits
Database: PERIÓDICA
System number: 000359002
ISSN: 1870-9044
Authors: 1
1
Institutions: 1 GlobSec, European Commission, Joint Research Centre, Ispra, Varese, Italy
Year:
Season: Jan-Jun
Number: 43
Country: Mexico
Language: English
Document type: Article
Approach: Analytical, descriptive
English abstract: The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on its parallel training data, and phrases that are not present in that data are not translated correctly. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each corresponds to a morphological variation of the source phrase, the target phrase, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translation, and the results showed improved performance in terms of automatic scores (BLEU and Meteor) and a reduction in out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model.
Disciplines: Ciencias de la computación,
Literatura y lingüística
Keyword: Procesamiento de datos,
Lingüística aplicada,
Lingüística computacional,
Sistemas de traducción,
Aprendizaje de máquinas,
Morfosintaxis
Keyword: Computer science,
Literature and linguistics,
Data processing,
Applied linguistics,
Computational linguistics,
Translation systems,
Machine learning,
Morphosyntax