Morpheme based Language Model for Tamil Part-of-Speech Tagging



Document title: Morpheme based Language Model for Tamil Part-of-Speech Tagging
Journal: Polibits
Database: PERIÓDICA
System number: 000368517
ISSN: 1870-9044
Authors: 1
1
Institutions: 1 Anna University, Department of Computer Science and Engineering, Chennai, Tamil Nadu, India
Year:
Period: Jul-Dec
Issue: 38
Country: Mexico
Language: English
Document type: Article
Approach: Experimental, applied
Abstract: The paper describes Tamil part-of-speech (POS) tagging using a corpus-based approach, formulating a language model from the morpheme components of words. Rule-based tagging, Markov model taggers, Hidden Markov Model taggers, and transformation-based learning taggers are some of the methods available for part-of-speech tagging. In this paper, we present a language model that uses the stem type, the last morpheme, and the morpheme preceding the last morpheme of a word to categorize its part of speech. To estimate the contribution factors of the model, we follow the generalized iterative scaling technique. The presented model has an overall F-measure of 96%.
Disciplines: Computer science
Keywords: Computer science,
Data processing,
Computational linguistics,
Markov processes,
Language,
Morpheme
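The abstract names three cues (stem type, last morpheme, and the morpheme before it) and generalized iterative scaling (GIS) for estimating their contribution factors. The following is a minimal sketch of how such a model might be trained, not the authors' implementation. The tag set, the morpheme labels, the toy data, and the function names are all hypothetical placeholders. The sketch also assumes a conditional log-linear form and the common simplification of omitting the GIS correction feature, so features never observed with a tag keep a weight of zero.

```python
import math
from collections import defaultdict

# Hypothetical two-tag inventory; a real tagger would use a full tagset.
TAGS = ["NOUN", "VERB"]


def features(stem_type, last_morph, prev_morph, tag):
    """One binary feature per cue, each paired with the candidate tag."""
    return [
        ("stem", stem_type, tag),
        ("last", last_morph, tag),
        ("prev", prev_morph, tag),
    ]


def tag_probs(weights, context):
    """Conditional distribution over tags for one word context."""
    scores = {
        t: math.exp(sum(weights.get(f, 0.0) for f in features(*context, t)))
        for t in TAGS
    }
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}


def train_gis(data, iterations=50):
    """Estimate feature weights with generalized iterative scaling."""
    c = 3.0  # every (context, tag) pair fires exactly three features
    empirical = defaultdict(float)
    for context, tag in data:
        for f in features(*context, tag):
            empirical[f] += 1.0
    # Only features seen in training get a weight (assumed simplification).
    weights = {f: 0.0 for f in empirical}
    for _ in range(iterations):
        expected = defaultdict(float)
        for context, _ in data:
            probs = tag_probs(weights, context)
            for t in TAGS:
                for f in features(*context, t):
                    if f in weights:
                        expected[f] += probs[t]
        # GIS update: move each weight toward matching its empirical count.
        for f in weights:
            weights[f] += math.log(empirical[f] / expected[f]) / c
    return weights


# Hypothetical toy data: (stem type, last morpheme, previous morpheme) -> tag.
TOY_DATA = [
    (("noun_stem", "case", "plural"), "NOUN"),
    (("noun_stem", "case", "none"), "NOUN"),
    (("verb_stem", "png", "tense"), "VERB"),
    (("verb_stem", "tense", "none"), "VERB"),
]

weights = train_gis(TOY_DATA)
probs = tag_probs(weights, ("verb_stem", "png", "tense"))
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

Because each cue contributes exactly one active feature per candidate tag, the feature sum is constant at three, which is the condition GIS relies on when choosing its step size.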