Less is More, More or Less... Finding the Optimal Threshold for Lexicalization in Chunking



Document title: Less is More, More or Less... Finding the Optimal Threshold for Lexicalization in Chunking
Journal: Computación y sistemas
Database: PERIÓDICA
System number: 000423319
ISSN: 1405-5546
Authors: 1
Institutions: 1 Pazmany Peter Catholic University, Faculty of Information Technology and Bionics, Budapest, Hungary
Year:
Period: Oct-Dec
Volume: 21
Issue: 4
Country: Mexico
Language: English
Document type: Article
Approach: Analytical, descriptive
Abstract (English): Lexicalization of the input of sequential taggers has come a long way since it was invented by Molina and Pla [4]. In this paper we thoroughly investigate the method introduced by Indig and Endrédy [2] to find the best lexicalization level for chunking and to explore the behavior of different IOB representations. Both tasks are carried out on the CoNLL-2000 dataset. Our goal is to introduce a transformation method that adapts the parameters of the development set to the training set using their frequency distributions, from which other tasks such as POS tagging or NER could also benefit.
Disciplines: Library and information science,
Literature and linguistics
Keywords (translated from the Spanish index terms): Applied linguistics,
Text analysis,
Phrase chunking,
Sequential tagging
Keywords: Applied linguistics,
Text analysis,
Sequential tagging,
Phrase chunking