Journal: | Computación y sistemas |
Database: | PERIÓDICA |
System number: | 000423319 |
ISSN: | 1405-5546 |
Authors: | Indig, Balázs (1) |
Institutions: | (1) Pazmany Peter Catholic University, Faculty of Information Technology and Bionics, Budapest, Hungary |
Year: | 2017 |
Period: | Oct-Dec |
Volume: | 21 |
Issue: | 4 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Approach: | Analytical, descriptive |
English abstract: | Lexicalization of the input of sequential taggers has come a long way since it was invented by Molina and Pla [4]. In this paper we thoroughly investigate the method introduced by Indig and Endrédy [2] to find the best lexicalization level for chunking and to explore the behavior of different IOB representations. Both tasks are applied to the CoNLL-2000 dataset. Our goal is to introduce a transformation method that adapts the parameters of the development set to the training set using their frequency distributions, from which other tasks such as POS tagging or NER could also benefit. |
Disciplines: | Library and information science, Literature and linguistics |
Keywords: | Applied linguistics, Text analysis, Sequential tagging, Phrase chunking |