A Feature-Rich Vietnamese Named Entity Recognition Model

Quang Nhat Minh, Pham


Document title:	A Feature-Rich Vietnamese Named Entity Recognition Model
Journal:	Computación y sistemas
Database:
System number:	000560735
ISSN:	1405-5546
Authors:	Quang Nhat Minh, Pham¹
Institutions:	¹Alt Vietnam Co, Hanoi, Vietnam
Year:	2022
Period:	Jul-Sep
Volume:	26
Issue:	3
Pages:	1323-1331
Country:	Mexico
Language:	English
Document type:	Article
English abstract:	In this paper, we present a feature-based named entity recognition (NER) model that achieves state-of-the-art accuracy for the Vietnamese language. We combine word, word-shape, PoS, chunk, Brown-cluster-based, and word-embedding-based features in a Conditional Random Fields (CRF) model. We also explore how the word segmentation, PoS tagging, and chunking output of several popular Vietnamese NLP toolkits affects the accuracy of the proposed feature-based NER model. To date, ours is the first work to systematically perform an extrinsic evaluation of basic Vietnamese NLP toolkits on the downstream NER task. Experimental results show that while automatically generated word segmentation is useful, the PoS and chunking information generated by Vietnamese NLP tools does not benefit the proposed feature-based NER model.
Disciplines:	Literature and linguistics, Computer science
Keywords:	Applied linguistics, Programming
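
The abstract describes combining word, word-shape, PoS, chunk, and Brown-cluster features for a CRF tagger. The following is a minimal sketch of what such per-token feature extraction might look like, not the paper's implementation. The function names (`word_shape`, `token_features`), the feature keys, the Brown-cluster prefix lengths (4 and 6), and the toy sentence and cluster path are all assumptions made for illustration. Word-embedding features are omitted for brevity. The resulting dictionaries are the kind of per-token input that CRF toolkits typically consume.

```python
import unicodedata


def word_shape(token):
    """Coarse word shape: uppercase -> 'A', lowercase -> 'a', digit -> '0'.

    Other characters (e.g. the underscore joining syllables of a segmented
    Vietnamese word) are kept unchanged. The token is NFC-normalized first so
    that accented Vietnamese letters are single code points.
    """
    shape = []
    for ch in unicodedata.normalize("NFC", token):
        if ch.isupper():
            shape.append("A")
        elif ch.islower():
            shape.append("a")
        elif ch.isdigit():
            shape.append("0")
        else:
            shape.append(ch)
    return "".join(shape)


def token_features(sent, i, brown_clusters=None):
    """Build a feature dict for token i of a sentence.

    `sent` is a list of (word, pos_tag, chunk_tag) triples.
    `brown_clusters` is a hypothetical mapping from word to its Brown-cluster
    bit string; prefixes of that string are used as features.
    """
    word, pos, chunk = sent[i]
    feats = {
        "word": word,
        "word.lower": word.lower(),
        "shape": word_shape(word),
        "pos": pos,
        "chunk": chunk,
    }

    # Brown-cluster features: prefixes of the cluster path (lengths assumed).
    if brown_clusters and word in brown_clusters:
        path = brown_clusters[word]
        for n in (4, 6):
            feats["brown.%d" % n] = path[:n]

    # Context features from the neighbouring tokens.
    if i > 0:
        feats["-1:word"] = sent[i - 1][0]
        feats["-1:pos"] = sent[i - 1][1]
    else:
        feats["BOS"] = True
    if i < len(sent) - 1:
        feats["+1:word"] = sent[i + 1][0]
        feats["+1:pos"] = sent[i + 1][1]
    else:
        feats["EOS"] = True
    return feats


# Toy, hand-made example (tags and cluster path are illustrative only).
sentence = [
    ("Ông", "N", "B-NP"),
    ("Minh", "Np", "I-NP"),
    ("sống", "V", "B-VP"),
    ("ở", "E", "B-PP"),
    ("Hà_Nội", "Np", "B-NP"),
]
clusters = {"Hà_Nội": "110100111"}
features = [token_features(sentence, i, clusters) for i in range(len(sentence))]
```

Dropping the `pos` and `chunk` keys from the dictionary is one simple way to run the kind of ablation the abstract reports, where automatically generated PoS and chunk tags are compared against a model without them.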

