The Impact of Training Methods on the Development of Pre-Trained Language Models



Journal: Computación y Sistemas
System number: 000607871
ISSN: 1405-5546
Authors: (three authors, all with affiliation 1)
Institutions: 1 Instituto Tecnológico de la Laguna, Coahuila, Mexico
Year:
Period: Jan-Mar
Volume: 28
Number: 1
Pages: 109-124
Country: Mexico
Language: English
Abstract (English): The focus of this work is to analyze the implications of pre-training tasks in the development of language models for learning linguistic representations. In particular, we study three pre-trained BERT models and their corresponding unsupervised training tasks (e.g., masked language modeling (MLM) and distillation). To examine their similarities and differences, we fine-tune these language representation models on the task of classifying short answer responses into four categories. This fine-tuning is implemented with two different neural architectures: one with just a single additional output layer and one with a multilayer perceptron. In this way, we enrich the comparison of the pre-trained BERT models from three perspectives: the pre-training tasks used in developing the language models, the fine-tuning process with different neural architectures, and the computational cost demanded by the classification of short answer responses.
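
As a rough illustration of the two fine-tuning architectures described in the abstract (a minimal sketch, not the authors' implementation), the following PyTorch code using the Hugging Face transformers library attaches either a single linear output layer or a small multilayer perceptron on top of a pre-trained BERT encoder for four-class classification. The model name bert-base-uncased, the 256-unit hidden layer, and the dropout rate are illustrative assumptions not taken from the paper.

    # Minimal sketch (assumptions noted above) of the two fine-tuning heads:
    # (a) a single additional output layer, or (b) a multilayer perceptron.
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    NUM_CLASSES = 4  # four categories of short answer responses (per the abstract)

    class BertClassifier(nn.Module):
        def __init__(self, model_name="bert-base-uncased", use_mlp_head=False):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size
            if use_mlp_head:
                # (b) multilayer perceptron head; 256 units is an arbitrary choice
                self.head = nn.Sequential(
                    nn.Linear(hidden, 256),
                    nn.ReLU(),
                    nn.Dropout(0.1),
                    nn.Linear(256, NUM_CLASSES),
                )
            else:
                # (a) just one additional output layer
                self.head = nn.Linear(hidden, NUM_CLASSES)

        def forward(self, input_ids, attention_mask):
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
            # Use the [CLS] token representation as the sequence embedding
            cls = out.last_hidden_state[:, 0, :]
            return self.head(cls)

    if __name__ == "__main__":
        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        model = BertClassifier(use_mlp_head=True)
        batch = tokenizer(["a short answer response"], return_tensors="pt",
                          padding=True, truncation=True)
        logits = model(batch["input_ids"], batch["attention_mask"])
        print(logits.shape)  # torch.Size([1, 4])

The single linear head adds the fewest trainable parameters on top of the encoder, while the MLP head adds nonlinear capacity at a higher computational cost, which connects to the cost comparison the abstract mentions.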
Keywords: Language models, Pre-training tasks, BERT, Fine-tuning