Mining Ethnic Content Online with Additively Regularized Topic Models



Document title: Mining Ethnic Content Online with Additively Regularized Topic Models
Journal: Computación y sistemas
Database: PERIÓDICA
System number: 000411058
ISSN: 1405-5546
Authors (affiliation indices): 2, 1, 1, 1, 3
Institutions: 1 National Research University, Higher School of Economics, Saint Petersburg, Russia
2 Moscow State University, Moscow, Russia
3 Moscow Institute of Physics and Technology, Moscow, Russia
Year:
Period: Jul-Sep
Volume: 20
Issue: 3
Pages: 387-403
Country: Mexico
Language: English
Document type: Article
Approach: Experimental, applied
English abstract: Social studies of the Internet have adopted large-scale text mining for unsupervised discovery of topics related to specific subjects. A recently developed approach to topic modeling, additive regularization of topic models (ARTM), provides fast inference and, through a wide variety of possible regularizers, more control over the topics than developing LDA extensions. We apply ARTM to mining ethnic-related content from the Russian-language blogosphere, introduce a new combined regularizer, and compare models derived from ARTM with LDA. We show with human evaluations that ARTM is better for mining topics on specific subjects, finding more relevant topics of higher or comparable quality.
Disciplines: Computer science,
Literature and linguistics
Keywords: Computer science,
Literature and linguistics,
Data processing,
Applied linguistics,
Computational linguistics,
Text mining,
Additive regularization