Journal: | Computación y Sistemas |
Database: | PERIÓDICA |
System number: | 000411058 |
ISSN: | 1405-5546 |
Authors: | Apishev, Murat (2); Koltcov, Sergei (1); Koltsova, Olessia (1); Nikolenko, Sergey (1); Vorontsov, Konstantin (3) |
Institutions: | (1) National Research University, Higher School of Economics, Saint Petersburg, Russia; (2) Moscow State University, Moscow, Russia; (3) Moscow Institute of Physics and Technology, Moscow, Russia |
Year: | 2016 |
Period: | Jul-Sep |
Volume: | 20 |
Issue: | 3 |
Pages: | 387-403 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Approach: | Experimental, applied |
Abstract in English | Social studies of the Internet have adopted large-scale text mining for unsupervised discovery of topics related to specific subjects. A recently developed approach to topic modeling, additive regularization of topic models (ARTM), provides fast inference and, through a wide variety of possible regularizers, more control over the topics than developing LDA extensions. We apply ARTM to mining ethnic-related content from the Russian-language blogosphere, introduce a new combined regularizer, and compare models derived from ARTM with LDA. We show with human evaluations that ARTM is better for mining topics on specific subjects, finding more relevant topics of higher or comparable quality. |
Disciplines | Computer science, Literature and linguistics |
Keywords: | Data processing, Applied linguistics, Computational linguistics, Text mining, Additive regularization |
Keyword: | Computer science, Literature and linguistics, Data processing, Applied linguistics, Computational linguistics, Text mining, Additive regularization |