Inference of Fine-grained Attributes of Bengali Corpus for Stylometry Detection



Document title: Inference of Fine-grained Attributes of Bengali Corpus for Stylometry Detection
Journal: Polibits
Database: PERIÓDICA
System number: 000355983
ISSN: 1870-9044
Authors: 1
1
Institutions: 1 Jadavpur University, Department of Computer Science and Engineering, Kolkata, West Bengal, India
Year:
Issue: 44
Country: Mexico
Language: English
Document type: Article
Approach: Applied, descriptive
Abstract: Stylometry, the science of inferring characteristics of an author from the characteristics of documents written by that author, is a problem with a long history. It belongs to the core task of text categorization, which involves authorship identification, plagiarism detection, forensic investigation, computer security, and copyright and estate disputes. In this work, we present a strategy for stylometry detection of documents written in Bengali. We adopt a set of fine-grained attribute features together with a set of lexical markers for analyzing the text, and use three semi-supervised measures for making decisions. A majority voting approach is then used for the final classification. The system is fully automatic and language-independent. Evaluation results for Bengali authors' stylometry detection show reasonably promising accuracy in comparison to the baseline model.
Disciplines: Computer science
Keywords: Computer science,
Artificial intelligence,
Text analysis,
Stylometry,
Style markers,
Euclidean distance