Detecting Derivatives using Specific and Invariant Descriptors



Document title: Detecting Derivatives using Specific and Invariant Descriptors
Journal: Polibits
Database: PERIÓDICA
System number: 000358995
ISSN: 1870-9044
Authors: 1
1
1
Institutions: 1Université de Nantes, Nantes, Loire-Atlantique, France
Year:
Period: Jan-Jun
Issue: 43
Country: Mexico
Language: English
Document type: Article
Approach: Analytical, descriptive
Abstract: This paper explores the detection of derivation links between texts (otherwise called plagiarism, near-duplication, revision, etc.) at the document level. We evaluate textual elements that implement the ideas of specificity and invariance, as well as their combination, to characterize derivatives. We built a French press corpus based on Wikinews revisions to run this evaluation. We obtain performance similar to the state-of-the-art method (n-gram overlap) while reducing the signature size and therefore the processing costs. To ensure the verifiability and reproducibility of our results, we make our code and our corpus available to the community.
Disciplines: Computer science,
Literature and linguistics
Keywords: Computer science,
Literature and linguistics,
Data processing,
Applied linguistics,
Computational linguistics,
Document analysis,
Textual derivatives,
Derivative detection,
Near-duplicates,
Plagiarism