MathIRs: Retrieval System for Scientific Documents



Document title: MathIRs: Retrieval System for Scientific Documents
Journal: Computación y sistemas
Database: PERIÓDICA
System number: 000423237
ISSN: 1405-5546
Authors (names missing; affiliation indices only): 1, 1, 2, 3, 4
Institutions: 1. National Institute of Technology Mizoram, Aizawl, Mizoram, India
2. Hijli College, Kharagpur, West Bengal, India
3. Jadavpur University, Kolkata, West Bengal, India
4. Instituto Politécnico Nacional, Centro de Investigación en Computación, Mexico City, Mexico
Year:
Period: Apr-Jun
Volume: 21
Issue: 2
Country: Mexico
Language: English
Document type: Article
Approach: Experimental, applied
Abstract: Effective retrieval of mathematical content from a vast corpus of scientific documents demands enhancements to conventional indexing and searching mechanisms. The indexing mechanism and the choice of semantic similarity measures largely determine the quality of the results returned by a Math Information Retrieval system (MathIRs). Tokenization and formula unification are among the distinguishing features of the indexing mechanism used in MathIRs; they facilitate sub-formula and similarity search. In addition, scientific documents and user queries in MathIRs contain both math and text content, and matching this content requires three modules: Text-Text Similarity (TS), Math-Math Similarity (MS), and Text-Math Similarity (TMS). In this paper we propose a MathIRs comprising these modules and a substitution-tree-based mechanism for indexing mathematical expressions. We also present experimental results for similarity search and argue that the proposed MathIRs will ease the task of scientific document retrieval.
Disciplines: Library and information science,
Computer science
Keywords: Information technology,
Natural language processing,
Information retrieval,
Indexing