Journal: | Polibits |
Database: | PERIÓDICA |
System number: | 000368140 |
ISSN: | 1870-9044 |
Authors: | Tomás, David (1); Vicedo, José L. (1); Bisbal, Empar (2); Moreno, Lidia (2) |
Institutions: | (1) Universidad de Alicante, Departamento de Software y Sistemas Computacionales, Alicante, Spain; (2) Universidad Técnica de Valencia, Departamento de Sistemas de Información y Computación, Valencia, Spain |
Year: | 2009 |
Period: | Jul-Dec |
Issue: | 40 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Approach: | Experimental, applied |
English abstract: | This paper describes the development of an English corpus of factoid TREC-like question-answer pairs. The resulting corpus consists of more than 70,000 samples, each containing the following information: a question, its question type, an exact answer to the question, the different context levels (sentence, paragraph and document) where the answer occurs inside a document, and a label indicating whether the answer is correct (a positive sample) or not (a negative sample). For instance, TrainQA can be used to train a binary classifier that decides whether a given answer to the question formulated is correct (positive) or not (negative). To our knowledge, this is the first corpus aimed at training every stage of a trainable Question Answering system: question classification, information retrieval, answer extraction and answer validation. |
Disciplines: | Computer science |
Keywords (Spanish): | Sistemas de información, Sistemas expertos, Capacitación, Preguntas |
Keywords (English): | Computer science, Information systems, Expert systems, Training, Questions |
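
Illustrative note: the abstract lists the fields carried by each corpus sample and mentions training a binary answer-validation classifier. The Python sketch below shows one way such a sample could be represented and turned into labeled training pairs. It is only a sketch under stated assumptions: the class name `QASample`, all field names, the `[SEP]` separator, the `PERSON` question-type label and the example data are hypothetical and do not come from the corpus or the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QASample:
    """One sample with the fields listed in the abstract (all names hypothetical)."""
    question: str
    question_type: str
    answer: str
    sentence: str     # sentence-level context where the answer occurs
    paragraph: str    # paragraph-level context
    document: str     # document-level context
    is_correct: bool  # True = positive sample, False = negative sample


def to_validation_pairs(samples: List[QASample]) -> List[Tuple[str, int]]:
    """Build (text, label) pairs for a binary answer-validation classifier.

    The "[SEP]" separator is an arbitrary choice for this sketch.
    """
    return [
        (f"{s.question} [SEP] {s.answer} [SEP] {s.sentence}", int(s.is_correct))
        for s in samples
    ]


# Invented example data, not taken from the corpus.
context = "Hamlet was written by William Shakespeare."
samples = [
    QASample("Who wrote Hamlet?", "PERSON", "William Shakespeare",
             context, context, context, True),
    QASample("Who wrote Hamlet?", "PERSON", "Hamlet",
             context, context, context, False),
]

pairs = to_validation_pairs(samples)
for text, label in pairs:
    print(label, text)
```

The resulting pairs could be passed to any text classifier; which features and which context level (sentence, paragraph or document) work best is left open here.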