Journal: | Polibits |
Database: | PERIÓDICA |
System number: | 000368140 |
ISSN: | 1870-9044 |
Authors: | Tomás, David (1); Vicedo, José L (1); Bisbal, Empar (2); Moreno, Lidia (2) |
Institutions: | (1) Universidad de Alicante, Departamento de Software y Sistemas Computacionales, Alicante, Spain; (2) Universidad Técnica de Valencia, Departamento de Sistemas de Información y Computación, Valencia, Spain |
Year: | 2009 |
Season: | Jul-Dec |
Number: | 40 |
Country: | Mexico |
Language: | English |
Document type: | Article |
Approach: | Experimental, applied |
English abstract: | This paper describes the development of an English corpus of factoid TREC-like question-answer pairs. The resulting corpus consists of more than 70,000 samples, each containing the following information: a question, its question type, an exact answer to the question, the different context levels (sentence, paragraph and document) where the answer occurs inside a document, and a label indicating whether the answer is correct (a positive sample) or not (a negative sample). For instance, TrainQA can be used to train a binary classifier that decides whether a given answer to a question is correct (positive) or not (negative). To our knowledge, this is the first corpus aimed at training every stage of a trainable Question Answering system: question classification, information retrieval, answer extraction and answer validation. |
Disciplines: | Computer science |
Keyword: | Sistemas de información, Sistemas expertos, Capacitación, Preguntas |
Keyword: | Computer science, Information systems, Expert systems, Training, Questions |