Questions are simple text files structured as in the following example:
- Task identifier: M|C, where M stands for Monolingual and C for Cross-language
- Question source language: ITA|SPA|DUT|GER|FRE
- Question identifier: four digits
- Question string: string
For example:
M ITA 0023 Qual'è la capitale del Marocco?
Another example is:
C GER 0124 Wieviele Einwohner hat Berlin?
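The question format described above can be read with a short parser. The following is a minimal sketch, assuming that each question sits on a single line and that the four fields are separated by whitespace; the names Question and parse_question are hypothetical and are not part of the track software.

```python
import re
from typing import NamedTuple


class Question(NamedTuple):
    task: str      # "M" (monolingual) or "C" (cross-language)
    language: str  # source language: ITA, SPA, DUT, GER or FRE
    qid: str       # four digits, kept as a string to preserve leading zeros
    text: str      # the question string


# Assumption: fields are whitespace-separated and the question is on one line.
QUESTION_RE = re.compile(r"(M|C)\s+(ITA|SPA|DUT|GER|FRE)\s+(\d{4})\s+(.+)")


def parse_question(line: str) -> Question:
    """Split one question line into its four fields, or raise ValueError."""
    match = QUESTION_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a valid question line: {line!r}")
    return Question(*match.groups())


if __name__ == "__main__":
    print(parse_question("C GER 0124 Wieviele Einwohner hat Berlin?"))
```

The identifier is kept as text rather than converted to an integer, so that 0023 is not silently turned into 23.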
Each task requires answering 200 questions. Below you can
download the queries for the task(s) you participate in:
MONOLINGUAL TASKS:
- Download here the questions for the monolingual Dutch QA
- Download here the questions for the monolingual Italian QA
- Download here the questions for the monolingual Spanish QA
CROSS-LANGUAGE TASKS:
- Download here the questions for the bilingual Dutch QA
- Download here the questions for the bilingual French QA
- Download here the questions for the bilingual German QA
- Download here the questions for the bilingual Italian QA
- Download here the questions for the bilingual Spanish QA
MAPPING FILES:
The monolingual and bilingual test sets were
extracted from two collections of questions and answers (now publicly available
for research purposes). Questions in the test sets and in the gold standard
collections are numbered differently: we provide here the mapping files to go
back to the original number.
FORMAT EXAMPLE:
5 => 2 Chi e' il presidente della FIAT?
<TEST SET QUESTION NUMBER> => <GOLD STANDARD QUESTION NUMBER> <QUESTION STRING>
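A mapping file in this format can be turned into a lookup table from test-set numbers to gold-standard numbers. The sketch below assumes one mapping per line, with both numbers written as plain integers; parse_mapping_line and load_mapping are hypothetical helper names.

```python
import re

# Assumed layout: "<test set number> => <gold standard number> <question string>"
MAPPING_RE = re.compile(r"(\d+)\s*=>\s*(\d+)\s+(.+)")


def parse_mapping_line(line):
    """Return (test_number, gold_number, question_string) for one line."""
    match = MAPPING_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a valid mapping line: {line!r}")
    test_no, gold_no, text = match.groups()
    return int(test_no), int(gold_no), text


def load_mapping(lines):
    """Build {test_number: gold_number}, skipping blank lines."""
    mapping = {}
    for line in lines:
        if line.strip():
            test_no, gold_no, _ = parse_mapping_line(line)
            mapping[test_no] = gold_no
    return mapping


if __name__ == "__main__":
    print(parse_mapping_line("5 => 2 Chi e' il presidente della FIAT?"))
```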
SUBMISSIONS FORMAT
Questions for all the tasks will be available on May 7 at 12 noon.
Results are due back by midnight of May 15.
Participants will be given a password to upload their
submissions, and before the beginning of the track, a checking routine will be released on this
website. Before submitting their results, participants should use the routine to make sure that their submissions are well structured and
formulated. The routine will detect mistakes such as invalid document numbers, wrong
formats, missing data, etc.
Concerning the required format of the answers (already explained in the
guidelines), each submission is a single file. Questions must be returned unranked (i.e.
from 1 to 200, in the same order as they were
downloaded). The answers to each question, on the other hand, must be ranked: up to 3
responses per query are allowed, and their ranking values (1, 2 or 3) must be in
ascending (increasing) order. No ranking is required for the score number. If your system does not produce any score number, a default 0 must be placed in the score column.
The name of the submitted file must be the same as the RUN-TAG IDENTIFIER
(second column in the answer format) with the ".txt" extension.
With reference to EXAMPLE 1 below, the submitted file name is:
irstex031mi.txt
Answers must be structured in six columns, as follows:
EXAMPLE 1:
1 irstex031mi 1 4057 LASTAMP94-001102 new york
1 irstex031mi 2 3166 AGZ.941207.0327 Boston
1 irstex031mi 3 233 LASTAMPA94-000506 chicago
...
EXAMPLE 2:
...
18 irstex031bi 1 1244 NIL
19 irstex031bi 1 981 LA020194.0425 tom cruise
...
EXAMPLE 3:
...
23 irstst031bi 1 4429 LA030474.3285 Orwells Animal Farm represents a good example
...
where:
- the first column is the question number;
- the second column is the run-tag, which contains several pieces of information:
* the first 4 letters identify the group that created the system
* the next two letters must be ex if the run produced exact
answers, or st if it produced 50-byte answers
* 03 stands for the year 2003
* the following digit must be 1 if that is the first
run, or 2 if it is the second one (we remind you that up to two runs are
allowed and that each run must be submitted as a single file)
* the last two letters describe the task: the first one must be either m
(monolingual) or b (bilingual), while the second one identifies the language (i for
Italian, d for Dutch, s for Spanish, g for German and f for
French).
So, the run-tag column must have the following structure, WITHOUT punctuation: [a-z][a-z][a-z][a-z](ex|st)03(1|2)(mi|md|ms|bi|bd|bs|bg|bf).
On the whole, it must be 11 characters long.
- the third column is the answer ranking, which is crucial for the
evaluation. Correct answers will be marked taking into account their ranking. Since up to three answers per question are
allowed, the ranking number can be 1, 2 or 3. It must be
sorted in ascending order.
- the fourth column is the score number, which is not compulsory; it
can be an integer or a floating point number represented with up to 8 characters. If your system does not produce any score
number, it must return a default score equal to 0.
- the fifth column is the docid that supports the given answer. Note that if a question has no known answer in the document collection (i.e. the right answer is
NIL), the string NIL replaces the docid, and the answer string must be
empty.
- the sixth and last column is the answer string (empty if the docid is
NIL). Note that there are two kinds of answer: the exact answer and the 50-byte string. Participants may submit up to two
runs, and they can be both exact answers, both 50-byte strings or one run for each
modality, but it is NOT allowed in any way to mix both modalities within the same
run.
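The run-tag structure described in the list above can be checked with a regular expression. This is a minimal sketch, not the checking routine released by the organizers; it assumes the tag must end in one of the eight two-letter task codes (consistent with the 11-character length), and the names parse_run_tag and submission_filename are hypothetical.

```python
import re

# Four lowercase letters, answer type, year 03, run number, task code.
RUN_TAG_RE = re.compile(r"([a-z]{4})(ex|st)03([12])(mi|md|ms|bi|bd|bs|bg|bf)")

ANSWER_TYPES = {"ex": "exact", "st": "50-byte string"}
TASKS = {"m": "monolingual", "b": "bilingual"}
LANGUAGES = {"i": "Italian", "d": "Dutch", "s": "Spanish",
             "g": "German", "f": "French"}


def parse_run_tag(tag):
    """Decode a run-tag into its parts, or raise ValueError if malformed."""
    match = RUN_TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"invalid run-tag: {tag!r}")
    group, kind, run, task = match.groups()
    return {
        "group": group,
        "answer_type": ANSWER_TYPES[kind],
        "run": int(run),
        "task": TASKS[task[0]],
        "language": LANGUAGES[task[1]],
    }


def submission_filename(tag):
    """File name for a run: the run-tag plus the .txt extension."""
    parse_run_tag(tag)  # reject malformed tags before building the name
    return tag + ".txt"


if __name__ == "__main__":
    print(parse_run_tag("irstex031mi"))
    print(submission_filename("irstex031mi"))
```

Using fullmatch rather than match means that a tag with extra characters at either end is rejected, which also enforces the 11-character length.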
Generally speaking, participants should follow the "all columns must be
present" rule, except when the fifth column is NIL. There must be at least one
space between columns, and submission lines must not be longer than 1024
bytes. There should be a single line-break after each answer string, so that the next answer item starts on the very next line.
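The column rules above can be checked line by line before uploading a run. The following sketch is only an illustration and does not reproduce the official checking routine: the names Answer, parse_answer_line and check_order are hypothetical, and line length is measured on the UTF-8 encoding, which is an assumption since the guidelines do not name an encoding.

```python
from typing import NamedTuple


class Answer(NamedTuple):
    question: int
    run_tag: str
    rank: int
    score: float
    docid: str
    text: str


def parse_answer_line(line: str) -> Answer:
    """Parse one submission line into its six columns."""
    line = line.rstrip("\r\n")
    # Assumption: bytes are counted on the UTF-8 encoding of the line.
    if len(line.encode("utf-8")) > 1024:
        raise ValueError("line is longer than 1024 bytes")
    # The answer string may contain spaces, so split at most five times.
    fields = line.split(None, 5)
    if len(fields) < 5:
        raise ValueError(f"too few columns: {line!r}")
    question, run_tag, rank, score, docid = fields[:5]
    text = fields[5].strip() if len(fields) == 6 else ""
    if rank not in ("1", "2", "3"):
        raise ValueError(f"ranking must be 1, 2 or 3: {rank!r}")
    if len(score) > 8:
        raise ValueError(f"score uses more than 8 characters: {score!r}")
    if docid == "NIL" and text:
        raise ValueError("answer string must be empty when the docid is NIL")
    if docid != "NIL" and not text:
        raise ValueError("answer string is missing")
    return Answer(int(question), run_tag, int(rank), float(score), docid, text)


def check_order(answers):
    """Check question order and ascending ranking within each question."""
    previous = None
    for answer in answers:
        if previous is not None:
            if answer.question < previous.question:
                raise ValueError(f"question {answer.question} is out of order")
            if (answer.question == previous.question
                    and answer.rank <= previous.rank):
                raise ValueError(
                    f"rankings for question {answer.question} are not ascending")
        previous = answer


if __name__ == "__main__":
    lines = [
        "1 irstex031mi 1 4057 LASTAMP94-001102 new york",
        "1 irstex031mi 2 3166 AGZ.941207.0327 Boston",
        "18 irstex031bi 1 1244 NIL",
    ]
    check_order([parse_answer_line(line) for line in lines])
    print("all lines accepted")
```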
CLEF2003 QUESTION ANSWERING TRACK RESULTS
Results for both monolingual and bilingual tasks are now available.
Click here to download the summary statistics for each run