This page describes the instructions for the RITE-2 formal run. Please read it carefully.
The formal run period is from January 9th (00:00 UTC-12) to January 16th (23:59 UTC-12). Please note that the submission page will close on January 16th (23:59 UTC-12).
The following are the names of the subtasks and datasets in the RITE-2 formal run. When you submit your formal run results, please check the corresponding items on the submission page.
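Because the times are given in UTC-12, it may help to convert them before planning a submission. The following is a minimal sketch in Python; the function names are hypothetical, and the year is a parameter because this page does not state it.

```python
from datetime import datetime, timedelta, timezone

# UTC-12 expressed as a fixed offset.
UTC_MINUS_12 = timezone(timedelta(hours=-12))


def closing_time_utc(year: int) -> datetime:
    """Return the closing time (January 16, 23:59 UTC-12) converted to UTC.

    The year is an assumption supplied by the caller.
    """
    closing = datetime(year, 1, 16, 23, 59, tzinfo=UTC_MINUS_12)
    return closing.astimezone(timezone.utc)


def is_submission_open(now: datetime, year: int) -> bool:
    """Check whether a timezone-aware `now` falls inside the formal run period."""
    opening = datetime(year, 1, 9, 0, 0, tzinfo=UTC_MINUS_12)
    closing = datetime(year, 1, 16, 23, 59, tzinfo=UTC_MINUS_12)
    return opening <= now <= closing
```

Under this conversion, 23:59 on January 16th in UTC-12 corresponds to 11:59 on January 17th in UTC.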
Japanese subtasks (JA)
Simplified Chinese subtasks (CS)
Dataset | Number of Pairs | ID Range | Notes |
---|---|---|---|
BC | 781 pairs | id 1~781 | |
MC | 781 pairs | id 1~781 | |
RITE4QA | 2511 pairs | id 1~7724, not continuous | Configurations of RITE4QA submissions should correspond to BC system configurations |
ExtraBC | 1387 pairs | id 1~1894, not continuous | BC Participants should also submit the result of this dataset. Configurations of ExtraBC submissions should correspond to BC system configurations. |
ExtraMC | 1387 pairs | id 1~1894, not continuous | MC Participants should also submit the result of this dataset. Configurations of ExtraMC submissions should correspond to MC system configurations. |
OptionalRITE4QA | 5256 pairs | id 21~7767, not continuous | Because RITE4QA participants may not have enough time to process all RITE4QA pairs, the pairs with low QA scores were set aside to form this optional RITE4QA dataset. Participants can decide whether to submit results for this dataset. Configurations of OptionalRITE4QA submissions should correspond to BC system configurations. |
Traditional Chinese datasets (CT)
Dataset | Number of Pairs | ID Range | Notes |
---|---|---|---|
BC | 881 pairs | id 1~881 | |
MC | 881 pairs | id 1~881 | |
RITE4QA | 2511 pairs | id 1~7724, not continuous | Configurations of RITE4QA submissions should correspond to BC system configurations |
ExtraBC | 1894 pairs | id 1~1894 | BC Participants should also submit the result of this dataset. Configurations of ExtraBC submissions should correspond to BC system configurations. |
ExtraMC | 1894 pairs | id 1~1894 | MC Participants should also submit the result of this dataset. Configurations of ExtraMC submissions should correspond to MC system configurations. |
OptionalRITE4QA | 5256 pairs | id 21~7767, not continuous | Because RITE4QA participants may not have enough time to process all RITE4QA pairs, the pairs with low QA scores were set aside to form this optional RITE4QA dataset. Participants can decide whether to submit results for this dataset. Configurations of OptionalRITE4QA submissions should correspond to BC system configurations. |
For each test dataset, you can submit up to three runs.
Regarding the data format for the formal run, please read Submission Format carefully and check the format of your result files using the Evaluation Tool. Note that your submission will be rejected if the submitted file is not accepted by the evaluation tool.
Please stop developing your system once the formal run data has been distributed (minimal modifications related to I/O and bug fixes are allowed). In particular, please do not develop resources by hand after looking at the formal run data. The following are some cases judged as manual intervention or not manual intervention.
Please mail the RITE-2 mailing list (ntcir10-rite2 at googlegroups.com) if you have any questions about this point.
When producing the formal run results for a subtask, please do not use the datasets of other subtasks. For example, when producing the formal run results for Exam Search, using the formal run data of Exam BC is prohibited.
You can upload your formal run results at the following site (EasyChair). The system description paper will also be submitted through this site. The results for all formal run tasks you participated in should be uploaded together.
formal run submission page:
https://www.easychair.org/conferences/?conf=ntcir10rite2
In RITE-2, we decided to use EasyChair for submitting both formal run results and system description papers. When submitting formal run results, please specify the zipped result file in the part labeled “PDF”. Before submitting, please read How to submit your formal run results and How to re-submit your formal run results carefully. Please mail the RITE-2 mailing list (ntcir10-rite2 at googlegroups.com) if you have any questions.
You can submit up to three runs for each subtask. Please zip all of the results together with a system description. The filename of your formal run data should be RITE2-(TeamID)-(Lang)-(SubtaskName)-(RunNumber).txt.
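The naming convention above can be checked mechanically before uploading. The following is a minimal sketch, not an official tool: the helper names are hypothetical, the two-digit run number is inferred from the examples on this page, and the restriction of TeamID and SubtaskName to letters and digits is an assumption.

```python
import re

LANGS = {"JA", "CS", "CT"}  # language codes named on this page
MAX_RUNS = 3                # up to three runs per subtask

# Assumption: TeamID and SubtaskName contain only letters and digits,
# and the run number is zero-padded to two digits (01, 02, 03).
FILENAME_RE = re.compile(
    r"^RITE2-(?P<team>[A-Za-z0-9]+)-(?P<lang>JA|CS|CT)"
    r"-(?P<subtask>[A-Za-z0-9]+)-(?P<run>0[1-3])\.txt$"
)


def run_filename(team_id: str, lang: str, subtask: str, run: int) -> str:
    """Build RITE2-(TeamID)-(Lang)-(SubtaskName)-(RunNumber).txt."""
    if lang not in LANGS:
        raise ValueError(f"unknown language code: {lang}")
    if not 1 <= run <= MAX_RUNS:
        raise ValueError(f"run number must be between 1 and {MAX_RUNS}")
    return f"RITE2-{team_id}-{lang}-{subtask}-{run:02d}.txt"


def parse_filename(name: str):
    """Return the filename components as a dict, or None if the name does not match."""
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None
```

For example, `run_filename("TOHOK", "JA", "BC", 1)` produces `RITE2-TOHOK-JA-BC-01.txt`, and `parse_filename` returns `None` for a fourth run such as `RITE2-TOHOK-JA-BC-04.txt`.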
Let us consider a case where the TeamID is TOHOK and the participating subtasks are JA-BC and JA-ExamBC. The files you should prepare are then as follows.
+--TOHOK
|  +--RITE2-TOHOK-description.txt
|  +--RITE2-TOHOK-JA-BC-01.txt
|  +--RITE2-TOHOK-JA-BC-02.txt
|  +--RITE2-TOHOK-JA-BC-03.txt
|  +--RITE2-TOHOK-JA-ExamBC-01.txt
|  +--RITE2-TOHOK-JA-ExamBC-02.txt
|  +--RITE2-TOHOK-JA-ExamBC-03.txt
After zipping all of the files as TOHOK.zip, please upload the zipped file through the EasyChair site. For more details, see submit your results or re-submit your results. In the ExamSearch subtask, if you intend to submit the search results of t1, please name the file RITE2-(TeamID)-(Lang)-(SubtaskName)-(RunNumber)-ID.txt. For example, if you send three runs together with their t1 search results, the files will be as follows.
+--TOHOK
|  +--RITE2-TOHOK-description.txt
|  +--RITE2-TOHOK-JA-ExamSearch-01.txt
|  +--RITE2-TOHOK-JA-ExamSearch-01-ID.txt
|  +--RITE2-TOHOK-JA-ExamSearch-02.txt
|  +--RITE2-TOHOK-JA-ExamSearch-02-ID.txt
|  +--RITE2-TOHOK-JA-ExamSearch-03.txt
|  +--RITE2-TOHOK-JA-ExamSearch-03-ID.txt
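Packaging a folder like the ones shown above into (TeamID).zip could be scripted as in the following minimal sketch. The function name is hypothetical, and storing each entry under a top-level TeamID folder inside the archive is an assumption that mirrors the tree above; this page does not say whether the folder itself must appear in the zip.

```python
import zipfile
from pathlib import Path


def package_results(team_dir: Path) -> Path:
    """Zip the RITE2-*.txt files in team_dir into <TeamID>.zip next to the folder."""
    team_id = team_dir.name
    description = team_dir / f"RITE2-{team_id}-description.txt"
    if not description.is_file():
        # The results are zipped together with a system description.
        raise FileNotFoundError(f"missing description file: {description.name}")

    zip_path = team_dir.parent / f"{team_id}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(team_dir.glob("RITE2-*.txt")):
            # Assumption: keep the TeamID folder as the top level of the archive.
            archive.write(path, arcname=f"{team_id}/{path.name}")
    return zip_path
```

Calling `package_results(Path("TOHOK"))` would write `TOHOK.zip` beside the `TOHOK` folder, including any `-ID.txt` files because they match the same `RITE2-*.txt` pattern.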
In the description file, please write a detailed description of the system used in the runs for each subtask you participated in. The description must follow the template. The information you provide will be used to write the RITE-2 overview paper.
More precisely, for each question, leave only the items that apply.
TOHOK-JA-BC-01
system: Extracted a set of morphemes from t1 and t2, and decide on the label based on the percentage of overlap between the sets of morphemes.
questionnaire:
1. what approach your system can be categorized into?
* statistical (e.g. machine learning)
2. fundamental approach of your system (you can choose more than one)
* overlap
...
TOHOK-JA-BC-02
system: Extracted a set of morphemes from t1 and t2, and decide on the label based on the percentage of overlap between the sets of morphemes.
questionnaire:
1. what approach your system can be categorized into?
* statistical (e.g. machine learning)
2. fundamental approach of your system (you can choose more than one)
* overlap
...
Each formal run result will be evaluated by the Evaluation Tool. Please read Submission Format carefully and check the format of your submission files using the Evaluation Tool.
During the period from January 9th (00:00 UTC-12) to January 16th (23:59 UTC-12), you can submit your results as many times as you want. The last file submitted will be registered as the official result.
This section describes the process for a new submission. If you intend to re-submit your results, please see How to re-submit your formal run results.