
Formal Run

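This page describes how to submit your RITE-2 Formal Run results; please read it carefully. (Note: if the content of this page differs from the versions in other languages, the English version takes precedence.)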

The formal run takes place from 2013/1/9 (00:00 UTC-12) to 2013/1/16 (23:59 UTC-12).

The submission page closes at: 2013/1/16 (23:59 UTC-12)
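Because the window is given in UTC-12, it may help to convert it to UTC or local time before planning a submission. The following is a minimal sketch using only the Python standard library; the names `RUN_START`, `RUN_END`, `to_utc`, and `is_within_run` are illustrative and not part of any RITE-2 tool.

```python
from datetime import datetime, timedelta, timezone

# UTC-12 as a fixed-offset time zone.
UTC_MINUS_12 = timezone(timedelta(hours=-12))

# Formal run window as stated on this page.
RUN_START = datetime(2013, 1, 9, 0, 0, tzinfo=UTC_MINUS_12)
RUN_END = datetime(2013, 1, 16, 23, 59, tzinfo=UTC_MINUS_12)


def to_utc(moment):
    """Convert a timezone-aware datetime to UTC."""
    return moment.astimezone(timezone.utc)


def is_within_run(moment):
    """Return True if a timezone-aware datetime falls inside the window."""
    return RUN_START <= moment <= RUN_END


print(to_utc(RUN_START).isoformat())  # 2013-01-09T12:00:00+00:00
print(to_utc(RUN_END).isoformat())    # 2013-01-17T11:59:00+00:00
```

In other words, the page closes at 11:59 UTC on January 17th, which is later than the calendar date suggests for most time zones.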

Dataset list

The RITE-2 formal run datasets are listed below. When you submit results for a formal run dataset, remember to check the corresponding Topic on the submission form.

Japanese datasets (JA)

  • BC
  • MC
  • ExamBC
  • ExamSearch
  • UnitTest

Simplified Chinese datasets (CS)

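Dataset | Pairs | ID range | Notes
BC | 781 | 1~781 |
MC | 781 | 1~781 |
RITE4QA | 2511 | 1~7724, non-contiguous | Process this dataset with the same system configuration as BC.
ExtraBC | 1387 | 1~1894, non-contiguous | Teams participating in the BC subtask should additionally submit results for this dataset, using the same system configuration as BC.
ExtraMC | 1387 | 1~1894, non-contiguous | Teams participating in the MC subtask should additionally submit results for this dataset, using the same system configuration as MC.
OptionalRITE4QA | 5256 | 21~7767, non-contiguous | The complete RITE4QA dataset is very large this time. Since some teams may not be able to process all of it within the scheduled time, we split off this additional dataset. Participating teams may decide for themselves whether to submit results for it. Process this dataset with the same system configuration as BC.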

Traditional Chinese datasets (CT)

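Dataset | Pairs | ID range | Notes
BC | 881 | 1~881 |
MC | 881 | 1~881 |
RITE4QA | 2511 | 1~7724, non-contiguous | Process this dataset with the same system configuration as BC.
ExtraBC | 1894 | 1~1894 | Teams participating in the BC subtask should additionally submit results for this dataset, using the same system configuration as BC.
ExtraMC | 1894 | 1~1894 | Teams participating in the MC subtask should additionally submit results for this dataset, using the same system configuration as MC.
OptionalRITE4QA | 5256 | 21~7767, non-contiguous | The complete RITE4QA dataset is very large this time. Since some teams may not be able to process all of it within the scheduled time, we split off this additional dataset. Participating teams may decide for themselves whether to submit results for it. Process this dataset with the same system configuration as BC.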

Number of submissions

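You may submit at most three results for each dataset. Make sure that corresponding datasets use the same configurations. For example, suppose a team tried three configurations (config1, config2, config3) on the BC dataset and produced three results (RITE2-TeamX-CT-BC-01.txt, RITE2-TeamX-CT-BC-02.txt, RITE2-TeamX-CT-BC-03.txt). These three configurations must then also be applied to ExtraBC, RITE4QA, and OptionalRITE4QA, the three BC-style datasets.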

The final submission would then look like this:

Dataset | Result file | System configuration
BC | RITE2-TeamX-CT-BC-01.txt | config1
BC | RITE2-TeamX-CT-BC-02.txt | config2
BC | RITE2-TeamX-CT-BC-03.txt | config3
ExtraBC | RITE2-TeamX-CT-ExtraBC-01.txt | config1
ExtraBC | RITE2-TeamX-CT-ExtraBC-02.txt | config2
ExtraBC | RITE2-TeamX-CT-ExtraBC-03.txt | config3
RITE4QA | RITE2-TeamX-CT-RITE4QA-01.txt | config1
RITE4QA | RITE2-TeamX-CT-RITE4QA-02.txt | config2
RITE4QA | RITE2-TeamX-CT-RITE4QA-03.txt | config3
OptionalRITE4QA | RITE2-TeamX-CT-OptionalRITE4QA-01.txt | config1
OptionalRITE4QA | RITE2-TeamX-CT-OptionalRITE4QA-02.txt | config2
OptionalRITE4QA | RITE2-TeamX-CT-OptionalRITE4QA-03.txt | config3
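One way to catch a mismatch before uploading is to compare the run numbers present for each BC-style dataset against those present for BC. The sketch below is not an official checker: the regular expression is inferred from the filename examples on this page, the helper names are hypothetical, and it assumes that run number NN always corresponds to configuration NN.

```python
import re
from collections import defaultdict

# BC-style datasets that must share BC's configurations.
BC_STYLE = ("ExtraBC", "RITE4QA", "OptionalRITE4QA")

# Pattern inferred from the example filenames (an assumption).
NAME = re.compile(
    r"RITE2-(?P<team>[^-]+)-(?P<lang>JA|CS|CT)-(?P<dataset>[^-]+)-(?P<run>0[1-3])\.txt"
)


def runs_by_dataset(filenames):
    """Group run numbers by dataset name; unrecognised names are skipped."""
    runs = defaultdict(set)
    for name in filenames:
        match = NAME.fullmatch(name)
        if match:
            runs[match["dataset"]].add(match["run"])
    return runs


def mismatched_datasets(filenames):
    """List submitted BC-style datasets whose run numbers differ from BC's."""
    runs = runs_by_dataset(filenames)
    reference = runs.get("BC", set())
    return sorted(d for d in BC_STYLE if d in runs and runs[d] != reference)


files = [
    "RITE2-TeamX-CT-BC-01.txt",
    "RITE2-TeamX-CT-BC-02.txt",
    "RITE2-TeamX-CT-ExtraBC-01.txt",
]
print(mismatched_datasets(files))  # ['ExtraBC']
```

A dataset that was not submitted at all is not reported, since OptionalRITE4QA may be left out.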

Submission data format

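For the submission data format, see Submission Format. Before submitting, be sure to check the format of your files with the validation tool. Any result file that fails the validation tool's check will be excluded from the evaluation presented at the final meeting.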

Notes

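Once you have received the formal run datasets, please stop all system development. (Fixes to system input/output and bug fixes are exempt.) In particular, do not manually modify any system resource based on the formal run datasets. Some relevant cases are listed below: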

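  • Not permitted
    • Manually building any inference rules or dictionaries based on sentences in the formal run datasets.
    • Using any crowdsourcing platform, such as Mechanical Turk or Yahoo Answer, to develop inference rules or other data.
  • Permitted
    • Using a program to automatically preprocess the formal run sentences.
    • Using a program to automatically collect and expand useful related data from the web, based on the wording of the formal run data.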

If you have any questions about these restrictions, please write to the RITE-2 mailing list (ntcir10-rite2 at googlegroups.com).

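When producing the result for a given sentence pair, you must not consult other datasets or other sentence pairs in the same dataset. For example, when producing results for the BC dataset, no code should read the contents of the ExtraBC dataset.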


-- The information below has not been translated; it is the original English page content --


How to upload the results for the formal run

You can upload the results for the formal run from the following site (EasyChair). The system description paper will also be submitted from this site. All of the results for the formal run tasks you participated in should be uploaded together.

formal run submission page:

https://www.easychair.org/conferences/?conf=ntcir10rite2

In RITE-2, we decided to use EasyChair for submitting both formal run results and system description papers. When submitting formal run results, please specify the zipped result file in the “PDF” field. Before submitting, please carefully read How to submit your formal run results and How to re-submit your formal run results. If you have any questions, please email the RITE-2 mailing list (ntcir10-rite2 at googlegroups.com).

Making formal run data for submission

You can submit up to three runs for each subtask. Please zip all of the results together with a system description. The filename of your formal run data should be RITE2-(TeamID)-(Lang)-(SubtaskName)-(RunNumber).txt.

  • TeamID: your team ID (e.g. TOHOK)
  • Lang: the language of the subtask (JA, CS or CT)
  • SubtaskName: the subtask name (e.g. BC)
  • RunNumber: 01, 02 or 03
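The naming rule above can be captured in a small helper. This is only an illustrative sketch: `result_filename` is a hypothetical function, not part of any RITE-2 tool, and the validation it performs covers only the constraints stated in the list above.

```python
LANGS = ("JA", "CS", "CT")


def result_filename(team_id, lang, subtask, run_number):
    """Build RITE2-(TeamID)-(Lang)-(SubtaskName)-(RunNumber).txt."""
    if lang not in LANGS:
        raise ValueError(f"Lang must be one of {LANGS}, got {lang!r}")
    if run_number not in (1, 2, 3):
        raise ValueError("RunNumber must be 1, 2 or 3")
    # Run numbers are zero-padded to two digits: 01, 02, 03.
    return f"RITE2-{team_id}-{lang}-{subtask}-{run_number:02d}.txt"


print(result_filename("TOHOK", "JA", "BC", 1))  # RITE2-TOHOK-JA-BC-01.txt
```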

Let us consider a case where TeamID is TOHOK and participating tasks are JA-BC and JA-ExamBC. Then the files you should prepare are as follows.

+--TOHOK
|  +--RITE2-TOHOK-description.txt
|  +--RITE2-TOHOK-JA-BC-01.txt
|  +--RITE2-TOHOK-JA-BC-02.txt
|  +--RITE2-TOHOK-JA-BC-03.txt
|  +--RITE2-TOHOK-JA-ExamBC-01.txt
|  +--RITE2-TOHOK-JA-ExamBC-02.txt
|  +--RITE2-TOHOK-JA-ExamBC-03.txt

After zipping all of the files as TOHOK.zip, please upload the zipped file through the EasyChair site. For more details, see How to submit your formal run results or How to re-submit your formal run results. In the ExamSearch subtask, if you intend to submit the search results for t1, please name the file RITE2-(TeamID)-(Lang)-(SubtaskName)-(RunNumber)-ID.txt. For example, if you submit three runs together with their t1 search results, the files will be as follows.

+--TOHOK
|  +--RITE2-TOHOK-description.txt
|  +--RITE2-TOHOK-JA-ExamSearch-01.txt
|  +--RITE2-TOHOK-JA-ExamSearch-01-ID.txt
|  +--RITE2-TOHOK-JA-ExamSearch-02.txt
|  +--RITE2-TOHOK-JA-ExamSearch-02-ID.txt
|  +--RITE2-TOHOK-JA-ExamSearch-03.txt
|  +--RITE2-TOHOK-JA-ExamSearch-03-ID.txt
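The packaging step (zipping the team folder as TOHOK.zip) could be scripted roughly as follows with Python's standard `zipfile` module. The function name `zip_team_folder` is hypothetical, and keeping the team folder as the top-level entry inside the archive is an assumption based on the directory listings on this page; the page does not state the required internal layout.

```python
import zipfile
from pathlib import Path


def zip_team_folder(team_dir):
    """Zip every file in team_dir into <TeamID>.zip next to the folder."""
    team_dir = Path(team_dir)
    archive = team_dir.parent / (team_dir.name + ".zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(team_dir.iterdir()):
            if path.is_file():
                # Assumption: keep the team folder as the top-level entry.
                zf.write(path, arcname=f"{team_dir.name}/{path.name}")
    return archive


# Example (assumes a folder named TOHOK exists in the working directory):
# zip_team_folder("TOHOK")
```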

In the description file, please write a detailed description of the system used in the runs for each subtask you participated in. The description must follow the template. The information you provide will be used to write the RITE-2 overview paper.

More precisely, for each question, keep only the applicable items and delete the rest.

TOHOK-JA-BC-01
system: Extracted a set of morphemes from t1 and t2, and decide on the label based on the percentage of overlap between the sets of morphemes.
questionnaire:
1. what approach your system can be categorized into?
* statistical (e.g. machine learning)

2. fundamental approach of your system (you can choose more than one) 
* overlap
...

TOHOK-JA-BC-02
system: Extracted a set of morphemes from t1 and t2, and decide on the label based on the percentage of overlap between the sets of morphemes.
questionnaire:
1. what approach your system can be categorized into?
* statistical (e.g. machine learning)

2. fundamental approach of your system (you can choose more than one) 
* overlap
...
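To make the example system description above concrete, here is a toy sketch of an overlap-based labeler. It is not the TOHOK system: whitespace tokens stand in for morphemes, the 0.7 threshold is arbitrary, and the "Y"/"N" label strings are assumptions (check Submission Format for the labels actually required). Each pair is labeled from its own t1 and t2 only, without looking at any other pair.

```python
def overlap_ratio(t1_tokens, t2_tokens):
    """Fraction of distinct t2 tokens that also occur in t1."""
    s1, s2 = set(t1_tokens), set(t2_tokens)
    if not s2:
        return 0.0
    return len(s1 & s2) / len(s2)


def label_pair(t1, t2, threshold=0.7):
    """Label one pair using only that pair's own text."""
    # Whitespace splitting is a stand-in for morphological analysis.
    ratio = overlap_ratio(t1.split(), t2.split())
    return "Y" if ratio >= threshold else "N"


print(label_pair("the cat sat on the mat", "the cat sat"))  # Y
print(label_pair("a b c", "x y"))                           # N
```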

Each formal run result will be evaluated with the Evaluation Tool. Please carefully read Submission Format and use the Evaluation Tool to check the format of your files before submission.

Between January 9th (00:00 UTC-12) and January 16th (23:59 UTC-12), you can submit results as many times as you want. The last file submitted will be registered as the official result.

How to submit your formal run results

This section describes the process for a new submission. If you intend to re-submit your results, please see How to re-submit your formal run results.

  1. Login to EasyChair
  2. Move to the result submission form
    • Click New Submission at the upper left to open the submission form.
  3. Input each item
    1. Please enter the authors' names, email addresses, countries and affiliations. Since this information can be modified at paper submission, entering only the representative author is acceptable. Please mark at least one author as “Corresponding author”.
    2. Please input TeamID in “Title” and a brief description of your system in “Abstract”.
    3. Please specify three keywords (this information is not used anywhere).
    4. Please select all of the subtasks you intend to participate in.
    5. An error message is displayed if the form is incomplete.
  4. Choosing the submission file
    • Please specify the zipped result file in the form (the field is labeled Upload Paper, but please disregard that). You do not need to upload your paper.
  5. Submission
    • After you have entered all of the items, click the “Submit” button.
  6. A result page is displayed if the submission succeeded.

How to re-submit your formal run results

  1. Login to EasyChair
  2. Re-submission and modification of the information
    • Click “Paper x” to open the page showing the information for your submitted results. There you can change the participating subtasks or re-submit your formal run results.
  3. Re-submitting results
    • Click “Submit a new version” to open the re-submission page, from which you can submit a new (zipped) result file. The last file submitted will be registered as the official result.