
Publicly available resources

The following machine-readable resources are publicly available and may be useful for the task.

| Language | Name | Type | Author | Description | Free? |
| JA | Textual Entailment Evaluation Data | Collection of labeled entailment pairs | Kyoto University | An evaluation dataset with 2700 labeled textual entailment pairs. Each pair comes with a 4-level label {◎,◯,△,×} indicating the likelihood of entailment, and another label for one of 5 categories {implication, lexicon (noun), lexicon (verb), syntax, inference}. | Yes |
| JA | Japanese WordNet | Lexical DB | NICT | Adds Japanese equivalents to the synsets of Princeton WordNet 3.0. As of v1.0, 56,741 concepts (synsets) and 92,241 words are available. A demo is also available. | Yes |
| JA | Wikipedia hypernym-hyponym pairs from Hyponymy extraction tool | Ontology | NICT | This tool can extract about 6 million hypernym-hyponym and category-instance pairs from a Japanese Wikipedia dump, with 90% accuracy. | Yes |
| JA | 京都大学格フレーム (Kyoto Univ Case Frame) | Frame Dict | Kurohashi Lab, Kyoto University | Case frame dictionary automatically built from web text. A search UI is available. | Yes |
| JA | 単語感情極性対応表 (Semantic Orientations of Words) | Polarity Weighted Word List | Okumura Lab, Titech | List of words with a semantic orientation value ranging from -1 to +1, e.g. "great" with +1 and "painful" with -1. | Yes |
| JA | EDR電子化辞書 (The EDR Electronic Dictionary) | Lexical DB | NICT | Japanese general vocabulary with 270,000 words and 410,000 corresponding concepts, and more. | No |
| JA, CS, CT | Wikipedia | Encyclopedia | | Free encyclopedia. | Yes |
| JA | 日本語語彙大系 (GoiTaikei) | Lexical DB | NTT | Contains 300,000 Japanese words marked with part-of-speech and semantic classes. Originally developed by NTT for the ALT-J/E Japanese-to-English machine translation system. | No |
| JA | 分類語彙表 (Bunrui Goihyo) | Lexical DB | National Institute for Japanese Language and Linguistics (国立国語研究所) | | No |
| JA | 動詞含意関係データベース (Entailment Verb DB) | Lexical DB | ALAGIN | Large-scale collection of Japanese verb phrase pairs: 52,689 positive examples (entailing pairs) and 68,819 negative examples (non-entailing pairs). Available to ALAGIN members only (a member must be a resident of Japan). | Yes |
| CS | 知网 (HowNet) | Lexical DB | Dong Zhendong & Dong Qiang | A static demo is available. An agreement form must be submitted to download and use it. | Yes (conditional) |
| CS | 同义词词林 (TongYiCi CiLin) | Lexical DB | Mei Jiaju, Zhu Yiming, Gao Yunqi et al. (eds.), Shanghai Lexicographical Publishing House (上海辞书出版社), 1983 | Thesaurus of synonyms and antonyms. | ? |
| CS | 哈工大《同义词词林》共享版的若干改进 (improvements to the HIT shared edition of TongYiCi CiLin) | Lexical DB | Harbin Institute of Technology (哈工大) | Improved version of TongYiCi CiLin. | Yes |
| CT | BOW | Lexical DB | Academia Sinica | This database is built on the structure of the English WordNet and is empirically grounded in language use in Taiwan. | ? |

JA: Japanese, CS: Simplified Chinese, CT: Traditional Chinese

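Other resources to be added to the table soon: OpenMWE for Japanese, IPAL dictionary, 動詞項構造シソーラス (Verb Argument Structure Thesaurus), 基本語データベース:語義別単語親密度 (Basic Word Database: Word Familiarity by Sense), つつじ:日本語機能表現辞書 (Tsutsuji: Dictionary of Japanese Functional Expressions), and some Chinese data listed on the CNLP Platform.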