Reputation: 31
wn.synsets.definition(lang="lang")
shows results for English and Japanese, but not for other languages.
wn.synset('word').lemma_names
shows the other languages too, though.
Do I need an extra download, or is there a difference between the languages?
The documentation says that it does a lazy download, so I tried a few times, but the result didn't change.
Upvotes: 3
Views: 402
Reputation: 2086
I played around a bit, and the first thing I found is that definitions are available for more languages than just English and Japanese. The table below shows the definitions of a few words, including your example word, for all the languages available from wn.langs() after downloading the NLTK omw-1.4 data. 'dog' has definitions in 7 languages, 'house' in 9, and 'person' in 11.
Regarding the missing definitions for certain languages, I think the data just isn't present in the corresponding wordnets. The NLTK wordnet documentation states:
This module also allows you to find lemmas in languages other than English from the Open Multilingual Wordnet (https://omwn.org/)
If you go to https://omwn.org/ and follow the links for the respective wordnets, you'll find for example this page where you can search for words in a few languages. Searching 'casa' in Spanish, you'll find the definition reverts to the English definition for 'house', but for Italian there is a definition in Italian - which is consistent with the table below.
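If you want the same "revert to English" behaviour in your own code, here is a minimal sketch. It is my own illustration, not something NLTK provides: `first_definition`, `definition_with_fallback` and `FakeSynset` are hypothetical names, and the only thing assumed about a synset is that it has a `definition(lang=...)` method that may return a string, a list of strings, or None.

```python
from typing import Iterable, Optional, Tuple


def first_definition(raw) -> Optional[str]:
    """Normalize a raw definition value (string, list of strings, or None)."""
    if isinstance(raw, list):
        return raw[0] if raw else None
    return raw or None


def definition_with_fallback(
    synset, langs: Iterable[str] = ("eng",)
) -> Tuple[Optional[str], Optional[str]]:
    """Return (lang, definition) for the first language that has a definition."""
    for lang in langs:
        text = first_definition(synset.definition(lang=lang))
        if text is not None:
            return lang, text
    return None, None


class FakeSynset:
    """Stand-in for an NLTK synset so the sketch runs without any downloads."""

    def __init__(self, definitions):
        self._definitions = definitions

    def definition(self, lang="eng"):
        return self._definitions.get(lang)


# Values taken from the 'house' column of the table; Spanish has no entry.
house = FakeSynset({
    "eng": "a dwelling that serves as living quarters for one or more families",
    "ita": ["edificio destinato ad abitazione"],
})

print(definition_with_fallback(house, ("spa", "eng")))
print(definition_with_fallback(house, ("ita", "eng")))
```

With real data you would pass something like `wn.synsets('house')[0]` instead of the `FakeSynset` object, after downloading 'wordnet' and 'omw-1.4'.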
Hope this helps!
lang | dog | house | person |
---|---|---|---|
eng | a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds | a dwelling that serves as living quarters for one or more families | a human being |
als | | Ndërtesë për të banuar (zakonisht për një familje a për familje të një gjaku); banesë; apartament ku banon një familje. | të qënurit njeri |
arb | | | |
bul | Вид домашно животно от семейство хищни бозайници, с различна големина, цвят на козината и различни породи, което лае и често се използва като пазач на дома и имота, за лов, може да бъде дресирано и обучавано за различни служебни цели. | Сграда,помещение за постоянно живеене на отделно семейство или човек. | Отделен човек, който със своите неповторими качества се отличава, различава от другите хора. |
cmn | | | |
dan | | | |
ell | σκύλος του γένους Canis familiaris που συνήθως προέρχεται από τον κοινό λύκο και έχει εξημερωθεί από τους προϊστορικούς χρόνους | το τμήμα οικήματος (λ .χ. το διαμέρισμα πολυκατοικίας) στο οποίο διαμένει κανείς | το έμβιο ον, κάθε άτομο, άνθρωπος ανεξαρτήτως φύλου και ηλικίας |
fin | | | |
fra | | | |
heb | | מבנה המשמש כמקום מגורים למשפחה אחת או יותר | מישהו דופק בדלת |
hrv | | | |
isl | | | |
ita | mammifero domestico dei canidi, molto comune, diffuso in tutto il mondo, con attitudini varie a seconda della razza | edificio destinato ad abitazione | entità umana considerata in quanto tale, senza caratterizzazioni di sesso, età, provenienza, ecc. |
ita_iwn | animale domestico molto comune, diffuso in tutto il mondo, usato per la caccia, la difesa, nella pastorizia, e come animale da compagnia | | essere distinto da ogni altro della medesima specie |
jpn | 有史以前から人間に家畜化されて来た(おそらく普通のオオカミを先祖とする)イヌ属の動物 | 1家族以上のための居住棟として機能する住居 | 一人の人間 |
cat | | | |
eus | | | |
glg | | | |
spa | | | |
ind | | | seseorang yang dipandang tinggi |
zsm | | | |
nld | | | |
nno | | | |
nob | | | |
pol | | | |
por | | | |
ron | Animal mamifer carnivor domesticit, folosit pentru pază, vânătoare etc.. | construcție destinată pentru a servi de locuință uneia sau mai multor familii | Individ al speciei umane, om considerat prin totalitatea însușirilor sale fizice și psihice |
lit | | | |
slk | | | |
slv | | | |
swe | | | |
tha | | | |
total | 7 | 9 | 11 |
Code used to generate the above table (in Google Colab):
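```python
import nltk
import pandas as pd
from nltk.corpus import wordnet as wn

# Fetch the English WordNet and the Open Multilingual Wordnet data
nltk.download('wordnet')
nltk.download('omw-1.4')

defs = pd.DataFrame()
for lang in wn.langs():
    for word in ['dog', 'house', 'person']:
        # Definition of the first synset in this language; the result may be
        # a list, or None when the wordnet has no definition for the synset
        def_ = wn.synsets(word)[0].definition(lang=lang)
        defs.at[lang, word] = def_[0] if isinstance(def_, list) else def_
        # Keep the column as object dtype so it can hold strings and None
        defs[word] = defs[word].astype('object')

# Append a 'total' row with the number of non-None entries per column
for word in defs.columns:
    defs_present = len([def_ for def_ in defs[word].to_list() if def_ is not None])
    defs.at['total', word] = defs_present

defs
```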
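If you only need to know which languages have a definition and which don't, you can do without pandas. This is a minimal sketch under the assumption that you have already collected the raw return values of `definition(lang=...)` into a plain dict keyed by language code; `split_by_availability` is a hypothetical helper name, and the sample dict is a hand-copied subset of the 'house' column, not a full run.

```python
def split_by_availability(defs_by_lang):
    """Split language codes into those with and without a definition.

    A value counts as missing when it is None or empty (empty string or list).
    """
    present = [lang for lang, value in defs_by_lang.items() if value]
    missing = [lang for lang, value in defs_by_lang.items() if not value]
    return present, missing


# Hand-copied subset of the 'house' column from the table above.
house_defs = {
    "eng": "a dwelling that serves as living quarters for one or more families",
    "ita": ["edificio destinato ad abitazione"],
    "spa": None,
    "fra": None,
}

present, missing = split_by_availability(house_defs)
print(len(present), present)  # 2 ['eng', 'ita']
print(len(missing), missing)  # 2 ['spa', 'fra']
```

Keeping the counting separate from the table means a summary row never ends up being counted as if it were a language.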
Upvotes: 2