Reputation: 37
I wanted to create a list that is added as a new row to a dataframe.
import nltk
import pandas as pd
from nltk.corpus import wordnet
import numpy as np

Overviewdataframe = pd.DataFrame([])
synonyms = []
for syn in wordnet.synsets("active"):
    for l in syn.lemmas():
        synonyms.append(l.name())
    Overviewdataframe = Overviewdataframe.append(synonyms)
    synonyms = []
Instead, the list is added as a column. Can you help me, please?
Thank you.
Upvotes: 1
Views: 360
Reputation: 122052
import pandas as pd
from nltk.corpus import wordnet as wn

wordlist = ['active', 'fan', 'hop', 'grace']
# One record per (word, synset) pair; each record becomes one dataframe row.
words2lemmanames = [{'word': word, 'synset': ss.name(), 'lemma_names': ss.lemma_names()}
                    for word in wordlist for ss in wn.synsets(word)]
pd.DataFrame(words2lemmanames)
When you query a word through the WordNet interface in NLTK, it returns a list of "concepts", also known as "synsets":
>>> wn.synsets('active')
[Synset('active_agent.n.01'), Synset('active_voice.n.01'), Synset('active.n.03'), Synset('active.a.01'), Synset('active.s.02'), Synset('active.a.03'), Synset('active.s.04'), Synset('active.a.05'), Synset('active.a.06'), Synset('active.a.07'), Synset('active.s.08'), Synset('active.a.09'), Synset('active.a.10'), Synset('active.a.11'), Synset('active.a.12'), Synset('active.a.13'), Synset('active.a.14')]
Each synset has its own list of lemma names, e.g.
>>> wn.synsets('active')[0].lemma_names()
['active_agent', 'active']
You can also access a synset directly by its "name". The usual convention for the name is (i) the first lemma name, then a dot, (ii) the POS tag, then a dot, and (iii) the index number.
>>> wn.synsets('active')[0] == wn.synset('active_agent.n.01')
True
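To make the naming convention concrete, here is a small sketch that splits a synset name into its three parts. Note that split_synset_name is a hypothetical helper written for this answer, not part of NLTK, and it assumes the name follows the usual lemma.pos.index pattern.

```python
def split_synset_name(name):
    """Split a synset name such as 'active_agent.n.01' into its parts.

    Hypothetical helper (not an NLTK function). Assumes the usual
    '<first lemma>.<POS tag>.<index>' convention.
    """
    # Split from the right so a lemma that itself contains a dot stays intact.
    lemma, pos, index = name.rsplit(".", 2)
    return lemma, pos, int(index)


parts = split_synset_name("active_agent.n.01")
print(parts)
```

Splitting from the right is a deliberate choice here: only the last two dots are treated as separators.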
Finally, given a list of dictionaries (i.e. one dictionary of key-value pairs per row), you can feed it into pandas.DataFrame to convert it into a dataframe.
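Here is a minimal sketch of that last step that runs without WordNet installed. The first record reuses the lemma names shown above; the second is a made-up placeholder, not real WordNet output. It also shows what I believe is behind the row/column confusion in the question: a flat list becomes a single column, while a list of lists becomes one row per inner list.

```python
import pandas as pd

# Sample records. The second one is a made-up placeholder, not WordNet output.
records = [
    {"word": "active", "synset": "active_agent.n.01",
     "lemma_names": ["active_agent", "active"]},
    {"word": "demo", "synset": "demo.n.01",
     "lemma_names": ["demo", "sample", "example"]},
]

# A list of dicts gives one row per dict and one column per key.
df = pd.DataFrame(records)

# A flat list is read as a single column, with one row per element.
as_column = pd.DataFrame(["active_agent", "active"])

# A list of lists is read as one row per inner list.
# Shorter rows are padded with missing values.
rows = [record["lemma_names"] for record in records]
as_rows = pd.DataFrame(rows)

print(df)
print(as_column.shape)
print(as_rows)
```

Collecting all rows in a plain Python list and building the dataframe once at the end also avoids growing the dataframe inside the loop.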
Upvotes: 1