Reputation: 567
I am trying to do aspect-based sentiment analysis. When I collect each aspect and its opinion as a dictionary, some of the aspect/opinion pairs end up in the result many times. My code is:
import spacy

aspects_main = []
feature_main =[]
feautures_term_main =[]
txt = "great hotel jacuzzi bath!. really lovely hotel. stayed very top floor and surprised jacuzzi bath not know getting! staff friendly and helpful and included breakfast great! great location and great value money. not want leave!"
nlp=spacy.load("en_core_web_sm")
doc_main = nlp(txt)
for i, sentence in enumerate(doc_main.sents):
    aspects = []
    feature =[]
    feautures_term =[]
    sentence= str(sentence)
    doc = nlp(sentence)
    descriptive_term = ''
    target = ''
    for token in doc:
        # remember the most recent noun as the aspect
        if (token.dep_ == 'nsubj' and token.pos_ == 'NOUN') or (token.pos_ == 'NOUN'):
            target = token.text
        # remember the most recent adjective (plus its adverbs) as the opinion
        if token.pos_ == 'ADJ':
            prepend = ''
            for child in token.children:
                if child.pos_ != 'ADV':
                    continue
                prepend += child.text + ' '
            descriptive_term = prepend + token.text
        if((target=='') or (descriptive_term=='')):
            continue
        else:
            aspects.append({'aspect': target,
                            'opinion': descriptive_term})
            feautures_term.append(descriptive_term)
            feature.append(target)
    aspects_main.append(aspects)
    feautures_term_main.append(feautures_term)
    feature_main.append(feature)
print(aspects_main)
I want to remove the duplicates and keep only one copy of each pair. I tried this solution:
L=[[{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}],[]]
L=[dict(s) for s in set(frozenset(d.items()) for d in L)]
L
It gives me this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-172-b649b849dec9> in <module>()
1 L=[[{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}],[]]
2
----> 3 L=[dict(s) for s in set(frozenset(d.items()) for d in L)]
4 L
<ipython-input-172-b649b849dec9> in <genexpr>(.0)
1 L=[[{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}],[]]
2
----> 3 L=[dict(s) for s in set(frozenset(d.items()) for d in L)]
4 L
AttributeError: 'list' object has no attribute 'items'
I also tried using a loop. Here is the code:
a=[]
for i in range(len(aspects_main)):
    aa=[]
    for j in range(len(aspects_main[i])):
        aa.append(aspects_main[i][j])
    aa=set(aa)
    a.append(aa)
print(a)
But I got this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-182-b87e8b70dd59> in <module>()
4 for j in range(len(aspects_main[i])):
5 aa.append(aspects_main[i][j])
----> 6 aa=set(aa)
7 a.append(aa)
8
TypeError: unhashable type: 'dict'
How can I do this?
My current output is:
[[{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}],[{'aspect': 'location', 'opinion': 'great'}, {'aspect': 'location', 'opinion': 'great'}],[]]
and this is what I want (expected output):
[[{'aspect': 'hotel', 'opinion': 'great'}],[{'aspect': 'location', 'opinion': 'great'}]]
Upvotes: 1
Views: 137
Reputation: 21
The reason for your error is that you have a list within a list (L is a list of lists). In d.items() for d in L, each d is therefore an inner list, not a dict, so you are mistakenly calling items() on a list, which has no such method.
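To see this concretely, here is a small check using the sample L from the question. It only inspects types, so it is a quick illustration rather than part of the fix:

```python
# Sample data copied from the question: a list of lists of dicts.
L = [[{'aspect': 'hotel', 'opinion': 'great'},
      {'aspect': 'hotel', 'opinion': 'great'}], []]

# Iterating over L yields the inner lists, which have no .items() method.
outer_types = [type(d).__name__ for d in L]
outer_has_items = [hasattr(d, 'items') for d in L]

# The dicts sit one level deeper, so .items() only works there.
inner_types = [type(d).__name__ for sub in L for d in sub]

print(outer_types)      # ['list', 'list']
print(outer_has_items)  # [False, False]
print(inner_types)      # ['dict', 'dict']
```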
This may solve what you're trying to do:
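new_list = []
for sublist in L:  # avoid naming the variable "list", which shadows the built-in
    # dicts are unhashable, so deduplicate on frozensets of their items
    no_dup_l = [dict(s) for s in set(frozenset(d.items()) for d in sublist)]
    if no_dup_l:  # skip inner lists that are empty
        new_list.append(no_dup_l)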
Personally, I wouldn't try to write this as a one-liner, as it would harm readability (you already have two "for"s in your list comprehension).
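One caveat with the set-based approach: a set does not remember insertion order, so the dicts in each inner list may come back in a different order. If order matters, a sketch like the following keeps the first occurrence of each dict instead. The function name dedupe_nested and the drop_empty parameter are made up for illustration, and it assumes every dict value is hashable (true for the strings here):

```python
def dedupe_nested(nested, drop_empty=True):
    """Remove duplicate dicts inside each inner list, keeping first occurrences.

    dedupe_nested and drop_empty are illustrative names, not from any library.
    Assumes all dict values are hashable (e.g. strings).
    """
    result = []
    for sublist in nested:
        seen = set()   # hashable fingerprints of the dicts already kept
        unique = []
        for d in sublist:
            # A dict is unhashable, but a frozenset of its items is hashable.
            key = frozenset(d.items())
            if key not in seen:
                seen.add(key)
                unique.append(d)
        if unique or not drop_empty:
            result.append(unique)
    return result


L = [
    [{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}],
    [{'aspect': 'location', 'opinion': 'great'}, {'aspect': 'location', 'opinion': 'great'}],
    [],
]
print(dedupe_nested(L))
# [[{'aspect': 'hotel', 'opinion': 'great'}], [{'aspect': 'location', 'opinion': 'great'}]]
```

Passing drop_empty=False keeps empty inner lists in place, which may be useful if each position has to stay aligned with a sentence index.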
Upvotes: 1