Reputation: 33
First, let me specify a few things:
cleaned_example = [['I', 'horrible', 'physics', 'pretty', 'good', 'chemistry', 'excellent', 'math'], ['First', 'concerned', 'worried', 'scared'], ['What', 'started', 'bar', 'fight', 'turned', 'confrontation', 'finally', 'gang', 'war'], ['Every', 'hill', 'mountain', 'shall', 'made', 'low', 'rough', 'places', 'made', 'plain,', 'crooked', 'places', 'made', 'straight'], ['This', 'blessed', 'plot', 'earth', 'realm', 'England']]
That is only a sample; in reality the list is composed of 20 groups of words.
I am using the sentiment function which is part of pattern.en and I want to see if the sentiment values of the words in cleaned_example[0], cleaned_example[1].... are increasing or decreasing. The sentiment function outputs two values of the form (a,b) but I am only interested in the first of these values.
Here is what I have done so far. I am encountering two problems. First, I am getting 40 outputs when I should only be getting 20. Second, all of these outputs are 'no' so it is pretty useless.
for index in range(len(cleaned_example)):
    for position in range(len(cleaned_example[index])-1):
        if sentiment(cleaned_example[index][position][0]) < sentiment(cleaned_example[index][position+1][1]):
            print('yes')
        else:
            print('no')
Thanks in advance!
Upvotes: 2
Views: 2761
Reputation: 69222
Since sentiment returns a pair (a,b) and you only care about a, you always want [0] when choosing between these; and I suspect that you have your parentheses in the wrong place as well. That is, your comparison line should read:
if sentiment(cleaned_example[index][position])[0] < sentiment(cleaned_example[index][position+1])[0]:
so the final index listed is 0 and not 1, and you use [0] on the value returned from sentiment (i.e., the [0] is to the right of the function call).
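To see why the position of [0] matters, here is a small sketch. It does not use pattern.en: fake_sentiment and its scores are made up purely for illustration, standing in for a function that returns an (a, b) pair.

```python
# Hypothetical stand-in for pattern.en's sentiment(); the scores are invented.
def fake_sentiment(text):
    scores = {'horrible': -1.0, 'good': 0.7}
    return (scores.get(text, 0.0), 0.5)  # an (a, b) pair

word = 'horrible'

# [0] applied to the word first: only the first letter, 'h', gets scored,
# and the result is still the whole (a, b) pair.
wrong = fake_sentiment(word[0])

# [0] applied to the returned pair: the first value for the whole word.
right = fake_sentiment(word)[0]

print(wrong)
print(right)
```

With the index inside the call you end up comparing pairs computed from single letters. If the real function scores single letters as neutral, which I'd guess it does, every comparison would be between equal values, and that would explain why you only ever see 'no'.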
This would read more easily if you didn't index cleaned_example every time:
for word_list in cleaned_example:
    for position in range(len(word_list)-1):
        if sentiment(word_list[position])[0] < sentiment(word_list[position+1])[0]:
            print('yes')
        else:
            print('no')
Finally, here you're calling sentiment twice for almost every word. If that's a problem, then you should restructure your code a bit. With this in mind, it would probably be better to start with something like:
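for word_list in cleaned_example:
    # call sentiment once per word and keep only the first value
    sentiments = [sentiment(word)[0] for word in word_list]
    # stop one short of the end so that i+1 stays in range
    for i in range(len(sentiments)-1):
        if sentiments[i] < sentiments[i+1]:
            print('yes')
        else:
            print('no')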
Upvotes: 1
Reputation: 28626
Going a step further than tom in avoiding indexing and using even more meaningful variable names (I googled that sentiment thing):
for sentence in cleaned_example:
    for word, next_word in zip(sentence, sentence[1:]):
        if sentiment(word)[0] < sentiment(next_word)[0]:
            print('yes')
        else:
            print('no')
If, for example, sentence is ['a', 'b', 'c', 'd'], then sentence[1:] is ['b', 'c', 'd'], and zipping them gives you the word pairs ('a', 'b'), ('b', 'c') and ('c', 'd').
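You can check the pairing on its own, without sentiment involved at all:

```python
sentence = ['a', 'b', 'c', 'd']

# zip stops at the shorter input, so n words give n-1 consecutive pairs
pairs = list(zip(sentence, sentence[1:]))

print(pairs)       # [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(len(pairs))  # 3
```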
I don't know why you think you should get only 20 outputs (I get 36, btw, not 40: each sentence of n words produces n-1 comparisons). I suspect you are working at the wrong level and should use sentiment on sentences, not words? Note how I named my variables: good names can really help you understand your code. Triple indexes, not so much.
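If one value per group is what you are after, a rough sketch could look like this. It assumes the function accepts a whole string, so each group is joined back into one. To keep the sketch runnable on its own I use a made-up fake_sentiment (invented scores, averaged over the known words) instead of the real sentiment, so the numbers mean nothing.

```python
# Hypothetical stand-in for pattern.en's sentiment(); scores are invented.
def fake_sentiment(text):
    scores = {'horrible': -1.0, 'good': 0.7, 'excellent': 1.0,
              'worried': -0.5, 'scared': -0.6}
    hits = [scores[w] for w in text.split() if w in scores]
    polarity = sum(hits) / len(hits) if hits else 0.0
    return (polarity, 0.5)  # an (a, b) pair

cleaned_example = [
    ['I', 'horrible', 'physics', 'pretty', 'good', 'chemistry', 'excellent', 'math'],
    ['First', 'concerned', 'worried', 'scared'],
]

# One score per group: join the words and keep only the first value.
group_scores = [fake_sentiment(' '.join(sentence))[0]
                for sentence in cleaned_example]

for score in group_scores:
    print(score)
```

That prints exactly one line per group, so 20 groups would give 20 outputs.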
Upvotes: 3
Reputation: 1008
If you are expecting to get only 20 outputs, is it correct to assume you want a single "yes" or "no" for each group of words? I am not certain exactly how you want to measure that. One option is to compare the sentiment of the first word and the last word of each list in cleaned_example.
for words in cleaned_example:
    if sentiment(words[0])[0] < sentiment(words[-1])[0]:
        print("Increasing")
    else:
        print("Decreasing")
Another approach is to count the number of yes's and no's for each set of words.
for words in cleaned_example:
    yes = 0
    no = 0
    for word1, word2 in zip(words, words[1:]):
        if sentiment(word1)[0] < sentiment(word2)[0]:
            yes = yes + 1
        else:
            no = no + 1
    # one verdict per group, printed after all pairs have been counted
    if yes > no:
        print("yes")
    elif yes < no:
        print("no")
    else:
        print("`yes` = `no`")
Another approach is to obtain the average sentiment of a set and compare it with the sentiment of the first word.
import numpy as np

for words in cleaned_example:
    a_values = []
    for word in words:
        a_values.append(sentiment(word)[0])
    # the first word scores below the group's average
    if sentiment(words[0])[0] < np.mean(a_values):
        print("Increasing")
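A stricter variant, if that is closer to what you mean, is to call a group increasing only when every consecutive pair goes up. This is just a sketch: fake_sentiment and its scores are invented so that the code runs without pattern.en, and trend is a helper name I made up.

```python
# Hypothetical stand-in for pattern.en's sentiment(); scores are invented.
def fake_sentiment(word):
    scores = {'concerned': -0.1, 'worried': -0.5, 'scared': -0.6,
              'good': 0.7, 'excellent': 1.0}
    return (scores.get(word, 0.0), 0.5)  # an (a, b) pair

def trend(words):
    """Classify a group by comparing every consecutive pair of a-values."""
    a_values = [fake_sentiment(word)[0] for word in words]
    pairs = list(zip(a_values, a_values[1:]))
    if all(x < y for x, y in pairs):
        return 'Increasing'
    if all(x > y for x, y in pairs):
        return 'Decreasing'
    return 'Mixed'

print(trend(['concerned', 'worried', 'scared']))
print(trend(['good', 'excellent']))
print(trend(['good', 'scared', 'excellent']))
```

Note that a group with fewer than two words has no pairs to compare, so this sketch reports it as 'Increasing'; handle that case separately if it can occur in your data.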
Upvotes: 0