Reputation: 119
I have a list
abc = ['date1','sentence1','date2','sentence2'...]
I want to do sentiment analysis on the sentences. After that I want to store the results in a list that looks like:
xyz =[['date1','sentence1','sentiment1'],['date2','sentence2','sentiment2']...]
For this I have tried the following code:
def result(doc):
    x = 2
    i = 3
    for lijn in doc:
        sentiment = classifier.classify(word_feats_test(doc[i]))
        xyz.extend(([doc[x],doc[i],sentiment]))
        x = x + 2
        i = i + 2
The len(abc) is about 7500. I start out with x as 2 and i as 3, as I don't want to use the first two elements of the list.
I keep on getting the error 'list index out of range', no matter what I try (while, for loops...)
Can anybody help me out? Thank you!
Upvotes: 1
Views: 1242
Reputation: 2480
It's simple. You can try this:
>>> abc = ['date1','sentence1','date2','sentence2'...]
>>> xyz = [[abc[i], abc[i+1], "sentiment" + str(i//2 + 1)] for i in range(0, len(abc), 2)]
>>> xyz
output : [['date1', 'sentence1', 'sentiment1'], ['date2', 'sentence2', 'sentiment2'], .....]
Upvotes: 0
Reputation: 180401
If you want two elements from your list at a time, you can use a generator and then pass the elements to your classifier:
abc = ["ignore","ignore",'date1','sentence1','date2','sentence2']
from itertools import islice
def iter_doc(doc, skip=0):
    it = iter(doc)
    if skip:  # if skip is set, start from doc[skip:]
        it = islice(it, skip, None)
    # default of "" ends the loop cleanly when the list runs out
    date, sent = next(it, ""), next(it, "")
    while date and sent:
        yield date, sent
        date, sent = next(it, ""), next(it, "")

for d, sen in iter_doc(abc, 2):  # skip set to 2 so we ignore the first two elements
    print(d, sen)
date1 sentence1
date2 sentence2
So to create your list of lists xyz, you can use a list comprehension:
xyz = [ [d,sen,classifier.classify(word_feats_test(sen))] for d, sen in iter_doc(abc, 2)]
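If the generator feels like more machinery than you need, slicing plus zip gives the same pairing. This is only a sketch: classify_stub is a made-up stand-in for classifier.classify(word_feats_test(...)), which isn't shown in the question.

```python
def classify_stub(sentence):
    """Hypothetical placeholder for classifier.classify(word_feats_test(sentence))."""
    return "label:" + sentence

abc = ["ignore", "ignore", "date1", "sentence1", "date2", "sentence2"]

dates = abc[2::2]      # every second element starting at index 2
sentences = abc[3::2]  # every second element starting at index 3

# zip stops at the shorter sequence, so a trailing date without a sentence is dropped
xyz = [[d, s, classify_stub(s)] for d, s in zip(dates, sentences)]
print(xyz)
```

The slices copy the list, which is fine for around 7500 elements; the generator above avoids the copies if that ever matters.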
Upvotes: 0
Reputation: 3162
Try this
for i in range(0, len(doc) - 1, 2):  # step by 2; use xrange in Python 2
    date = doc[i]
    sentence = doc[i + 1]
    sentiment = classifier.classify(word_feats_test(sentence))
    xyz.append([date, sentence, sentiment])
You only need one index. The important thing is knowing when to stop.
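To make the "when to stop" part concrete, here is a small sketch of what goes wrong in the question's loop. The six-element list is made up for illustration, and I'm assuming the loop body is otherwise as posted: it runs once per element while the index jumps by 2, so the index leaves the list long before the loop ends.

```python
doc = ["skip", "skip", "date1", "sentence1", "date2", "sentence2"]

# Replay the original loop's index without touching doc[i]
i = 3
failed_at = None
for step, _ in enumerate(doc):  # runs len(doc) times, like "for lijn in doc"
    if i >= len(doc):           # doc[i] would raise IndexError here
        failed_at = step
        break
    i += 2

# Stopping at len(doc) - 1 with a step of 2 only visits complete pairs
pairs = [(doc[j], doc[j + 1]) for j in range(2, len(doc) - 1, 2)]
print(failed_at, pairs)
```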
Also, check out the difference between extend and append.
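A quick illustration of that difference, using made-up values: extend adds each item of its argument to the list, while append adds the argument itself as a single element.

```python
flat = []
flat.extend(["date1", "sentence1", "sentiment1"])
flat.extend(["date2", "sentence2", "sentiment2"])
# flat is one long list of six strings

nested = []
nested.append(["date1", "sentence1", "sentiment1"])
nested.append(["date2", "sentence2", "sentiment2"])
# nested is a list of two three-element lists, which is the shape you want for xyz

print(len(flat), len(nested))
```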
Finally, I would suggest you store your data as a list of dictionaries rather than a list of lists. That lets you access the items by field name rather than by index, which makes for cleaner code.
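Roughly what I mean by a list of dictionaries, as a sketch only. The field names and classify_stub are my own invention; swap in classifier.classify(word_feats_test(...)) from your code.

```python
def classify_stub(sentence):
    """Hypothetical stand-in for the real classifier call."""
    return "pos" if "good" in sentence else "neg"

abc = ["skip", "skip", "2015-01-01", "good day", "2015-01-02", "bad day"]

records = [
    {"date": abc[i], "sentence": abc[i + 1], "sentiment": classify_stub(abc[i + 1])}
    for i in range(2, len(abc) - 1, 2)  # start at 2 to skip the first two elements
]

# Access by name instead of remembering that index 2 is the sentiment
for rec in records:
    print(rec["date"], rec["sentiment"])
```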
Upvotes: 0
Reputation: 458
As the comments mentioned, we can't help you find the error in your code without a stack trace. But your problem is easy to solve like this:
xyz = []
def result(abc):
    for item in xrange(0, len(abc), 2): # replace xrange with range in python3
        # sentiment = classifier.classify(word_feats_test(abc[item + 1]))
        sentiment = "sentiment" + str(item // 2 + 1)  # placeholder label
        xyz.append([abc[item], abc[item + 1], sentiment])
You might want to read about the built-in functions that make a programmer's life easy. (Why worry about incrementing by hand when range already takes a step?)
#output
[['date1', 'sentence1', 'sentiment1'],
['date2', 'sentence2', 'sentiment2'],
['date3', 'sentence3', 'sentiment3'],
['date4', 'sentence4', 'sentiment4'],
['date5', 'sentence5', 'sentiment5']]
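One caveat worth sketching: with a stop of len(abc), an odd-length list would make abc[item + 1] run past the end on the last step. Stopping at len(abc) - 1 avoids that. pair_starts below is a hypothetical helper, not part of the code above, and the start argument is there because the question wants to skip the first two elements.

```python
def pair_starts(n, start=0):
    """Indices where a complete (date, sentence) pair begins in a list of length n."""
    return list(range(start, n - 1, 2))

even = pair_starts(10)                 # five full pairs
odd = pair_starts(5)                   # the dangling last element is ignored
skipping = pair_starts(7500, start=2)  # roughly the size mentioned in the question

print(even, odd, len(skipping))
```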
Upvotes: 1