Reputation: 81
Here's the code I'm working with. When I open the tweetsentiment.csv file, all of the tweets appear on a single line. Also, only every other line of the input ends up in this file.
streamedtweets = open('tweetdb', 'r')
outputfile = []
for row in streamedtweets:
    stline = streamedtweets.readline() #do i need this?
    processedStreamed = processTweet(stline)
    streamedsentiment = NBClassifier.classify(extract_features(getFeatureVector(processedStreamed)))
    outputfile.append((streamedsentiment, stline))
os.chdir(r'C:\Users\wildcat\Downloads\NLTK')
with open('tweetsentiment.csv', 'w', newline = '' ) as output:
    os.chdir(r'C:\Users\wildcat\Downloads\NLTK')
    a = csv.writer(output, delimiter = ',', lineterminator='\n',)
    data = [outputfile]
    a.writerow(data)
Upvotes: 0
Views: 533
Reputation: 76
The tweets end up on a single line because writerow([outputfile]) writes exactly one row: the whole list is treated as a single field of that row. Use writerows(outputfile) instead, which writes one row per (sentiment, tweet) tuple.
Example:
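import csv
outputfile = [("positive", "a tweet"), ("negative", "a tweet")]
# newline='' lets the csv module control the line endings itself
with open("tweetsentiment.csv", 'w', newline='') as output:
    writer = csv.writer(output, delimiter=',', lineterminator='\n')
    writer.writerows(outputfile)  # one row per tuple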
This should give you the following output:
positive,a tweet
negative,a tweet
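To see the difference between the two calls side by side, here is a minimal sketch. It writes to in-memory io.StringIO buffers instead of a real file (my substitution, purely so it runs anywhere), and the sample rows are made up:

```python
import csv
import io

rows = [("positive", "a tweet"), ("negative", "another tweet")]

# writerow([rows]): the whole list becomes ONE field of ONE row.
single = io.StringIO()
csv.writer(single, lineterminator="\n").writerow([rows])

# writerows(rows): each tuple becomes its own row.
multi = io.StringIO()
csv.writer(multi, lineterminator="\n").writerows(rows)

print(single.getvalue().count("\n"))  # 1 row written
print(multi.getvalue().count("\n"))   # 2 rows written
```

The first buffer holds a single quoted cell containing the printed list, which matches what the question describes.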
As for the second part of the question: no, you do not need streamedtweets.readline(). In fact, calling readline() inside a for loop over the same file is why you are skipping rows, since both of them move the file pointer forward.
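The skipping can be reproduced in a few lines. This is only a sketch: io.StringIO stands in for the tweetdb file (it iterates line by line the same way a text file does in Python 3), and the tweet text is invented:

```python
import io

# Stand-in for open('tweetdb', 'r').
f = io.StringIO("tweet 1\ntweet 2\ntweet 3\ntweet 4\n")

seen = []
for row in f:               # the loop consumes one line ...
    stline = f.readline()   # ... and readline() consumes the next one
    seen.append(stline.strip())

print(seen)  # ['tweet 2', 'tweet 4']
```

Only the lines read by readline() are kept; the ones bound to row are silently dropped.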
Upvotes: 1
Reputation: 42786
It looks like the problem is in how you read the lines. Try this loop in place of yours:
for stline in streamedtweets.readlines():
    processedStreamed = processTweet(stline)
    streamedsentiment = NBClassifier.classify(extract_features(getFeatureVector(processedStreamed)))
    outputfile.append((streamedsentiment, stline))
os.chdir(r'C:\Users\wildcat\Downloads\NLTK')
This way you are not calling readline() inside the loop; each pass of the loop simply gets the next line.
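Putting the pieces together, a rough end-to-end sketch might look like the following. The classify function is a hypothetical stand-in for the processTweet / NBClassifier pipeline from the question, and io.StringIO replaces both files so the snippet runs on its own. Stripping the trailing newline from each line is my own addition; without it, every tweet field would carry an embedded line break into the CSV.

```python
import csv
import io

def classify(tweet):
    # Hypothetical stand-in for the NBClassifier pipeline in the question.
    return "positive" if "good" in tweet else "negative"

# Stand-in for open('tweetdb', 'r').
streamedtweets = io.StringIO("good day\nbad day\ngood night\n")

outputfile = []
for stline in streamedtweets:        # the loop yields each line exactly once
    tweet = stline.rstrip("\n")      # drop the trailing newline before writing
    outputfile.append((classify(tweet), tweet))

# Stand-in for open('tweetsentiment.csv', 'w', newline='').
output = io.StringIO()
csv.writer(output, lineterminator="\n").writerows(outputfile)
print(output.getvalue())
```

Iterating over the file object directly also avoids loading the whole file into memory, which readlines() does.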
Upvotes: 0