john

Reputation: 81

My data is being printed all on one line while writing into a CSV file

Here's the code I'm working with. When I open the tweetsentiment.csv file, all the tweets are written on one line. Also, only every other line of the input ends up in this file.

streamedtweets = open('tweetdb', 'r')
outputfile = []
for row in streamedtweets:
    stline = streamedtweets.readline() #do i need this?
    processedStreamed = processTweet(stline)
    streamedsentiment =  NBClassifier.classify(extract_features(getFeatureVector(processedStreamed)))
    outputfile.append((streamedsentiment, stline))
    os.chdir(r'C:\Users\wildcat\Downloads\NLTK')

with open('tweetsentiment.csv', 'w', newline = '' ) as output:
    os.chdir(r'C:\Users\wildcat\Downloads\NLTK')
    a = csv.writer(output, delimiter = ',', lineterminator='\n',)
    data = [outputfile]
    a.writerow(data)

Upvotes: 0

Views: 533

Answers (2)

eirikjak

Reputation: 76

The tweets end up on a single line because writerow([outputfile]) writes exactly one row, with the whole list as a single cell. Use writerows(outputfile) instead, which writes one row per tuple in the list.

Example:

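import csv

outputfile = [("positive", "a tweet"), ("negative", "a tweet")]
# newline='' lets the csv module control line endings itself
with open("tweetsentiment.csv", 'w', newline='') as output:
    writer = csv.writer(output, delimiter=',', lineterminator='\n')
    writer.writerows(outputfile)  # one row per (sentiment, tweet) tuple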

This should give you the following output:

positive,a tweet
negative,a tweet
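To see the difference side by side, here is a small sketch that writes the same list both ways into in-memory buffers. The helper name write_csv and the sample rows are made up for illustration; only csv.writer, writerow and writerows come from the standard library.

```python
import csv
import io

rows = [("positive", "a tweet"), ("negative", "another tweet")]

def write_csv(use_writerows):
    """Write `rows` to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
    if use_writerows:
        writer.writerows(rows)    # one CSV row per tuple
    else:
        writer.writerow([rows])   # the question's call: one row, one cell
    return buffer.getvalue()

single_row_output = write_csv(use_writerows=False)
multi_row_output = write_csv(use_writerows=True)

print(single_row_output)
print(multi_row_output)
```

The first print shows the whole list squeezed into one quoted cell on one line, while the second shows one line per tweet.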

As for the second part of the question: no, you do not need streamedtweets.readline(). In fact, calling readline() inside the for loop is why you are skipping rows. The loop already reads one line per pass, and readline() then reads the next one, so each pass consumes two lines and you only keep the second.
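The skipping is easy to reproduce without the real tweetdb file. This is a minimal sketch that uses io.StringIO as a stand-in for the open file; the four sample lines are invented for the demonstration.

```python
import io

sample = "tweet 1\ntweet 2\ntweet 3\ntweet 4\n"

# The question's pattern: the for loop reads a line, then readline() reads another.
stream = io.StringIO(sample)
kept_with_readline = []
for row in stream:
    stline = stream.readline()   # consumes the line after `row`
    kept_with_readline.append(stline)

# The fix: let the for loop be the only thing that reads.
stream = io.StringIO(sample)
kept_with_loop_only = [row for row in stream]

print(kept_with_readline)    # only tweets 2 and 4
print(kept_with_loop_only)   # all four tweets
```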

Upvotes: 1

Netwave

Reputation: 42786

The problem appears to be in how the lines are read. Try this loop in place of yours:

for stline in streamedtweets.readlines():
    processedStreamed = processTweet(stline)
    streamedsentiment =  NBClassifier.classify(extract_features(getFeatureVector(processedStreamed)))
    outputfile.append((streamedsentiment, stline))
    os.chdir(r'C:\Users\wildcat\Downloads\NLTK')

This way readline() is no longer called inside the loop, so each pass handles exactly one line and none are skipped.
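Putting both fixes together, a rough end-to-end sketch might look like the following. Note that classify_tweet is a hypothetical stand-in for the question's processTweet / getFeatureVector / NBClassifier pipeline, the sample tweets and temporary directory are invented, and stripping the trailing newline from each tweet is my own assumption (otherwise the csv module would quote every tweet because it contains a line break).

```python
import csv
import os
import tempfile

def classify_tweet(text):
    # Hypothetical stand-in for the NBClassifier pipeline in the question.
    return "positive" if "good" in text.lower() else "negative"

def label_tweets(in_path, out_path):
    """Read one tweet per line, classify it, and write one CSV row per tweet."""
    with open(in_path, "r", encoding="utf-8") as source:
        # The for loop alone advances the file; no readline() call.
        rows = [(classify_tweet(line), line.rstrip("\n")) for line in source]
    with open(out_path, "w", newline="", encoding="utf-8") as output:
        writer = csv.writer(output, delimiter=",", lineterminator="\n")
        writer.writerows(rows)   # one row per (sentiment, tweet) tuple
    return rows

# Small demonstration in a temporary directory (paths are illustrative).
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "tweetdb")
out_path = os.path.join(workdir, "tweetsentiment.csv")
with open(in_path, "w", encoding="utf-8") as f:
    f.write("good morning\nterrible traffic\nso good\n")

label_tweets(in_path, out_path)
with open(out_path, encoding="utf-8") as f:
    written = f.read()
print(written)
```

Building full paths with os.path.join also avoids the repeated os.chdir calls, which do not need to run on every pass of the loop.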

Upvotes: 0
