Daniel Hutchinson

Reputation: 165

Python & Pandas: Writing data to specific columns in csv

Using Python and pandas, I'm running a script that analyzes .txt files for word count and lexile scores. The script runs and writes to a CSV file, but the output contains unexpected values, and I'm having difficulty writing each value to its intended column.

Here is the code:

import pandas as pd
import textstat
import csv

header = ["word_count", "flech"]

with open('data.csv', 'w', encoding='UTF8') as f:
    writer = csv.writer(f)

    writer.writerow(header)
    
for text_number in range(0, 2):

    f = open(f'\TXTs\text_{text_number}.txt', 'r')

    if f.mode == 'r':
        contents = f.read()
        
    text_data = (contents)

    word_count = textstat.lexicon_count(text_data, removepunct=True)
    flech = textstat.flesch_kincaid_grade(text_data)
   
    wc = pd.DataFrame([word_count])
    fl = pd.DataFrame([flech])
    
    def wc_count():
        wc.to_csv('output.csv', mode="a", header="word_count", index=False)
        
    def fl_count():
        fl.to_csv('output.csv', mode="a", header="flech", index=False)

    wc_count()
    fl_count()

I'd like the output to look like this, with the 2 & 271 values in the "word_count" column, and the -3.1 and 13 in the "flech" column:

word_count, flech
2, -3.1
271, 13

However, the output produced looks like this:

word_count, flech
    
0   
2   
0   
-3.1    
0   
271 
0   
13  

Clearly, I've got some problems with my output. Any assistance would be greatly appreciated.

Upvotes: 0

Views: 1937

Answers (2)

mozway

Reputation: 260735

It looks like you're going to great lengths for something that is quite straightforward. Just use pandas' I/O functions to read and write your data: pandas.read_csv and pandas.DataFrame.to_csv.

It is hard to give you the exact code without the data, but try something like:

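# Inside the loop over text_number.
# Raw f-string so that "\t" in the path is not read as a tab character.
with open(rf'\TXTs\text_{text_number}.txt', 'r') as f:
    text_data = f.read()

word_count = textstat.lexicon_count(text_data, removepunct=True)
flech = textstat.flesch_kincaid_grade(text_data)

# Wrap the scalars in lists: a dict of bare scalars raises a ValueError.
df = pd.DataFrame({'word_count': [word_count], 'flech': [flech]})

df.to_csv('output.csv', index=False)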
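
To cover several files, one option is to collect one row per file and write the CSV once at the end, so both values for a file land on the same line. The following is only a sketch under some assumptions: `collect_metrics` and its `metrics` argument are hypothetical names, not part of pandas or textstat, and the scoring functions are passed in as callables so the snippet does not depend on textstat being installed.

```python
from pathlib import Path

import pandas as pd


def collect_metrics(paths, metrics):
    """Return a DataFrame with one row per file and one column per metric.

    `metrics` maps a column name to a function that takes the file's text
    and returns a value (hypothetical helper, for illustration only).
    """
    rows = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        # One dict per file becomes one row in the final table.
        rows.append({name: func(text) for name, func in metrics.items()})
    # Passing `columns` fixes the column order even if `rows` is empty.
    return pd.DataFrame(rows, columns=list(metrics))
```

With textstat available, the call from the question could plausibly be wired in like this and written out in a single step:

    metrics = {
        "word_count": lambda t: textstat.lexicon_count(t, removepunct=True),
        "flech": textstat.flesch_kincaid_grade,
    }
    collect_metrics(paths, metrics).to_csv('output.csv', index=False)

Here `paths` stands for whatever list of .txt file paths you are looping over.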

Upvotes: 1

Muhammad Rasel

Reputation: 724

Instead of creating two DataFrames, try creating a single one and writing it to the CSV.

flech = textstat.flesch_kincaid_grade(text_data)  # change after this line
output_df = pd.DataFrame({"word_count": [word_count], "flech": [flech]})
output_df.to_csv('output.csv', mode="a", index=False)
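
One thing to watch with mode="a": to_csv writes the column names on every call by default, so appending inside a loop repeats the header before each row. A possible workaround, sketched below, is to write the header only when the file is new or empty. `append_row` is a hypothetical helper name, and the sketch assumes every row uses the same column names in the same order.

```python
import os

import pandas as pd


def append_row(csv_path, row):
    """Append one row (a dict of column name -> value) to csv_path.

    The header is written only if the file does not exist yet or is empty,
    so repeated calls do not repeat the column names.
    """
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    # A list holding one dict yields a one-row DataFrame.
    pd.DataFrame([row]).to_csv(csv_path, mode="a", header=write_header, index=False)
```

Calling `append_row('output.csv', {"word_count": word_count, "flech": flech})` once per file should then produce one header line followed by one line per file. Delete or rename an old output.csv before rerunning, otherwise the new rows are added to the previous run's data.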

Upvotes: 1
