user20977195

Reputation: 31

How to select rows from csv based on column value and write to another csv in Python

I have a CSV file (called "infile") that looks like the sample below. If the value in the first column is "student" and the value in the second column is "good", I want to write the 3rd and 4th columns of that row to a new file called "outfile", with the header "points" and "rank".

I know how to do this in pandas, but how can I do it with the csv module in Python? I have tried reader (to read lines from the infile), writer, a for loop with if statements, and writerow(), but I could not get it to work. Thanks for your help.

role status score result
student good 90 pass
staff NA NA NA
student good 98 pass
student poor 50 fail

Upvotes: 0

Views: 767

Answers (1)

Tickloop

Reputation: 76

The general idea is what you have outlined: read each row and decide whether to write it to the output file.

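import csv


def keep_row(row):
    # Keep only rows for students whose status is "good".
    return row['role'] == 'student' and row['status'] == 'good'


if __name__ == '__main__':
    INFILE = 'infile.csv'
    OUTFILE = 'outfile.csv'
    FIELDNAMES = ['points', 'rank']

    data = []
    # DictReader assumes comma-separated input by default; pass delimiter=' '
    # if the file is actually space-separated, as the sample appears to be.
    with open(INFILE, 'r', newline='') as f:
        reader = csv.DictReader(f)
        for row in reader:
            if keep_row(row):
                data.append({'points': row['score'], 'rank': row['result']})

    with open(OUTFILE, 'w', newline='') as f:
        # Fixed field names, so the header is still written (and data[0]
        # is never indexed) when no row matches.
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(data)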

If you wish to reference docs: https://docs.python.org/3/library/csv.html
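
Since the question mentions reader, writer and writerow(), here is a minimal sketch of the same filter using plain csv.reader and csv.writer, writing each matching row as it is read instead of collecting rows in a list first. The function name filter_students and its delimiter parameter are hypothetical, introduced only for illustration. It assumes the first line of the input is the header and that the columns appear in the order role, status, score, result.

```python
import csv


def filter_students(in_path, out_path, delimiter=','):
    """Write score/result of 'good' students to out_path; return the row count.

    Hypothetical helper: assumes a header line and the column order
    role, status, score, result.
    """
    written = 0
    with open(in_path, newline='') as src, \
            open(out_path, 'w', newline='') as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst)
        next(reader, None)  # skip the input header, if there is one
        writer.writerow(['points', 'rank'])
        for row in reader:
            # Guard against blank or short lines before indexing.
            if len(row) >= 4 and row[0] == 'student' and row[1] == 'good':
                writer.writerow(row[2:4])
                written += 1
    return written
```

Call it as filter_students('infile.csv', 'outfile.csv'). If the file really is space-separated, as the sample in the question appears to be, pass delimiter=' ' as well.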

Upvotes: 0
