move_slow_break_things
move_slow_break_things

Reputation: 847

Modifying a CSV file in Python

Now I know it's usually not feasible to modify a csv file as you are reading from it so you need to create a new csv file and write to it. The problem I'm having is preserving the original order of the data.

The input csv file looks like follows:

C1       C2         C3
apple    BANANA     Mango
pear     PineApple  StRaWbeRRy

I want to turn all the data into lower case and output a new csv file that looks like:

C1       C2         C3
apple    banana     mango
pear     pineapple  strawberry

So far I can iterate through the input csv file and turn all the values into lower case but I don't know how to rewrite it back into a csv file in that format. The code I have is:

def clean (input)
  aList = []
  file = open(input, "r")
  reader = csv.reader(file, delimiter = ',')
  next(reader, None) # Skip the header but I want to preserve it in the output csv file
  for row in reader:
     for col in row:
        aList.append(col.lower())

So now I have a list with all the lowercase data, how do I rewrite it back into a csv file of the same format (same number of rows and columns) as the input including the header row that I skipped in the code.

Upvotes: 4

Views: 75085

Answers (4)

Nabin
Nabin

Reputation: 11776

Pandas way:

Read the file using pandas and get the dataframe. Then you can simply use lower()

import pandas as pd

def conversion(text):
    return text.lower()
    

df = pd.read_csv(file_path)
df[column_name] = df[column_name].map(conversion)

Or even a single liner:

df[column_name] = df[column_name].apply(lambda x: x.lower()) # If you have nan or other non-string values, you may need to convert x to string first like str(x).lower()

Then you can save it using to_csv function

Upvotes: 11

rawwar
rawwar

Reputation: 4992

the easiest way that i found is as follows let the initial CSV file name be test.csv

with open('test.csv','r') as f:
    with open('cleaned.csv','w') as ff:
        ff.write(f.readline())
        ff.write(f.read().lower())

the above code will create a new csv with all lower case

Upvotes: 0

sonus21
sonus21

Reputation: 5388

If you want to use csv module for all then use following code snippet.

import os
import csv


def clean(input):
    tmpFile = "tmp.csv"
    with open(input, "r") as file, open(tmpFile, "w") as outFile:
        reader = csv.reader(file, delimiter=',')
        writer = csv.writer(outFile, delimiter=',')
        header = next(reader)
        writer.writerow(header)
        for row in reader:
            colValues = []
            for col in row:
                colValues.append(col.lower())
            writer.writerow(colValues)
    os.rename(tmpFile, input)

Upvotes: 3

Tim
Tim

Reputation: 2637

If all you want to do is change the case of the data and preserve everything else you might be best to skip the csv module and just use a straight file eg:

# Open both files
with open("infile.csv") as f_in, open("outfile.csv", 'w') as f_out:
    # Write header unchanged
    header = f_in.readline()
    f_out.write(header)

    # Transform the rest of the lines
    for line in f_in:
        f_out.write(line.lower())

Upvotes: 5

Related Questions