geodranic

Reputation: 145

Pandas - Overwrite single column with new values, retain additional columns; overwrite original files

I'm fairly new to Python. I have CSV files with two columns, and I need the code to perform a simple calculation on the first column while retaining the information in the second. The code currently performs the calculation, albeit only on the first CSV in the list, and there are numerous files. I haven't figured out how to overwrite the values in each file while leaving the second column unchanged. I'd like it to save over the original files with the new calculations. Additionally, the originals have no header row, and pandas automatically assigns numeric labels.

import os
import pandas as pd

def find_csv(topdir, suffix='.csv'):
    filenames = os.listdir(topdir)
    csv_list = [name for name in filenames if name.endswith(suffix)]
    fp_list = []
    for csv in csv_list:
        fp = os.path.join(topdir, csv)
        fp_list.append(fp)
    return fp_list

def wn_to_um(wn):
    um = 10000/wn
    return um

for f in find_csv('C:/desktop/test'):
    readit = pd.read_csv(f, usecols=[0])
    convert = wn_to_um(readit)
    df = pd.DataFrame(convert)
    df.to_csv('C:/desktop/test/whatever.csv')

Upvotes: 1

Views: 1836

Answers (3)

Joe

Reputation: 889

Just update your second function as:

def wn_to_um(wn):
    wn.iloc[:,0] = 10000/wn.iloc[:,0]
    return wn
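As a quick check of the function above, here is a minimal sketch that applies it to a small in-memory frame. The sample values are hypothetical and are floats on purpose: assigning float results into an integer column through `iloc` may trigger a dtype warning on newer pandas versions.

```python
import pandas as pd

def wn_to_um(wn):
    # Overwrite the first column with 10000 / value; other columns are untouched.
    wn.iloc[:, 0] = 10000 / wn.iloc[:, 0]
    return wn

# Hypothetical sample data: two unnamed columns, as if read with header=None.
sample = pd.DataFrame([[1000.0, "a"], [2000.0, "b"], [4000.0, "c"]])
result = wn_to_um(sample)
print(result)
```

Note that the function modifies the frame it receives, so `sample` and `result` refer to the same updated data.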

Upvotes: 1

angrymantis

Reputation: 382

Say you have a column named 'X' whose values you want to divide by 10,000. You can store the column as X and then divide each element like so (for the conversion in the question, which divides 10,000 by each value, use 10000 / x instead):

X = df['X']
new_x = [x / 10000 for x in X]

From here, writing the result back to the column in the dataframe is very simple:

df['X'] = new_x
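The same column update can also be written without a loop, since arithmetic on a pandas Series applies element-wise. A minimal sketch, assuming a hypothetical frame with a numeric column 'X' and a second column 'Y', and using the question's conversion (10,000 divided by each value):

```python
import pandas as pd

# Hypothetical frame; the column names and values are for illustration only.
df = pd.DataFrame({"X": [1000.0, 2000.0, 4000.0], "Y": ["a", "b", "c"]})

# Element-wise division replaces column 'X'; column 'Y' is left unchanged.
df["X"] = 10000 / df["X"]
print(df)
```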

Upvotes: 2

Ashwini

Reputation: 393

I suppose you just have to make minor changes to your code.

def wn_to_um(wn):
    wn.iloc[:, 0] = 10000 / wn.iloc[:, 0]  # perform the operation on the first column only
    return wn

for f in find_csv('C:/desktop/test'):
    readit = pd.read_csv(f, header=None)  # read the whole file; the originals have no header row
    convert = wn_to_um(readit)  # the second column passes through unchanged
    os.remove(f)  # optional: to_csv overwrites an existing file anyway
    convert.to_csv(f, header=False, index=False)  # write back to the original path, without header or index
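Putting the pieces together, the whole round trip might look like the sketch below. It is an illustration under stated assumptions, not a drop-in replacement: `convert_in_place` is a hypothetical helper name, and the demo writes two made-up files into a temporary directory instead of touching 'C:/desktop/test'. It reads each file with `header=None`, replaces column 0, and writes back to the same path with `header=False, index=False` so that no header row or index column is added.

```python
import os
import tempfile

import pandas as pd

def find_csv(topdir, suffix=".csv"):
    # Full paths of all files in topdir whose names end with the suffix.
    return [os.path.join(topdir, name)
            for name in sorted(os.listdir(topdir))
            if name.endswith(suffix)]

def convert_in_place(path):
    # Hypothetical helper: read a headerless CSV, convert column 0, overwrite the file.
    df = pd.read_csv(path, header=None)
    df[0] = 10000 / df[0]  # replaces the first column; column 1 is untouched
    df.to_csv(path, header=False, index=False)

# Demo on throwaway files in a temporary directory (made-up data).
with tempfile.TemporaryDirectory() as demo_dir:
    for name, rows in [("a.csv", "1000,x\n2000,y\n"), ("b.csv", "4000,z\n")]:
        with open(os.path.join(demo_dir, name), "w") as fh:
            fh.write(rows)

    for f in find_csv(demo_dir):
        convert_in_place(f)

    with open(os.path.join(demo_dir, "a.csv")) as fh:
        a_after = fh.read()
    with open(os.path.join(demo_dir, "b.csv")) as fh:
        b_after = fh.read()

print(a_after)
print(b_after)
```

Because each file is written back to the path it was read from, every CSV in the folder is updated, which addresses the original loop saving everything to a single 'whatever.csv'. Overwriting is destructive, so it may be worth testing on copies first.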

Upvotes: 3
