winston

Reputation: 3100

Updating dataframe updates entire column instead of the row

I'm importing a CSV as a DataFrame, updating a column in it, and exporting the result to a new CSV file. However, the first update statement appears to update every row in the column instead of just the current row. I'm using df.at with the index as I iterate through the DataFrame, so I'm lost as to what I'm doing wrong. Any help would be greatly appreciated.

import pandas as pd
import numpy as np

# import  spreadsheet data
df = pd.read_csv('C:\\Users\\admin\\Documents\\Projects\\data\\data.csv', dtype=str)

df = df.replace(np.nan, '', regex=True)

def write_file():
    for index, row in df.iterrows():

        type = str(df['Type'].values[0])

        if "test" in type:
            df.at[index, "A"] = "Update 1"

        if "test2" in department:
            df.at[index, 'B'] = "Update 2"


write_file()

df.to_csv('updated')

Upvotes: 0

Views: 56

Answers (1)

Quang Hoang

Reputation: 150785

type = str(df['Type'].values[0]) always reads the Type of the first row, so doesn't this mean type is unchanged throughout your loop? Therefore, every row in columns A and B is updated to the same value. I believe you want

type = str(row['Type'])
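
To see the per-row lookup in action, here is a minimal sketch on a made-up three-row frame. The sample data and the helper name update_rows are assumptions for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
df = pd.DataFrame({"Type": ["test", "other", "test"], "A": ["", "", ""]})

def update_rows(frame):
    """Set A to 'Update 1' only on rows whose own Type contains 'test'."""
    for index, row in frame.iterrows():
        # Read the value from the current row, not from frame['Type'].values[0]
        row_type = str(row["Type"])
        if "test" in row_type:
            frame.at[index, "A"] = "Update 1"
    return frame

# Work on a copy so the original frame stays untouched
result = update_rows(df.copy())
print(result["A"].tolist())  # ['Update 1', '', 'Update 1']
```

Only the first and third rows change, because each iteration now checks its own Type.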

On the other hand, you can make do without looping:

def write_file():
    df.loc[df['Type'].astype(str).str.contains('test'), 'A'] = 'Update 1'
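
As a rough illustration of the mask-based approach, the sketch below runs it on a small made-up frame (the data is an assumption). It passes regex=False and na=False to str.contains so that 'test' is treated as a literal substring and missing values do not match; those two options are my additions, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data; the last Type is missing on purpose
df = pd.DataFrame({
    "Type": ["test", "other", "test2", None],
    "A": ["", "", "", ""],
})

# Boolean mask: True where Type contains the literal substring 'test'
mask = df["Type"].str.contains("test", regex=False, na=False)

# Assign only on the rows selected by the mask
df.loc[mask, "A"] = "Update 1"

print(df["A"].tolist())  # ['Update 1', '', 'Update 1', '']
```

Note that 'test2' also contains 'test', so a substring check for 'test' matches both kinds of rows.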

Upvotes: 1
