Reputation: 3100
I'm importing a CSV as a dataframe, updating a column in the dataframe, and exporting it to a new CSV file. However, the first update statement appears to update every row in the column (instead of just the current row). I'm using df.at with the index as I iterate through the dataframe, so I'm lost as to what I'm doing wrong. Any help would be greatly appreciated.
import pandas as pd
import numpy as np
# import spreadsheet data
df = pd.read_csv('C:\\Users\\admin\\Documents\\Projects\\data\\data.csv', dtype=str)
df = df.replace(np.nan, '', regex=True)
def write_file():
    for index, row in df.iterrows():
        type = str(df['Type'].values[0])
        if "test" in type:
            df.at[index, "A"] = "Update 1"
        if "test2" in department:
            df.at[index, 'B'] = "Update 2"
write_file()
df.to_csv('updated')
Upvotes: 0
Views: 56
Reputation: 150785
type = str(df['Type'].values[0])
Doesn't this mean type is unchanged throughout your loop? It always reads the Type value of the first row, so all A and B columns are updated to the same value. I believe you want
type = str(row['Type'])
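For illustration, here is a minimal sketch of the loop with that fix applied. The sample data is made up, and since department is never defined in the question, the sketch assumes it was meant to come from a hypothetical Department column; adjust to whatever column you actually intended. The local variable is also renamed to row_type so it no longer shadows the built-in type.

```python
import pandas as pd

# Made-up sample data standing in for data.csv (assumption, not from the question).
# "Department" is a hypothetical column; the question never defines `department`.
df = pd.DataFrame({
    "Type": ["test", "other", "test2"],
    "Department": ["x", "test2", "y"],
    "A": ["", "", ""],
    "B": ["", "", ""],
})

def write_file(frame):
    for index, row in frame.iterrows():
        # Read the value from the current row, not from the first row.
        row_type = str(row["Type"])
        department = str(row["Department"])
        if "test" in row_type:
            frame.at[index, "A"] = "Update 1"
        if "test2" in department:
            frame.at[index, "B"] = "Update 2"
    return frame

df = write_file(df)
print(df)
```

Note that "test" in "test2" is also true, so a row whose Type is "test2" gets the A update as well.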
On the other hand, you can make do without looping:
def write_file():
    df.loc[df['Type'].astype(str).str.contains('test'), 'A'] = 'Update 1'
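A slightly fuller sketch of the loop-free version, covering both updates, might look like the following. The sample data and the Department column are assumptions made for illustration (the question does not say where department comes from). Passing regex=False makes str.contains do a plain substring match, which mirrors the in check from the loop.

```python
import pandas as pd

# Made-up sample data (assumption); "Department" is a hypothetical column name.
df = pd.DataFrame({
    "Type": ["test", "other", "test2"],
    "Department": ["x", "test2", "y"],
    "A": ["", "", ""],
    "B": ["", "", ""],
})

# Boolean masks: True for rows whose text contains the substring.
type_mask = df["Type"].astype(str).str.contains("test", regex=False)
dept_mask = df["Department"].astype(str).str.contains("test2", regex=False)

# Assign only to the rows selected by each mask.
df.loc[type_mask, "A"] = "Update 1"
df.loc[dept_mask, "B"] = "Update 2"
print(df)
```

Each mask selects rows independently, so a row can receive both updates, the same as with two separate if statements in the loop.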
Upvotes: 1