Reputation: 1
I am importing data from a CSV file and an Excel file with pandas, so both are now DataFrames. I am trying to use the CSV data to update the data from the Excel file. The data is read correctly, but when I loop through it to check whether a certain key already exists, my if statement does not work.
for ID in update['Index']:
    if ID not in data['index']:
df1: data
Index  Attr-1  Attr-2  Attr-3
01234  Blue    Car     Water
23456  Green   Truck   Lemonade
34567  Red     Bike    Milk Tea
df2: update
Index  Attr-1  Attr-2  Attr-3
01234  Blue    Car     Milk Tea
34567  Yellow  Truck   Lemonade
56789  Red     Bike    Milk Tea
actual result:
Index  Attr-1  Attr-2  Attr-3
01234  Blue    Car     Milk Tea
01234  Blue    Car     Water
23456  Green   Truck   Lemonade
23456  Green   Truck   Lemonade
34567  Red     Bike    Milk Tea
34567  Yellow  Truck   Lemonade
56789  Red     Bike    Milk Tea
desired result:
Index  Attr-1  Attr-2  Attr-3
01234  Blue    Car     Milk Tea
23456  Green   Truck   Lemonade
34567  Yellow  Truck   Lemonade
56789  Red     Bike    Milk Tea
My values are being duplicated because they are not caught by the if statement. I am not sure what is going on. Any feedback or ideas are appreciated. Thanks.
Upvotes: 0
Views: 62
Reputation: 1
Using "in" on a DataFrame column (a Series) checks the index labels, not the values, which is why the check in my for loop never matched. So I put the keys from data['index'] into a list and tested against that instead.
`s = data['index'].tolist()
for ID in update['index']:
    if ID not in s:`
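To illustrate why the original check failed, here is a minimal sketch of how `in` behaves on a pandas Series. The sample values are taken from the question; the variable names (`col`, `update_ids`, `new_ids`) are made up for the example, and it assumes the keys were read as strings with a default integer row index.

```python
import pandas as pd

# Sample data from the question, with the keys stored as strings (an assumption).
data = pd.DataFrame({"Index": ["01234", "23456", "34567"],
                     "Attr-1": ["Blue", "Green", "Red"]})

col = data["Index"]

# `in` on a Series tests the row labels (0, 1, 2 here), not the stored values.
in_labels = "01234" in col          # False
label_hit = 0 in col                # True, because 0 is a row label

# Testing against the underlying values gives the intended answer.
in_values = "01234" in col.values   # True

# A vectorised alternative to looping: keep only the keys not yet present.
update_ids = pd.Series(["01234", "34567", "56789"])
new_ids = update_ids[~update_ids.isin(col)].tolist()
print(new_ids)
```

Converting to a list, as above, works as well; `isin` just avoids the explicit loop.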
Upvotes: 0
Reputation: 1932
Please see if this works. You need to check whether the ID is in the index:
>>> import pandas as pd
>>> data = pd.DataFrame({'Attr-1': ['Blue', 'Green', 'Red']}, index=['01234', '23456', '34567'])
>>> update = pd.DataFrame({'Attr-1': ['Blue', 'Yellow', 'Green']}, index=['01234', '34567', '56789'])
>>> for ID in update.index:
...     if ID not in data.index:
...         # DataFrame.append was removed in pandas 2.0; pd.concat adds the row instead
...         data = pd.concat([data, update.loc[[ID]]])
...
>>> data
      Attr-1
01234   Blue
23456  Green
34567    Red
56789  Green
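Note that the loop above only adds rows whose ID is missing; rows that already exist keep their old values. If the goal is the "desired result" from the question (existing rows overwritten by the update, new rows added), one possible sketch is to concatenate both frames and keep the last occurrence of each key. This is not the approach used in this answer, just an alternative under the assumption that the key lives in a column named `Index` and holds strings; `merged` is a name chosen for the example.

```python
import pandas as pd

# Sample frames copied from the question; keys kept as strings (an assumption).
data = pd.DataFrame({
    "Index": ["01234", "23456", "34567"],
    "Attr-1": ["Blue", "Green", "Red"],
    "Attr-2": ["Car", "Truck", "Bike"],
    "Attr-3": ["Water", "Lemonade", "Milk Tea"],
})
update = pd.DataFrame({
    "Index": ["01234", "34567", "56789"],
    "Attr-1": ["Blue", "Yellow", "Red"],
    "Attr-2": ["Car", "Truck", "Bike"],
    "Attr-3": ["Milk Tea", "Lemonade", "Milk Tea"],
})

merged = (
    pd.concat([data, update])                      # stack old rows, then update rows
    .drop_duplicates(subset="Index", keep="last")  # for repeated keys, the update row wins
    .sort_values("Index")
    .reset_index(drop=True)
)
print(merged)
```

Because the update rows come last in the concatenation, `keep="last"` makes them override the matching rows in `data`, while rows that appear in only one frame pass through unchanged.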
Upvotes: 2