Mia

Reputation: 579

Python, Pandas and for loop: Populate dataframe row based on a match with list values

I have a pandas dataframe with an "id" column. I also have a list called 'new_ids', which is a subset of the values found in the "id" column.

I want to add a column to the dataframe that indicates whether each ID is new. I first initialized this column to 0.

df['new_id'] = 0

Now I want to loop through the new_ids list and, whenever an ID is found in the dataframe's "id" column, change the 'new_id' value for that row to 1. In the end, all new IDs should have a 1 in the "new_id" column, and all old IDs should remain at 0.

index = df.index.values 

for x in index:
    if new_ids in df.id:
        df.new_id[x] = '1'
        x = x + 1
    else:
        x = x + 1

This does not work, and I am getting a lot of errors. Any idea what I am doing wrong? Many thanks!

Upvotes: 0

Views: 1171

Answers (1)

Wenlong Liu

Reputation: 444

You do not need to iterate over the DataFrame manually; pandas can do the work for you. The built-in isin method checks every value in a column against a list in one step.

Here is some sample code.

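import pandas as pd

# Each inner list is one row, so the 'ids' column holds 'b', 2, 5, 'f'.
sample = [['a','b','c'],[1,2,3],[4,5,6],['e','f','g']]
df = pd.DataFrame(sample, columns=['name', 'ids', 'value'])

new_ids = ['b', 5]
# True where the row's 'ids' value appears in new_ids, False otherwise.
df['new_id'] = df['ids'].isin(new_ids)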
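
Note that isin returns True/False rather than the 1/0 the question asks for. A minimal sketch of one way to get integers, assuming the column is called "id" as in the question (the sample values here are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column name "id" comes from the question.
df = pd.DataFrame({"id": [101, 102, 103, 104, 105]})
new_ids = [102, 105]

# isin yields a boolean Series; astype(int) maps True/False to 1/0.
df["new_id"] = df["id"].isin(new_ids).astype(int)

print(df["new_id"].tolist())  # [0, 1, 0, 0, 1]
```

As for the loop in the question: the in operator on a Series tests the index labels rather than the values, and reassigning x inside a for loop has no effect on the iteration, so the vectorized form is likely the simpler fix.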

Upvotes: 1
