Phoenix

Reputation: 399

Adding new column based on index list in pandas

I have a pandas DataFrame like this:

df = pd.DataFrame({'f1': [23, 56, 7, 56, 34, 98],
                   'f2': [32, 85, 27, 36, 64, 60]})

    f1  f2
0   23  32
1   56  85
2   7   27
3   56  36
4   34  64
5   98  60

Based on an index list such as index_list = [2, 4], I want to add a new column to the original DataFrame, like this:

    new_column  f1  f2
0      0        23  32
1      0        56  85
2      0        7   27
3      1        56  36
4      1        34  64
5      2        98  60

Note: the index list gives the row positions after which new_column should increase by 1.

Upvotes: 0

Views: 2817

Answers (3)

constantstranger

Reputation: 9379

Here's a way to get the exact output specified in your question without the need for cumsum():

df = ( df.assign(new_column=pd.Series(
        range(1, 1+len(index_list)), 
        index=pd.Series(index_list)+1))
    .ffill().fillna(0).astype(int)[['new_column'] + list(df.columns)] )

Output:

   new_column  f1  f2
0           0  23  32
1           0  56  85
2           0   7  27
3           1  56  36
4           1  34  64
5           2  98  60
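
Another cumsum-free option can be sketched with numpy.searchsorted, which counts how many entries of index_list lie strictly below each row label. This is only a sketch under the assumption that the DataFrame has a sorted numeric index (such as the default RangeIndex); the name counts is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"f1": [23, 56, 7, 56, 34, 98],
                   "f2": [32, 85, 27, 36, 64, 60]})
index_list = [2, 4]

# side="left" counts the entries of index_list strictly below each label,
# so the counter first increases on the row after each listed index.
# Assumption: the index is numeric and sorted (e.g. the default RangeIndex).
counts = np.searchsorted(np.sort(index_list), df.index.to_numpy(), side="left")

# Insert the result as the first column, matching the layout in the question.
df.insert(0, "new_column", counts)
print(df)
```

Because the counts are computed directly from the labels, no forward fill or missing-value handling is needed.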

Upvotes: 1

Chris Seeling

Reputation: 656

A simple way is to use cumsum:

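import pandas as pd

df = pd.DataFrame(index=range(6))
index_list = [2, 4]
# The counter increases on the row after each listed index.
index_list = [x + 1 for x in index_list]
df["new"] = 0
# Assign through df.loc to avoid chained assignment.
df.loc[index_list, "new"] = 1
df["new"].cumsum()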

which gives:

0    0
1    0
2    0
3    1
4    1
5    2

Upvotes: 3

BeRT2me

Reputation: 13242

# Put a 1 in the row after each index in your list, in a new column.
df.loc[[x+1 for x in index_list], 'new_column'] = 1

# fill with 0's, and take the cumulative sum.
df.new_column = df.new_column.fillna(0).cumsum()

print(df)

Output:

   f1  f2  new_column
0  23  32         0.0
1  56  85         0.0
2   7  27         0.0
3  56  36         1.0
4  34  64         1.0
5  98  60         2.0
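
If integer values and the column order from the question are wanted, one possible variant is sketched below. It marks the listed labels with Index.isin, shifts the marks down one row, and inserts the running total as the first column. It assumes index_list holds index labels; the names marks and shifted are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"f1": [23, 56, 7, 56, 34, 98],
                   "f2": [32, 85, 27, 36, 64, 60]})
index_list = [2, 4]

# Mark the rows whose label is in index_list, then move each mark down
# one row so the counter increases on the row after the listed index.
marks = pd.Series(df.index.isin(index_list), index=df.index)
shifted = marks.shift(1, fill_value=False)

# Cast to int before summing so the column holds integers, not floats.
df.insert(0, "new_column", shifted.astype(int).cumsum())
print(df)
```

Shifting a boolean mask avoids adding 1 to the labels themselves, so a listed index on the last row simply has no effect.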

If your index list is actually an index:

# If index_list looks like:
>>> index_list
Int64Index([2, 4], dtype='int64')

# Then you can do:
df.loc[index_list+1, 'new_column'] = 1
...

Upvotes: 3
