R1fromBCN
R1fromBCN

Reputation: 21

How to replace values according to non-consecutive list of indices in pandas dataframe?

I have been looking for a quick and efficient way to replace values in a given column of a pandas dataframe only if the corresponding indices belong to a list of non-consecutive numbers. Say non_consecutive_indices_list = [1400, 6571, 14526, 68420, ...] with ca. 30k different values.

  1. I have tried to use .iloc but I get an error message:

    df.iloc[non_consecutive_indices_list, 'column'] = New_Value
    

ValueError: Can only index by location with a [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array].

  1. Alternatively, I have tried the following:

    df.iloc[non_consecutive_indices_list].column = New_Value
    

But this returns a copy of the original dataframe so I cannot replace the original values.

  1. Finally, I have tried to use a for loop, which works, but this is very inefficient and takes forever to replace about 30k values:

    for i in non_consecutive_indices_list:
        df.loc[i, 'column'] = New_Value
    

Any idea how this could be done in a fastest way?

Upvotes: 2

Views: 1484

Answers (1)

jezrael
jezrael

Reputation: 862611

Change iloc to DataFrame.loc function:

df = pd.DataFrame({'column':list('abcdefghij')})

non_consecutive_indices_list = [2,4,1,6]
df.loc[non_consecutive_indices_list, 'column'] = 'New_Value'
print (df)
      column
0          a
1  New_Value
2  New_Value
3          d
4  New_Value
5          f
6  New_Value
7          h
8          i
9          j

Upvotes: 2

Related Questions