MathiasRa
MathiasRa

Reputation: 835

Pandas: Updating multiple columns and rows based on the value of a seperate column

I am trying to create a code that as fast as possible can update a large number of pandas values at the same time based on the values of an ID column. The code I am using atm insert multiple rows at the same time base on an ID_column, but cannot insert multiple columns + rows at the same time.

The below code is a simplied example which updates 2 columns (value1 and value2) based on the value of column name ID.

id_list = [24,26,28] 
value1_list  =[100,200,300]
value2_list  =[125,175,165]

df.loc[df['id'].isin(id_list), 'value1'] = value1_list 
df.loc[df['id'].isin(id_list), 'value2'] = value2_list 

The code I ideally would use would work something like this:

df.loc[df['id'].isin(id_list), ['value1','value2']] = [value1_list,value2_list]

Meaning both column value1 and column value2 would be updated at the same time Is this doable?

Upvotes: 1

Views: 1682

Answers (2)

jpp
jpp

Reputation: 164623

You can construct a single NumPy array via np.vstack and then assign to multiple columns:

import numpy as np

mask = df['id'].isin(id_list)
value_array = np.vstack((value1_list, value2_list)).T
df.loc[mask, ['value1', 'value2']] = value_array

print(df)
#    id  value1  value2
# 0  24     100     125
# 1  26     200     175
# 2  28     300     165
# 3  30       4       8

Alternatively, you can zip your input lists:

df.loc[mask, ['value1', 'value2']] = list(zip(*(value1_list,value2_list)))

Upvotes: 1

Loochie
Loochie

Reputation: 2472

You can also use dictionary to enter columns corresponding to their updates.

mask = df['id'].isin(id_list)

dc = {'value1': value1_list, 'value2': value2_list}
for i, j in dc.items():
    df.loc[mask, i] = j

Upvotes: 0

Related Questions