Reputation: 489
Below is a sample DataFrame:
>>> df = pd.DataFrame({'a': [1, 1, 1, 2, 2], 'b': [11, 22, 33, 44, 55]})
>>> df
   a   b
0  1  11
1  1  22
2  1  33
3  2  44
4  2  55
Now I want to update/replace the values of b using a separate dict keyed on the values of a. The replacements should be applied in order, by position within each group of a.
For example:
match = {1: [111, 222], 2: [444, 555]}
Expected output:
   a    b
0  1  111
1  1  222
2  1   33  <-- left unchanged, because match has no more values for 1
3  2  444
4  2  555
Thanks in advance
Upvotes: 4
Views: 1502
Reputation: 61910
You could use the pop method of list:
import pandas as pd

def pop(default, lst):
    # Return the last item of lst, or default once lst is empty.
    try:
        return lst.pop()
    except IndexError:
        return default

df = pd.DataFrame({'a': [1, 1, 1, 2, 2], 'b': [11, 22, 33, 44, 55]})
match = {1: [111, 222], 2: [444, 555]}
# Note: pop consumes the lists in match as it goes.
df['b'] = df[['a', 'b']].apply(lambda e: pop(e['b'], match[e['a']]), axis=1)
print(df)
Output
a b
0 1 222
1 1 111
2 1 33
3 2 555
4 2 444
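If the order must be preserved, you can always pop the first item instead (re-create df and match first, since the previous run consumed the lists in match):
def pop(default, lst):
    # Return the first item of lst, or default once lst is empty.
    try:
        return lst.pop(0)
    except IndexError:
        return default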
Output
a b
0 1 111
1 1 222
2 1 33
3 2 444
4 2 555
UPDATE
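A faster, non-destructive way is to use deque: popleft is O(1), whereas list.pop(0) is O(n), and the deques are built from copies, so match itself is left untouched:
from collections import deque

def pop(default, lst):
    # Return the leftmost item of the deque, or default once it is empty.
    try:
        return lst.popleft()
    except IndexError:
        return default

# deque(v) copies the items, so the lists in match are not modified.
match_deque = {k: deque(v) for k, v in match.items()}
df['b'] = df[['a', 'b']].apply(lambda e: pop(e['b'], match_deque[e['a']]), axis=1)
print(df)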
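As an alternative sketch of the same "take the next value or fall back" idea, the built-in next(iterator, default) can stand in for the pop helper. This is not part of the answer above; the name iters is made up for illustration, and the empty-iterator fallback for keys missing from match is an added assumption about the desired behaviour.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2], 'b': [11, 22, 33, 44, 55]})
match = {1: [111, 222], 2: [444, 555]}

# One iterator per key; the lists in match are never mutated.
iters = {k: iter(v) for k, v in match.items()}

# next(it, default) returns the next replacement, or the current value of b
# once the iterator is exhausted. Keys absent from match get an empty iterator,
# so their rows keep the original value.
df['b'] = [next(iters.get(a, iter(())), b) for a, b in zip(df['a'], df['b'])]
print(df)
```

Because it iterates over the two columns with zip instead of calling apply row by row, this avoids building a Series for every row.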
Upvotes: 4
Reputation: 164623
Here's one way. The idea is to calculate a cumulative count by group and use this to filter rows. Use itertools.chain to create a single array of values. Finally, use pd.DataFrame.loc and Boolean indexing to set values.
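from itertools import chain

# 1-based position of each row within its group of a
count = df.groupby('a').cumcount() + 1
# rows whose a value is a key in match
m1 = df['a'].isin(match)
# rows whose position still has a replacement available
m2 = count.le(df['a'].map(match).map(len))
# all replacement values, flattened in dict order
values = list(chain.from_iterable(match.values()))
df.loc[m1 & m2, 'b'] = values
print(df)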
a b
0 1 111
1 1 222
2 1 33
3 2 444
4 2 555
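The code above appears to rely on a few things that hold for the sample data: every value of a is a key in match, no list in match is longer than its group, and the rows appear in the same order as the flattened values. A minimal sketch that keeps the cumulative-count idea but drops those assumptions is shown below. The names pos and lookup are made up for illustration, and the extra row (a = 3) and extra value (666) are added only to exercise the edge cases.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 3], 'b': [11, 22, 33, 44, 55, 66]})
match = {1: [111, 222], 2: [444, 555, 666]}

# 0-based position of each row within its group of a.
pos = df.groupby('a').cumcount()

# Map (key, position) -> replacement value.
lookup = {(k, i): v for k, vals in match.items() for i, v in enumerate(vals)}

# Replace b where a (key, position) pair exists; otherwise keep the old value.
keys = zip(df['a'].tolist(), pos.tolist())
df['b'] = [lookup.get(key, old) for key, old in zip(keys, df['b'].tolist())]
print(df)
```

Here the row with a = 3 keeps its value because 3 is not a key in match, and the unused 666 is simply ignored.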
Upvotes: 4