Reputation: 833
I have a DataFrame that looks like this:
Market | Status | Team | Member |
-------|--------|------|--------|
Chicago| 1 | ENG | None |
Chicago| 1 | ENG | None |
SF Bay | 3 | ENG | Julia |
And a dictionary of users and their emails:
TeamMembers = {
"Julia": "[email protected]", "Tyler": "[email protected]", "Kyle": "[email protected]"
}
In my DataFrame I want to randomly assign a Member if there is none, but if the Market value is the same, then the Member needs to also be the same.
I want to use
name, email = random.choice(list(TeamMembers.items()))
to get the specific names and email addresses but I'm not sure how to manipulate the DataFrame based on the Market being the same value.
Upvotes: 3
Views: 195
Reputation: 164673
Here is an alternative solution. The benefit of this one is that if a Market such as Chicago has already been mapped to a Member, its other rows get mapped to the same Member, even those that are currently None.
import pandas as pd
import random
df = pd.DataFrame([['Chicago', 1, 'ENG', None],
['Chicago', 1, 'ENG', None],
['SF Bay', 3, 'ENG', 'Julia'],
['SF Bay', 2, 'ENG', None],
['NY', 1, 'ENG', None],
['NY', 2, 'ENG', None]],
columns=['Market', 'Status', 'Team', 'Member'])
TeamMembers = {"Julia": "[email protected]", "Tyler": "[email protected]", "Kyle": "[email protected]"}
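# Markets that already have a Member keep that Member
existing_map = df.dropna(subset=['Member']).set_index('Market')['Member'].to_dict()
# Markets with missing Members that have no existing assignment
unmapped = list(set(df.loc[pd.isnull(df['Member']), 'Market']) - set(existing_map))
MemberChoices = list(TeamMembers.keys())
# Shuffle both lists so the pairing of Market to Member is random
random.shuffle(unmapped)
random.shuffle(MemberChoices)
# Cycle through the names so this still works with more Markets than Members
additional_map = {k: MemberChoices[i % len(MemberChoices)] for i, k in enumerate(unmapped)}
new_map = {**existing_map, **additional_map}
# Fill only the missing Members, looking each one up by Market
df['Member'] = df['Member'].fillna(df['Market'].map(new_map))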
# Market Status Team Member
# 0 Chicago 1 ENG Tyler
# 1 Chicago 1 ENG Tyler
# 2 SF Bay 3 ENG Julia
# 3 SF Bay 2 ENG Julia
# 4 NY 1 ENG Kyle
# 5 NY 2 ENG Kyle
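The question also asks for the email addresses. Once Member is filled, a lookup against the dictionary is enough. This is only a sketch: the addresses below are placeholders (the real ones are not visible in the question), and the Email column name is an assumption.

```python
import pandas as pd

# Placeholder addresses, assumed for illustration only
TeamMembers = {
    "Julia": "julia@example.com",
    "Tyler": "tyler@example.com",
    "Kyle": "kyle@example.com",
}

# A frame whose Member column has already been filled
df = pd.DataFrame({
    "Market": ["Chicago", "Chicago", "SF Bay"],
    "Member": ["Tyler", "Tyler", "Julia"],
})

# Look up each Member's address; "Email" is a hypothetical column name
df["Email"] = df["Member"].map(TeamMembers)
print(df)
```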
Upvotes: 2
Reputation: 323236
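Without groupby:
# One entry per distinct Market
k = df.Market.unique().tolist()
# list(TeamMembers.keys()) is ['Julia', 'Tyler', 'Kyle']
# Draw one distinct name per Market. random.sample needs a sequence, not a set,
# and raises ValueError if there are more Markets than Members.
d = dict(zip(k, random.sample(list(TeamMembers.keys()), len(k))))
df.Member = df.Member.fillna(df.Market.map(d))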
Upvotes: 0
Reputation: 862661
You can use transform with fillna, and generate only names by changing items to keys:
df['Member'] = (df.groupby('Market')['Member']
.transform(lambda x: x.fillna(random.choice(list(TeamMembers.keys())))))
print (df)
Market Status Team Member
0 Chicago 1 ENG Kyle
1 Chicago 1 ENG Kyle
2 SF Bay 3 ENG Julia
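One caveat with this approach: a Market that already has a Member on some rows but None on others gets a fresh random name for the None rows. Here is a minimal sketch of a variant that reuses the existing Member first. The extra SF Bay row, the names list, and the fill_group helper are assumptions added for illustration.

```python
import random

import pandas as pd

df = pd.DataFrame(
    [["Chicago", 1, "ENG", None],
     ["Chicago", 1, "ENG", None],
     ["SF Bay", 3, "ENG", "Julia"],
     ["SF Bay", 2, "ENG", None]],  # extra row, added to show the mixed case
    columns=["Market", "Status", "Team", "Member"],
)
names = ["Julia", "Tyler", "Kyle"]  # the keys of TeamMembers


def fill_group(members: pd.Series) -> pd.Series:
    """Fill one Market's missing Members with a single name."""
    known = members.dropna()
    # Reuse a Member already present in this Market, else pick one at random
    chosen = known.iloc[0] if not known.empty else random.choice(names)
    return members.fillna(chosen)


df["Member"] = df.groupby("Market")["Member"].transform(fill_group)
print(df)
```

If a Market already contains two different Members, this sketch keeps them as they are and fills the gaps with the first one it finds.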
Upvotes: 4