Reputation: 1088
Suppose I have the following DataFrame:
import pandas as pd
d = {'col1': ['cat', 'apple', 'banana', 'dog', 'pen']}
df = pd.DataFrame(d)
which gives
col1
0 cat
1 apple
2 banana
3 dog
4 pen
I want to build a dictionary and use it to map a new column onto my df, so that I get the following output:
col1 col2
0 cat pet
1 apple fruit
2 banana fruit
3 dog pet
4 pen thing
I have made the following dictionary:
dictionary = {
    "pet": ['cat', 'dog'],
    "fruit": ['apple', 'banana'],
    "thing": 'pen'}
but I am not sure how to apply it to get the output above. A tedious way of doing this is to write out a one-to-one dictionary by hand and then use map:
di = {"cat": "pet", "dog": "pet", "apple": "fruit", "banana": "fruit", "pen":"thing"}
and
df['col2'] = df['col1'].map(di)
but I suppose this is not the most efficient way. How can I do this without writing out every item by hand?
Upvotes: 2
Views: 53
Reputation: 61910
Use a dictionary comprehension to invert the dictionary, so that each list item becomes a key that points to its category:
# wrap bare values in a list so that every value is a list
dictionary = {k: v if isinstance(v, list) else [v] for k, v in dictionary.items()}
# then invert it: every item in each list maps to its category
df['col2'] = df['col1'].map({v: k for k, vs in dictionary.items() for v in vs})
print(df)
Output
col1 col2
0 cat pet
1 apple fruit
2 banana fruit
3 dog pet
4 pen thing
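If the lookup is needed in more than one place, the inversion can be wrapped in a small helper. This is only a sketch: `invert_mapping` is a hypothetical name (not a pandas function), and raising an error when an item appears under two categories is my own assumption about what you would want.

```python
import pandas as pd

def invert_mapping(groups):
    """Turn {category: item or list of items} into {item: category}."""
    inverted = {}
    for category, items in groups.items():
        # treat a bare string as one item instead of iterating its characters
        if isinstance(items, str):
            items = [items]
        for item in items:
            if item in inverted:
                raise ValueError(f"{item!r} is listed under more than one category")
            inverted[item] = category
    return inverted

dictionary = {"pet": ['cat', 'dog'], "fruit": ['apple', 'banana'], "thing": 'pen'}
df = pd.DataFrame({'col1': ['cat', 'apple', 'banana', 'dog', 'pen']})
df['col2'] = df['col1'].map(invert_mapping(dictionary))
```

Any value in col1 that is missing from the dictionary comes out as NaN, because that is how Series.map handles keys it cannot find in a dict.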
An alternative using only pandas (although more cumbersome):
# convert to a Series: index = item, value = category (uses the list-valued dictionary from above)
res = (pd.DataFrame(data=list(dictionary.values()), index=dictionary.keys())
       .stack().droplevel(-1)
       .to_frame('vs').reset_index()
       .set_index('vs').squeeze())
# use map with Series as parameter
df['col2'] = df['col1'].map(res)
print(df)
Output
col1 col2
0 cat pet
1 apple fruit
2 banana fruit
3 dog pet
4 pen thing
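A shorter pandas-only variant is possible with Series.explode (available since pandas 0.25). Treat this as a sketch rather than a drop-in replacement: the names `exploded` and `lookup` are mine, and it assumes each item appears under exactly one category, since map needs a unique index on the Series it looks values up in.

```python
import pandas as pd

dictionary = {"pet": ['cat', 'dog'], "fruit": ['apple', 'banana'], "thing": 'pen'}
df = pd.DataFrame({'col1': ['cat', 'apple', 'banana', 'dog', 'pen']})

# one row per item; list values are split up, bare strings are kept as they are
exploded = pd.Series(dictionary).explode()

# swap index and values so that the item is the index and the category is the value
lookup = pd.Series(exploded.index, index=exploded.values)

df['col2'] = df['col1'].map(lookup)
```

Here the string 'pen' does not need to be wrapped in a list first, because explode leaves non-list values untouched.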
Upvotes: 2
Reputation: 3010
I would make a list of tuples and then create the dataframe from that list. It would be simpler if all of your values in the dict are lists instead of having strings for single values.
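data = []
for k, v in dictionary.items():
    if isinstance(v, str):
        # a single string value: one (item, category) pair
        data.append((v, k))
    else:
        for vv in v:
            data.append((vv, k))
# name the columns; rows follow the dictionary's order, not the original df's
df = pd.DataFrame(data, columns=['col1', 'col2'])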
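If the original df has to keep its row order (or has other columns), the list of tuples can be turned into a lookup frame and merged in instead of replacing df. A minimal sketch, assuming the same `dictionary` and `df` as in the question; `pairs`, `lookup` and `result` are names I made up for illustration.

```python
import pandas as pd

dictionary = {"pet": ['cat', 'dog'], "fruit": ['apple', 'banana'], "thing": 'pen'}
df = pd.DataFrame({'col1': ['cat', 'apple', 'banana', 'dog', 'pen']})

# build (item, category) pairs, wrapping bare strings in a list first
pairs = []
for category, items in dictionary.items():
    items = [items] if isinstance(items, str) else items
    pairs.extend((item, category) for item in items)

lookup = pd.DataFrame(pairs, columns=['col1', 'col2'])

# a left merge keeps every row of df, in df's original order
result = df.merge(lookup, on='col1', how='left')
```

Rows of df whose col1 value has no match in lookup are kept, with NaN in col2.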
Upvotes: 2