Petru Tanas
Petru Tanas

Reputation: 1247

Map values from lists in one Dataframe to unique values in another

I have 2 dataframes, one containing some unique values, and another one containing unique groups of this unique values, with group ids, where each value appears in one group and one group only.

df1:         df2:
                groups    ids
0  A        0 (A, D, F)    1
1  B        1 (C, E)       2
2  C        2 (B, K, L)    3
3  D         .
.            .
.
.             

Is there an efficient way to map values from the first dataframe with ids from the second? I got the result using 2 for loops but it is very slow, and tried using 'np.where(df1 in df2["groups"])', but got an array of None.

Desired output:

df3:
       id
0  A    1
1  B    3
2  C    2
3  D    1
.
.
.

Upvotes: 2

Views: 404

Answers (2)

jezrael
jezrael

Reputation: 863166

If in column groups are tuples use nested dictionary comprehension with Series.map:

d = {x: b for a, b in zip(df2['groups'], df2['ids']) for x in a}
print (d)
{'A': 1, 'D': 1, 'F': 1, 'C': 2, 'E': 2, 'B': 3, 'K': 3, 'L': 3}

df1 = pd.DataFrame({'col':list('ABCD')})
df1['ids'] = df1['col'].map(d)
print (df1)
  col  ids
0   A    1
1   B    3
2   C    2
3   D    1

If data are Series:

s = pd.Series(list('ABCD'))

df3 = s.to_frame('col')
df3['ids'] = df3['col'].map(d)
print (df3)
  col  ids
0   A    1
1   B    3
2   C    2
3   D    1

Upvotes: 0

iamklaus
iamklaus

Reputation: 3770

using pd.unstack and pd.merge

ids = df2.pop('ids')
temp_df2 = pd.DataFrame(df2['groups'].tolist(),ids).unstack().reset_index(name=1)

print(df.merge(temp_df2, on=1).drop(['level_0'],1))

   1  ids
0  A    1
1  B    3
2  C    2
3  D    1

Upvotes: 1

Related Questions