Reputation: 820

Create new column based on corresponding values in Pandas dataframe

I have a dataframe like so:

id_a    id_b     col_a
  NaN    NaN     NaN
  0     133     [23]
  7     191     [10,7]
  10    303     [1]
  23    200     [0,7,10]

I need to create a new column like so:

id_a    id_b     col_a      col_b
  NaN    NaN     NaN          NaN
  0     133     [23]         [200] 
  7     191     [10,7]       [303,191]
  10    303     [1]          [101]
  23    200     [0,7,10]     [133,191,303]

The logic is that I am returning the ids from id_b based on a corresponding value in id_a.

Example: for the first list [1,0,7];

I return 101, from id_b since the corresponding value with 1 in id_b is 101. Then 133 since 0 matches with 133 and finally 191 since 7 is the corresponding value here.

I have tried using .loc to capture just those rows but haven't got far. Any help is greatly appreciated

Upvotes: 1

Answers (2)

anky

Reputation: 75080

With df.stack():

d = dict(zip(df['id_a'],df['id_b']))
df['col_b'] = (pd.DataFrame(df['col_a'].tolist()).replace(d)
            .stack().groupby(level=0).agg(list))

Or: apply:

df['col_b'] = df['col_a'].apply(lambda x: [d.get(i) for i in x])
print(df)

   id_a  id_b       col_a            col_b
0     1   101   [1, 0, 7]  [101, 133, 191]
1     0   133        [23]            [200]
2     7   191     [10, 7]       [303, 191]
3    10   303         [1]            [101]
4    23   200  [0, 7, 10]  [133, 191, 303]

Upvotes: 3

BENY

Reputation: 323226

IIUC explode then map

df.col_a.explode().map(dict(zip(df.id_a,df.id_b))).groupby(level=0).agg(list)
0    [101, 133, 191]
1              [200]
2         [303, 191]
3              [101]
4    [133, 191, 303]
Name: col_a, dtype: object

Upvotes: 3

Create new column based on corresponding values in Pandas dataframe

Answers (2)

Related Questions