user3643528
user3643528

Reputation: 53

Looking up for list of variable values in a column in a separate dataframe

I have a dataframe1 in the following format where I have ID and then ColumnA is a list of values corresponding to that ID:

ID   ColumnA 

1     [2375, 12329, 2742, 3503, 2496, 19572, 11198]

2     [5096, 5108, 5117, 5721, 5738, 5740, 6174, 617]

3     [11579, 17340, 4466, 11784, 24543, 22441, 2484]

4     [5146, 5461, 5528, 5536, 5617, 6081, 6315, 644]

5     [18645, 7211, 16336, 22766, 11189, 12652, 2178]

The dataframe2 is of the below format:

ColumnA Value

2375    100

5096    839

7211    116

8356    365

9999    427

I want to create a new column in dataframe1 which will lookup for each individual value corresponding to ColumnA variable from the second dataframe.

I am trying to do something like:

dataframe1['ColumnA'].apply(lambda x : np.array(dataframe2 .set_index('[ColumnA]').loc[x] if not np.isnan(x).any() else np.array([]) ))

It throws an error saying: 'None of [[53528]] are in the [index]'

Is there a way I can get rid of this error?

My dataframe1 expected output is of the below format:

ID    ColumnA                                               ColumnB

1     [2375,12329,2742,3503,2496,19572,11198]       [100,877, 1,2,3,4,5]

2     [5096,5108,5117,5721,5738,5740,6174,617]      [839,6,7,8,9,10,11,18]

3     [11579,17340,4466,11784,24543,22441,2484]     [13,14,15,16,17,18,19]

4     [5146,5461,5528,5536,5617,6081,6315,644]      [78,67,45,34,23,12,45,78]

5     [18645,7211,16336,22766,11189,12652,2178]     [2,116,56,45,1,67,56]

Since all the data is something I can't share, most of the values in the expected output might not come as output if you try on this sample dataset

Upvotes: 1

Views: 39

Answers (1)

Zero
Zero

Reputation: 76927

IIUIC, you can do

In [989]: mapper = df2.set_index('ColumnA')['Value'].to_dict()

In [990]: df1.ColumnA.apply(lambda x: [mapper.get(v) for v in x])
Out[990]:
0           [100, None, None, None, None, None, None]
1     [839, None, None, None, None, None, None, None]
2          [None, None, None, None, None, None, None]
3    [None, None, None, None, None, None, None, None]
4           [None, 116, None, None, None, None, None]
Name: ColumnA, dtype: object

Details

In [991]: mapper
Out[991]: {2375: 100, 5096: 839, 7211: 116, 8356: 365, 9999: 427}

Upvotes: 2

Related Questions