BrnCPrz
BrnCPrz

Reputation: 35

Return the corresponding value in multiple columns Pandas

I've been trying to solve this in a clean manner, but have been failing remarkably on this task.

Going strait to the point... I have 2 dataframes:

df1
     anID   siID   daID
0       A      0      0
1       D      0      0
2       E      0      0
3       F      0      0
4       G      A      D
5       H      E      D
6       B      0      0
7       I      E      M
8       L      F      0
9       M      B      H
10      N      A      D

df2
   map_id  renum_id
0       A         1
1       C         2
2       B         3
3       E         4
4       D         5
5       F         6
6       Q         7
7       V         8
8       H         9
9       G        10
10      N        11

Basically, df1 is a dataframe with identification codes, df2$renum_id has the new codes to replace the existing ones and df2$map_id is the map which to guide the recoding of df1's columns.

Now I need a clean and light/fast way to get the corresponding renum_id value (using map_id as a key) for all 3 columns in df1 (anID, siID, daID). Basically, I want it to give me the corresponding renum_ID for each letter, and keep the 0 (zero) for zeros in df1.

So, I need a df3 like this:

     anID   siID   daID  anID/  siID/  daID/
0       A      G      0      1     10      0   
1       D      E      A      5      4      1      
2       E      0      0      4      0      0
3       F      B      D      6      3      5
4       G      A      D     10      1      5
5       H      E      D      9      4      5          
6       B      0      0      2      0      0     
7       N      A      D     11      1      5

It's probably something really simple, but I've been trying with pd.merge and couldn't come up with something to solve my problem.

Dataframes have something around hundreds of thousands of lines, that's why I insist in the need of a "light" way to solve it.

Thanks everyone in advance.

Upvotes: 2

Views: 64

Answers (1)

jezrael
jezrael

Reputation: 862406

I think you can first create dictionary with df2 by to_dict and then replace values in df1 by dictionary d. Last concat new DataFrame df3 to df1 by columns:

d = df2.set_index('map_id').to_dict()
print d['renum_id']
{'A': 1, 'C': 2, 'B': 3, 'E': 4, 'D': 5, 'G': 10, 'F': 6, 'H': 9, 'N': 11, 'Q': 7, 'V': 8}

df3 = df1.replace(d['renum_id'])
df3.columns = [col + '/' for col in df3.columns]
print df3

   anID/ siID/ daID/
0      1     0     0
1      5     0     0
2      4     0     0
3      6     0     0
4     10     1     5
5      9     4     5
6      3     0     0
7      I     4     M
8      L     6     0
9      M     3     9
10    11     1     5

print pd.concat([df1, df3], axis=1)

   anID siID daID anID/ siID/ daID/
0     A    0    0     1     0     0
1     D    0    0     5     0     0
2     E    0    0     4     0     0
3     F    0    0     6     0     0
4     G    A    D    10     1     5
5     H    E    D     9     4     5
6     B    0    0     3     0     0
7     I    E    M     I     4     M
8     L    F    0     L     6     0
9     M    B    H     M     3     9
10    N    A    D    11     1     5

Upvotes: 1

Related Questions