Reputation: 35
so, I have two dataframes and I'm attempting the match the column 'numbers' from dataframe 1 to the content of the full dataframe 2, and extract the column header and use that as a label/new column in dataframe 1. Struggling to build a pipeline for this though. I am working in Python.
dataframe 1
numbers |
---|
100 |
101 |
102 |
103 |
200 |
201 |
202 |
203 |
300 |
301 |
302 |
303 |
dataframe 2:
construction | fields | plates |
---|---|---|
100 | 200 | 300 |
101 | 201 | 301 |
102 | 202 | 302 |
103 | 203 | 303 |
output [with new column label matched from dataframe 2]:
numbers | label |
---|---|
100 | construction |
101 | construction |
102 | construction |
103 | construction |
200 | fields |
201 | fields |
202 | fields |
203 | fields |
300 | plates |
301 | plates |
302 | plates |
303 | plates |
Upvotes: 2
Views: 1854
Reputation: 120419
Use melt
to flatten your second dataframe then merge
it with your first dataframe:
>>> df1.merge(df2.melt(var_name='label', value_name='numbers'), on='numbers')
numbers label
0 100 construction
1 101 construction
2 102 construction
3 103 construction
4 200 fields
5 201 fields
6 202 fields
7 203 fields
8 300 plates
9 301 plates
10 302 plates
11 303 plates
After melt
, your second dataframe looks like:
>>> df2.melt(var_name='label', value_name='numbers')
label numbers
0 construction 100
1 construction 101
2 construction 102
3 construction 103
4 fields 200
5 fields 201
6 fields 202
7 fields 203
8 plates 300
9 plates 301
10 plates 302
11 plates 303
Another way with map
:
df1['label'] = df1['numbers'].map(df2.melt().set_index('value').squeeze())
print(df)
# Output
numbers label
0 100 construction
1 101 construction
2 102 construction
3 103 construction
4 200 fields
5 201 fields
6 202 fields
7 203 fields
8 300 plates
9 301 plates
10 302 plates
11 303 plates
Upvotes: 2