Reputation: 2080
I have a data frame as following,
data0 = pd.DataFrame({'PE03': ['CTA15', 'CTA15', 'AC007', 'AC007', 'AC007'],
'AL01': [22, 22, 2, 2, 2],
'AL02':['12', '13', '14', '15', '16'],
'AL03' : [2, 2, 3, 3, 3],
'PL03' :[3,7,8,9,1]})
data0
AL01 AL02 AL03 PE03 PL03
0 22 12 2 CTA15 3
1 22 13 2 CTA15 7
2 2 14 3 AC007 8
3 2 15 3 AC007 9
4 2 16 3 AC007 1
And another data farme as,
data1 = pd.DataFrame({'main': ['CTA15', 'CTA15', 'AC007', 'AC007', 'AC007'],
'sub': ['LUD1', 'KLM1', 'SAP1', 'SAP2', 'SOS'],
'samples':['PE03', 'AL01', 'AL02', 'AL03', 'PL03']})
data1
main samples sub
0 CTA15 PE03 LUD1
1 CTA15 AL01 KLM1
2 AC007 AL02 SAP1
3 AC007 AL03 SAP2
4 AC007 PL03 SOS
And what I wanted is to look for the matching columns from data0
into data1
in column samples and then append the values from sub to columns in data0
.
The final output should look like this,
data_2
AL01_KLM1 AL02_SAP1 AL03_SAP2 PE03_LUD1 PL03_SOS
0 22 12 2 CTA15 3
1 22 13 2 CTA15 7
2 2 14 3 AC007 8
3 2 15 3 AC007 9
4 2 16 3 AC007 1
Any help or suggestions would be great.
Upvotes: 0
Views: 133
Reputation: 153510
You can merge the tables to create the concatenated column names and reshape:
df1 = data0.T.merge(data1, left_index=True, right_on='samples')
df1 = df1.assign(colname=df1['samples']+'_'+df1['sub'])
data_2 = df1.set_index('colname').T.drop(data1.columns).rename_axis(None,1))
Output:
AL01_KLM1 AL02_SAP1 AL03_SAP2 PE03_LUD1 PL03_SOS
0 22 12 2 CTA15 3
1 22 13 2 CTA15 7
2 2 14 3 AC007 8
3 2 15 3 AC007 9
4 2 16 3 AC007 1
Upvotes: 1
Reputation: 7536
I think this is what you want:
sub = data1.set_index('samples').loc[data0.columns]['sub']
data0.columns = [x + '_' + y for x, y in zip(data0.columns, sub)]
It just finds the matching sub
value for each sample
and appends it to the column name in data0
.
Upvotes: 1