Reputation: 1096
I have a parent table df0
, with two levels code Code1
, Code2
, and Message
. Multiple child tables as df1
which stores lower level Code2
and Messages
. As you can see, "df1"
is indicated as index in table df0
.
import pandas as pd
s1 = pd.Series([1, 0, 'A'])
s2 = pd.Series([2, 'df1', 'B'])
s3 = pd.Series([3, 0, 'C'])
df0 = pd.DataFrame([list(s1), list(s2), list(s3)], columns = ["Code1", "Code2", "Message"])
df0
Code1 Code2 Message
0 1 0 A
1 2 df1 B
2 3 0 C
s4 = pd.Series([0, 'B1'])
s5 = pd.Series([1, 'B2'])
s6 = pd.Series([2, 'B3'])
df1 = pd.DataFrame([list(s4), list(s5), list(s6)], columns = ["Code2", "Message"])
df1
Code2 Message
0 0 B1
1 1 B2
2 2 B3
I want to have a larger table essential put df1
back to df0
and make a flat merged table. Thank you for your help.
Code1 Code2 Message
0 1 0 A
1 2 B
2 2 0 B1
3 2 1 B2
4 2 2 B3
5 3 0 C
Upvotes: 2
Views: 486
Reputation: 1670
Use merge
. The below provides an intermediate result that can be further processed to obtain the desired output.
import pandas as pd
import numpy as np
print(pd.__version__)
s1 = pd.Series([1, 0, 'A'])
s2 = pd.Series([2, 'df1', 'B'])
s3 = pd.Series([3, 0, 'C'])
df0 = pd.DataFrame([list(s1), list(s2), list(s3)], columns = ["Code1", "Code2", "Message"])
s4 = pd.Series([0, 'B1'])
s5 = pd.Series([1, 'B2'])
s6 = pd.Series([2, 'B3'])
df1 = pd.DataFrame([list(s4), list(s5), list(s6)], columns = ["Code2", "Message"])
join_key = 'df1'
df1[join_key] = join_key
df2 = pd.concat([df0[df0['Code2'] == join_key], df1], sort=True)
result = df0.merge(df2, left_on=['Code2'], right_on=['df1'], how='outer', suffixes=('_l', '_r'))
print (result)
Result
0.24.2
Code1_l Code2_l Message_l Code1_r Code2_r Message_r df1
0 1.0 0 A NaN NaN NaN NaN
1 3.0 0 C NaN NaN NaN NaN
2 2.0 df1 B NaN 0 B1 df1
3 2.0 df1 B NaN 1 B2 df1
4 2.0 df1 B NaN 2 B3 df1
5 NaN NaN NaN 2.0 df1 B NaN
Upvotes: 3