Reputation: 1083
I have two data frames as follows.
df1:
A | B | C | D | E | F
---|---|---|---|---|---
1 | 2 | 3 | x | y | z
df2:
A | B | C | G | H | I
---|---|---|---|---|---
4 | 5 | 6 | p | q | r
I want to build three data frames from these two, like this.
ndf1:
A | B | C | ID
---|---|---|---
1 | 2 | 3 | id_0
4 | 5 | 6 | id_1
ndf2:
ID | D | E | F
---|---|---|---
id_0 | x | y | z
ndf3:
ID | G | H | I
---|---|---|---
id_1 | p | q | r
My questions are: is this workflow possible with Python and pandas, and what would be the best approach?
Upvotes: 0
Views: 137
Reputation: 26221
I will differ a bit from your formulation, in order to rely on the index of df1, resp. df2, to indicate where the rows come from (instead of creating id_0 etc.). If necessary, the index of df1 and df2 can be set to uniquely indicate provenance.
With this, you can do:
import pandas as pd
ix1 = df1.columns.difference(df2.columns)    # unique to df1
ix2 = df2.columns.difference(df1.columns)    # unique to df2
ix3 = df1.columns.intersection(df2.columns)  # common
u1 = df1[ix1]
u2 = df2[ix2]
common = pd.concat([df1[ix3], df2[ix3]])     # common columns, rows of df1 then df2
On your example data:
>>> u1
D E F
0 x y z
>>> u2
G H I
0 p q r
>>> common
A B C
0 1 2 3
0 4 5 6
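For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The way df1 and df2 are constructed is an assumption (integers for A, B, C and strings for the other columns); the rest follows the snippet above.

```python
import pandas as pd

# Example frames from the question; the dtypes are assumed, not given in the post.
df1 = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": ["x"], "E": ["y"], "F": ["z"]})
df2 = pd.DataFrame({"A": [4], "B": [5], "C": [6], "G": ["p"], "H": ["q"], "I": ["r"]})

ix1 = df1.columns.difference(df2.columns)    # columns only in df1
ix2 = df2.columns.difference(df1.columns)    # columns only in df2
ix3 = df1.columns.intersection(df2.columns)  # columns present in both

u1 = df1[ix1]
u2 = df2[ix2]
# Both frames keep their original index, so the label 0 appears twice here.
common = pd.concat([df1[ix3], df2[ix3]])

print(u1, u2, common, sep="\n\n")
```

Note that Index.difference returns its result sorted by default, so the unique columns may come back in a different order than in the original frame if they were not already sorted.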
Addendum
Here is an example of how to modify the index of df1 and df2 to indicate provenance unambiguously:
index1 = [f'df1_{i}' for i in df1.index]
index2 = [f'df2_{i}' for i in df2.index]
u1 = df1.set_axis(index1)[ix1]
u2 = df2.set_axis(index2)[ix2]
common = pd.concat([df1.set_axis(index1)[ix3], df2.set_axis(index2)[ix3]])
And now:
>>> u1
D E F
df1_0 x y z
>>> u2
G H I
df2_0 p q r
>>> common
A B C
df1_0 1 2 3
df2_0 4 5 6
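If you do need the exact layout from the question, with an explicit ID column holding id_0, id_1, and so on, something along these lines may work. This is only a sketch and a variation on the approach above, not part of it: the name stacked and the construction of the example frames are assumptions made for illustration.

```python
import pandas as pd

# Example frames from the question; the dtypes are assumed.
df1 = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": ["x"], "E": ["y"], "F": ["z"]})
df2 = pd.DataFrame({"A": [4], "B": [5], "C": [6], "G": ["p"], "H": ["q"], "I": ["r"]})

shared = df1.columns.intersection(df2.columns)  # A, B, C

# Stack the frames; ignore_index=True yields one running index 0..n-1,
# which is then turned into the id_<n> labels.
stacked = pd.concat([df1, df2], ignore_index=True)
stacked.insert(0, "ID", [f"id_{i}" for i in stacked.index])

ndf1 = stacked[[*shared, "ID"]]
# The first len(df1) rows came from df1, the remaining rows from df2.
ndf2 = stacked.iloc[: len(df1)][["ID", *df1.columns.difference(shared)]]
ndf3 = stacked.iloc[len(df1):][["ID", *df2.columns.difference(shared)]]

print(ndf1, ndf2, ndf3, sep="\n\n")
```

The ID values here depend only on row position after stacking, so they stay unique even when df1 and df2 have overlapping index labels.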
Upvotes: 1
Reputation: 85
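You could use pd.DataFrame.merge to merge df1 and df2 on the shared columns A, B, and C. Pass how='outer': the default how='inner' would return no rows here, because the two frames have no key values in common. Then simply drop the unwanted columns from the merged frame to create ndf1.
To create ndf2 and ndf3, just drop the A, B, and C columns from df1 and df2 using pd.DataFrame.drop. Note that this does not add the ID column by itself.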
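A rough sketch of that idea, assuming the example frames are built as below and that the merge is an outer join on the shared columns (the names key_cols and merged are only for illustration):

```python
import pandas as pd

# Example frames from the question; the dtypes are assumed.
df1 = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": ["x"], "E": ["y"], "F": ["z"]})
df2 = pd.DataFrame({"A": [4], "B": [5], "C": [6], "G": ["p"], "H": ["q"], "I": ["r"]})

key_cols = ["A", "B", "C"]  # columns shared by both frames

# An outer merge keeps rows from both frames; columns missing on one side become NaN.
merged = df1.merge(df2, on=key_cols, how="outer")

# Drop everything except the shared columns to get the A/B/C part of ndf1.
ndf1 = merged.drop(columns=["D", "E", "F", "G", "H", "I"])

# Drop the shared columns to get the parts unique to each frame.
ndf2 = df1.drop(columns=key_cols)
ndf3 = df2.drop(columns=key_cols)

print(ndf1, ndf2, ndf3, sep="\n\n")
```

The ID column from the question still has to be added separately, for example from the index of each frame.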
FYI
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop.html
Upvotes: 0