Reputation: 278
I am trying to select a subset of a DataFrame based on the columns of another DataFrame.
The DataFrames look like this:
a b c d
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
3 12 13 14 15
a b
0 0 1
1 2 3
2 4 5
3 6 7
4 8 9
I want to get all rows of the first Dataframe for the columns which are included in both DataFrames. My result should look like this:
a b
0 0 1
1 4 5
2 8 9
3 12 13
Upvotes: 5
Views: 12357
Reputation: 119
import pandas as pd
data1=[[0,1,2,3,],[4,5,6,7],[8,9,10,11],[12,13,14,15]]
data2=[[0,1],[2,3],[4,5],[6,7],[8,9]]
df1 = pd.DataFrame(data=data1,columns=['a','b','c','d'])
df2 = pd.DataFrame(data=data2,columns=['a','b'])
df1[(df1.columns) & (df2.columns)]
Upvotes: 3
Reputation: 164793
You can use pd.Index.intersection
or its syntactic sugar &
:
intersection_cols = df1.columns & df2.columns
res = df1[intersection_cols]
Upvotes: 6