Reputation: 175
I have two dataframes with different dimensions, e.g.:
df1
A B
0 1 10
1 2 11
2 3 12
3 4 13
4 5 14
df2
A B C
0 1 10 10
1 3 12 12
2 4 13 13
I know how to retrieve the elements that exist in both dataframes:
dfnew = df1.loc[df1.set_index(list(df1.columns)).index.isin(df2.set_index(list(df2.columns)).index)]
What I want instead is to retrieve only the names of the columns that exist in both dataframes and store them in a variable, as in this example:
a = ['A', 'B', 'C']
Upvotes: 1
Views: 95
Reputation: 863741
I think you need a union.
If you only need the union of the column names:
df1.columns.union(df2.columns).tolist()
Sample:
import pandas as pd
df1 = pd.DataFrame(columns=['A', 'B'])
df2 = pd.DataFrame(columns=['A', 'B', 'C'])
L = df1.columns.union(df2.columns).tolist()
print(L)
['A', 'B', 'C']
Faster solution with numpy.union1d:
import numpy as np
L = np.union1d(df1.columns, df2.columns).tolist()
print(L)
['A', 'B', 'C']
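The question's wording ("existing in both dataframes") could also be read as asking for the intersection, while the expected output is the union. Here is a minimal sketch of both, assuming the sample data from the question; the names `all_cols` and `common_cols` are only illustrative:

```python
import pandas as pd

# sample frames rebuilt from the question
df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({'A': [1, 3, 4], 'B': [10, 12, 13], 'C': [10, 12, 13]})

# column names present in either dataframe (union)
all_cols = df1.columns.union(df2.columns).tolist()

# column names present in both dataframes (intersection)
common_cols = df1.columns.intersection(df2.columns).tolist()

print(all_cols)     # ['A', 'B', 'C']
print(common_cols)  # ['A', 'B']
```

If `['A', 'B', 'C']` is the wanted result, the union is the right operation; the intersection would drop `C`.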
Upvotes: 1
Reputation: 19957
# use set on the df columns to get the union:
list(set(df1.columns).union(set(df2.columns)))
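Note that a set has no guaranteed iteration order, so the list built from it may not come out as `['A', 'B', 'C']`. A small sketch that sorts the result to make it stable, assuming the column names are sortable strings (`cols` is just an illustrative name):

```python
import pandas as pd

df1 = pd.DataFrame(columns=['A', 'B'])
df2 = pd.DataFrame(columns=['A', 'B', 'C'])

# union of the column names as sets, then sorted for a deterministic order
cols = sorted(set(df1.columns) | set(df2.columns))
print(cols)  # ['A', 'B', 'C']
```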
Upvotes: 1