How to compare columns in pandas dataframes?

Question

I have two dataframes with different dimensions, e.g.:

df1
      A    B     
0     1    10
1     2    11
2     3    12
3     4    13
4     5    14

df2
      A    B     C
0     1    10    10
1     3    12    12
2     4    13    13

I know how to retrieve the elements that exist in both dataframes:

dfnew = df1.loc[df1.set_index(list(df1.columns)).index.isin(df2.set_index(list(df2.columns)).index)]

What I want instead is to retrieve only the names of the columns that exist in both dataframes and store them in a variable, as in this example:

a = ['A', 'B', 'C']

jezrael · Accepted Answer

I think you need union if you only want the union of the column names:

df1.columns.union(df2.columns).tolist()

Sample:

import pandas as pd

df1 = pd.DataFrame(columns=['A', 'B'])
df2 = pd.DataFrame(columns=['A', 'B', 'C'])

L = df1.columns.union(df2.columns).tolist()
print(L)
['A', 'B', 'C']
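
The question says "existing in both dataframes" but expects ['A', 'B', 'C'], which is the union. In case only the shared names are wanted, here is a minimal sketch using the data from the question. The variable names (all_cols, common_cols, only_in_df2) are just illustrative:

```python
import pandas as pd

# Data from the question
df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({"A": [1, 3, 4], "B": [10, 12, 13], "C": [10, 12, 13]})

# Names that appear in at least one dataframe
all_cols = df1.columns.union(df2.columns).tolist()

# Names that appear in both dataframes
common_cols = df1.columns.intersection(df2.columns).tolist()

# Names that appear in df2 but not in df1
only_in_df2 = df2.columns.difference(df1.columns).tolist()

print(all_cols)     # ['A', 'B', 'C']
print(common_cols)  # ['A', 'B']
print(only_in_df2)  # ['C']
```

Use union for the output requested in the question, and intersection if you really mean the columns common to both.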

Faster solution with numpy.union1d:

import numpy as np

L = np.union1d(df1.columns, df2.columns).tolist()
print(L)
['A', 'B', 'C']
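
Note that both approaches return the names sorted, which may not match the original column order. If the order of first appearance matters, one possible alternative (not part of the solutions above, just a sketch with made-up column orders) is to de-duplicate with a dict:

```python
import pandas as pd

# Hypothetical frames whose columns are not in alphabetical order
df1 = pd.DataFrame(columns=["B", "A"])
df2 = pd.DataFrame(columns=["C", "A"])

# Index.union sorts the result when the values are comparable
sorted_union = df1.columns.union(df2.columns).tolist()

# dict keys keep insertion order, so duplicates are dropped
# while the order of first appearance is preserved
ordered_union = list(dict.fromkeys([*df1.columns, *df2.columns]))

print(sorted_union)   # ['A', 'B', 'C']
print(ordered_union)  # ['B', 'A', 'C']
```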
