KinWolf

Reputation: 175

How to compare the columns of two pandas DataFrames?

I have two DataFrames with different dimensions, for example:

df1
      A    B     
0     1    10
1     2    11
2     3    12
3     4    13
4     5    14

df2
      A    B     C
0     1    10    10
1     3    12    12
2     4    13    13

I know how to retrieve the elements that exist in both DataFrames:

dfnew = df1.loc[df1.set_index(list(df1.columns)).index.isin(df2.set_index(list(df2.columns)).index)]
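For reference, a minimal sketch of one way to do that row matching when the two frames have different columns. It swaps in an inner merge restricted to the shared columns rather than the isin call above, and the name common is only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({"A": [1, 3, 4], "B": [10, 12, 13], "C": [10, 12, 13]})

# Column names present in both frames (hypothetical helper name)
common = df1.columns.intersection(df2.columns).tolist()

# Keep only the rows of df1 whose values in the shared columns also occur in df2;
# drop_duplicates avoids repeating df1 rows if df2 has repeated combinations
dfnew = df1.merge(df2[common].drop_duplicates(), on=common, how="inner")
print(dfnew)
```

Note that merge returns a fresh 0-based index, so the original row labels of df1 are not kept.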

What I want instead is to retrieve only the names of the columns that exist in the two DataFrames and store them in a variable, as in this example:

a = ['A', 'B', 'C']

Upvotes: 1

Views: 95

Answers (2)

jezrael

Reputation: 863741

I think you need Index.union if you only want the union of the column names:

df1.columns.union(df2.columns).tolist()

Sample:

import pandas as pd

df1 = pd.DataFrame(columns=['A', 'B'])
df2 = pd.DataFrame(columns=['A', 'B', 'C'])

# Union of both column indexes, converted to a plain list
L = df1.columns.union(df2.columns).tolist()
print(L)
['A', 'B', 'C']

Faster solution with numpy.union1d:

import numpy as np

# np.union1d returns the sorted unique values of both inputs
L = np.union1d(df1.columns, df2.columns).tolist()
print(L)
['A', 'B', 'C']
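Since the question says "existing in both" but the expected output is the union, here is a small sketch contrasting the two readings with the question's data. The variable names are illustrative only:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({"A": [1, 3, 4], "B": [10, 12, 13], "C": [10, 12, 13]})

# Columns found in either frame
all_cols = df1.columns.union(df2.columns).tolist()
# Columns found in both frames
shared_cols = df1.columns.intersection(df2.columns).tolist()
# Columns found in exactly one of the frames
only_one = df1.columns.symmetric_difference(df2.columns).tolist()

print(all_cols)     # ['A', 'B', 'C']
print(shared_cols)  # ['A', 'B']
print(only_one)     # ['C']
```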

Upvotes: 1

Allen Qin

Reputation: 19957

# Use sets of the column names to get their union (the resulting order is not guaranteed):
list(set(df1.columns).union(set(df2.columns)))
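Because a set has no defined order, the list above may come back in any order. If the order matters, something like the following sketch may help. It assumes the column names are hashable and, for the sorted variant, mutually comparable; the frames here are made up to show the ordering:

```python
import pandas as pd

df1 = pd.DataFrame(columns=["B", "A"])
df2 = pd.DataFrame(columns=["C", "B", "A"])

# dict keys keep first-seen order, so df1's columns come first,
# followed by any new names from df2
ordered_union = list(dict.fromkeys([*df1.columns, *df2.columns]))

# Alternatively, sort the set union for a deterministic result
sorted_union = sorted(set(df1.columns) | set(df2.columns))

print(ordered_union)  # ['B', 'A', 'C']
print(sorted_union)   # ['A', 'B', 'C']
```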

Upvotes: 1
