Reputation: 2747
I have a list that contains three pandas DataFrames. All the DataFrames have exactly the same column names and the same length. I would like to compare the entries of specific columns across the DataFrames. Assume the list is:
List = [df1, df2, df3]
and each DataFrame has the following structure. df1 has the structure
column1 column2 column3
4 3 4
4 5 7
7 6 6
8 6 4
df2 has the structure
column1 column2 column3
4 3 4
7 5 7
7 6 5
8 6 4
df3 has the structure
column1 column2 column3
4 3 5
4 1 7
7 6 6
8 6 4
I would like to compare the contents of column1 and column2 of df1, row by row, with column1 and column2 of df2 and of df3.
I thought about something like this:
for i in range(len(List)):  # iterate through the list
    for j in range(len(List[0].index)):  # iterate through the rows of the DataFrame
        # I would like to do something like:
        #     if df1['column1'][j] == df2['column1'][j]: then do ...
        # Now I don't know how to iterate through all the DataFrames simultaneously
        # to compare column1 and column2 (for each row j) of df1 with
        # column1 and column2 of df2 and column1 and column2 of df3.
        pass
I am stuck there
Upvotes: 1
Views: 129
Reputation: 548
First, create the DataFrames with the data provided:
import pandas as pd
df1 = pd.DataFrame({
    'column1': [4, 4, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 6, 4],
})
print(df1)
#    column1  column2  column3
# 0        4        3        4
# 1        4        5        7
# 2        7        6        6
# 3        8        6        4
df2 = df1.copy()
# Assign with .loc; chained indexing such as df2['column1'][1] = 7 is unreliable
df2.loc[1, 'column1'] = 7
df2.loc[2, 'column3'] = 5
print(df2)
#    column1  column2  column3
# 0        4        3        4
# 1        7        5        7
# 2        7        6        5
# 3        8        6        4
df3 = df1.copy()
# Assign with .loc; chained indexing such as df3['column2'][1] = 1 is unreliable
df3.loc[1, 'column2'] = 1
df3.loc[0, 'column3'] = 5
print(df3)
#    column1  column2  column3
# 0        4        3        5
# 1        4        1        7
# 2        7        6        6
# 3        8        6        4
Then, to get a DataFrame of the same shape, with boolean values indicating which entries are equal in both DataFrames:
print(df1.eq(df2))
#    column1  column2  column3
# 0     True     True     True
# 1    False     True     True
# 2     True     True    False
# 3     True     True     True
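If you would rather see the differing values than a boolean mask, DataFrame.compare may help. This is only a sketch: it assumes pandas 1.1 or later (where compare was introduced) and rebuilds df1 and df2 from the question so it can run on its own.

```python
import pandas as pd

# df1 and df2 rebuilt from the question's data
df1 = pd.DataFrame({
    'column1': [4, 4, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 6, 4],
})
df2 = pd.DataFrame({
    'column1': [4, 7, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 5, 4],
})

# Keeps only the rows and columns that contain at least one difference;
# each remaining column is split into 'self' (df1) and 'other' (df2),
# and entries that are equal are shown as NaN.
diff = df1.compare(df2)
print(diff)
```

Here only rows 1 and 2 and columns column1 and column3 survive, since those are the only places where df1 and df2 disagree.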
To get a Series of booleans indicating, for each column, whether all of its rows are equal in both DataFrames:
print(df1.eq(df2).all())
# column1 False
# column2 True
# column3 False
# dtype: bool
To get a Series of booleans indicating, for each row, whether all of its columns are equal in both DataFrames:
print(df1.eq(df2).all(axis='columns'))
# 0 True
# 1 False
# 2 False
# 3 True
# dtype: bool
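Since the question only cares about column1 and column2, the same row-wise check can be restricted to those columns. A minimal sketch, with the three DataFrames rebuilt from the question; the names cols, match_12 and match_13 are just illustrative:

```python
import pandas as pd

# DataFrames rebuilt from the question's data
df1 = pd.DataFrame({
    'column1': [4, 4, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 6, 4],
})
df2 = pd.DataFrame({
    'column1': [4, 7, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 5, 4],
})
df3 = pd.DataFrame({
    'column1': [4, 4, 7, 8],
    'column2': [3, 1, 6, 6],
    'column3': [5, 7, 6, 4],
})

cols = ['column1', 'column2']  # the columns of interest

# True for each row where both selected columns match df1
match_12 = df1[cols].eq(df2[cols]).all(axis='columns')
match_13 = df1[cols].eq(df3[cols]).all(axis='columns')

print(match_12.tolist())  # [True, False, True, True]
print(match_13.tolist())  # [True, False, True, True]
```

Row 1 fails in both cases: df2 differs from df1 in column1 and df3 differs in column2. The differences in column3 are ignored because that column is not selected.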
To get a single boolean indicating whether all corresponding entries are equal in both DataFrames:
print(df1.equals(df2))
# False
If you need to compare every pair of DataFrames in the list, you can use itertools.combinations:
from itertools import combinations
List = [df1, df2, df3]
# enumerate(List, 1) pairs each DataFrame with its 1-based position in the list
for (i, left), (j, right) in combinations(enumerate(List, 1), 2):
    print(f'df{i}.equals(df{j}):', left.equals(right))
# df1.equals(df2): False
# df1.equals(df3): False
# df2.equals(df3): False
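To get closer to what the question asks (comparing column1 and column2 of df1 row by row against every other DataFrame in the list), one possible approach is to treat the first DataFrame as the reference and collect one boolean column per comparison. This is a sketch, not the only way to do it; frames, cols, matches, all_agree and disagreeing are hypothetical names, and it assumes all DataFrames share the same index.

```python
import pandas as pd

# DataFrames rebuilt from the question's data
df1 = pd.DataFrame({
    'column1': [4, 4, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 6, 4],
})
df2 = pd.DataFrame({
    'column1': [4, 7, 7, 8],
    'column2': [3, 5, 6, 6],
    'column3': [4, 7, 5, 4],
})
df3 = pd.DataFrame({
    'column1': [4, 4, 7, 8],
    'column2': [3, 1, 6, 6],
    'column3': [5, 7, 6, 4],
})

frames = [df1, df2, df3]
cols = ['column1', 'column2']  # the columns of interest

# Compare the first DataFrame against each of the others, row by row
reference = frames[0][cols]
matches = pd.DataFrame({
    f'df1_vs_df{i}': reference.eq(other[cols]).all(axis='columns')
    for i, other in enumerate(frames[1:], start=2)
})

# A row agrees everywhere only if it matches the reference in every comparison
all_agree = matches.all(axis='columns')
disagreeing = all_agree[~all_agree].index.tolist()

print(matches)
print(disagreeing)  # [1]
```

Because equality is transitive, a row that matches df1 in every comparison is identical across all DataFrames for the selected columns, so there is no need to loop over rows explicitly.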
Upvotes: 0