Reputation: 67
How do I do the following in Python: for me to compare a text file and excel file to see if they contain the same data? I have the following code (bottom) setup where it looks for just the model and if it is in the text file, it prints out the model, brand and type is in there but sometimes there are multiple same models (don't ask why) so how do I output that?
I would rather have it take one line/cell from the excel sheet and compare it to each line of the text file and print out both lines if it does contain the same data but I am unsure how to go about doing that. Also, if there is a blank line in excel, how would I go about just skipping it?
I have a text file with the following
Toyota sedan Corrola
Honda sedan Accord
Honda SUV CR-V
Toyota sedan Camry
Toyota sedan Avalon
Honda SUV Camry
and an excel file with the following:
Code:
import pandas as pd
file1 = 'cars.xls'
file2 = open("cars.txt", "r")
df = pd.read_excel(file1, header=0)
readfile = file2.read()
for count in range(len(df['Brand'])):
if not str(df['Brand'][count]):
print("Blank line" + str(count))
print()
continue
model = str(df['model'][count])
model2 = model.replace(' ', '') # get rid of spaces in case there are any
if model2 in readfile:
print(str(df['Brand'][count]) + str(df['type'][count]) + str(df['model'][count]) + " is in both Excel sheet and text file")
desired output: (no particular order) Toyota sedan Corrola is in both Excel sheet and text file Honda SUV CR-V is in both Excel sheet and text file Toyota sedan Camry is in both Excel sheet and text file Honda SUV Camry is in both Excel sheet and text file Nissan sedan Altima is not in both Excel sheet and text file
etc etc
Upvotes: 0
Views: 1750
Reputation: 270
Turn the contents of your text file into a dataframe, then check if they are equal to each other. You can adapt something like this:
df = pd.DataFrame({"Brand": ["Toyota","Honda",0,"Nissan"],
"type": ["sedan", "SUV", 0, "sedan"],
"model": ["Camry", "CR-V", 0, "Altima"]})
df_2 = pd.DataFrame({"Brand": ["Toyota","Honda","Honda","0"],
"type": ["sedan", "sedan", "SUV", "0"],
"model": ["Honda", "SUV", "CR-V", "0"]})
df == df_2
#output
Brand type model
0 True True False
1 True False False
2 False False False
3 False False False
Upvotes: 1