Reputation: 511
I want to use Python to compare text files that have the same name and the same relative path inside two different zip files.
I have searched for ways to do this, and none of the top solutions I found works in my case.
My code:
from zipfile import ZipFile
from pathlib import Path

with ZipFile(zip_path1) as z1, ZipFile(zip_path2) as z2:
    file1_paths = [Path(filepath) for filepath in z1.namelist()]
    file2_paths = [Path(filepath) for filepath in z2.namelist()]
    cmn = list(set(file1_paths).intersection(set(file2_paths)))
    common_files = [filepath for filepath in cmn if str(filepath).endswith(('.txt', '.sh'))]
    for f in common_files:
        with z1.open(f, 'r') as f1, z2.open(f, 'r') as f2:
            if f1.read() != f2.read():  # Also used io.TextIOWrapper(f1).read() here
                print('Difference found for {filepath}'.format(filepath=str(f)))
Note:
I have used pathlib for the paths here. On the line with z1.open(f, 'r'), if I pass pathlib paths instead of hardcoding the path, I get <class 'KeyError'>: "There is no item named WindowsPath('SomeFolder/somefile.txt') in the archive".
Moreover, even if I hardcode the path, the data read from each file for the comparison always comes back empty, so the comparison does not actually work.
I am stuck on this and any help is much appreciated.
Upvotes: 0
Views: 407
Reputation: 11060
You should be able to achieve this without using Path, since the paths are specific to the zip file and don't need to be treated in an OS-specific way. The strings returned by namelist() can be used both for comparison and as arguments to open(), as follows:
from zipfile import ZipFile

with ZipFile(zip_path1) as z1, ZipFile(zip_path2) as z2:
    common_files = [x for x in set(z1.namelist()).intersection(set(z2.namelist())) if x.endswith('.txt') or x.endswith('.sh')]
    # print(common_files)
    for f in common_files:
        with z1.open(f) as f1, z2.open(f) as f2:
            if f1.read() != f2.read():
                print('Difference found for {filepath}'.format(filepath=str(f)))
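If you would rather collect the results than print them, the same idea can be wrapped in a small function. This is only a sketch: the name differing_members and its suffixes parameter are made up for illustration, and it assumes the members are small enough to read fully into memory. It keeps the member names as the plain strings returned by namelist() and compares raw bytes with ZipFile.read().

```python
from zipfile import ZipFile


def differing_members(zip_path1, zip_path2, suffixes=('.txt', '.sh')):
    """Return the sorted names of common members whose bytes differ.

    Hypothetical helper, not part of the zipfile module.
    """
    suffixes = tuple(suffixes)  # str.endswith() needs a str or a tuple
    with ZipFile(zip_path1) as z1, ZipFile(zip_path2) as z2:
        # Keep the names exactly as namelist() returns them (plain str,
        # forward slashes) so they can be passed straight back to the archive.
        common = set(z1.namelist()) & set(z2.namelist())
        candidates = sorted(n for n in common if n.endswith(suffixes))
        # ZipFile.read(name) returns the member's whole content as bytes.
        return [n for n in candidates if z1.read(n) != z2.read(n)]
```

Usage would look like differing_members(zip_path1, zip_path2), which returns a list of member names. As for the empty buffer in the question: one possible cause (a guess, since that code isn't shown) is calling read() a second time on the same open handle, which returns an empty result once the stream has been consumed.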
Upvotes: 1