Printed path does not match path given to tar

Question

I feed a list of files that need to be backed up into a function that tars them.

The list is a result of a comparison of two text files that contain checksums.

How the files are made:

hash = hashlib.md5(path + '/' + file).hexdigest()
f.write('{} - {}'.format(hash, path + '/' + file) + '\n')

How they are compared:

with open(tmpfile, 'r') as f1:
    with open(storagefile, 'r') as f2:
        diff = set(f1).difference(f2)

I get the following error when tarring:

[Errno 2] No such file or directory: '/XXX/XXX/XXXX/XXXX/Trash/files/hihi\n'

Notice the ' and the \n in the filename.

If I print the path before tarring, there is no trace of the ' and the \n:

/XXX/XXX/XXXX/XXXX/Trash/files/hihi

Does someone have an idea why this is happening or how to fix it? Maybe I should use a stream writer instead of having to rely on \n?

Warren Weckesser · Accepted Answer

When the file is read with set(f1), the lines read from the file include the newlines (similar to f1.readlines()).

For example:

In [5]: !cat foo.txt
foo
bar
baz

In [6]: with open('foo.txt', 'r') as f:
   ...:     s = set(f)
   ...:     

In [7]: s
Out[7]: {'bar\n', 'baz\n', 'foo\n'}
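
This is also why printing the path shows nothing unusual: print() writes the trailing newline as an ordinary line break, while the error message shows the repr of the string. A minimal sketch, using a hypothetical path in place of the real one:

```python
# Hypothetical path standing in for one line read from the checksum file.
path = '/tmp/Trash/files/hihi\n'

# print() emits the newline as a line break, so it is easy to overlook.
print(path)

# repr() spells out the escape sequence, as the error message does.
shown = repr(path)
print(shown)
```

Printing repr(path) while debugging makes stray whitespace visible immediately.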

There are many ways you could fix this. For example, use:

        diff = {name.rstrip('\n') for name in set(f1).difference(f2)}

That should work fine if the files are always created using the code shown in the question. If eventually you might read files created elsewhere, you should be safe and strip the newline characters before putting the lines into sets. That will avoid the potential problem of a file not having a final newline.
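
A minimal sketch of that safer variant, stripping each line before it goes into a set. The helper names read_lines and changed_entries are hypothetical, not part of the question's code:

```python
def read_lines(filename):
    """Return the set of lines in filename, without trailing newlines."""
    with open(filename, 'r') as f:
        return {line.rstrip('\n') for line in f}


def changed_entries(tmpfile, storagefile):
    """Return entries present in tmpfile but missing from storagefile."""
    return read_lines(tmpfile) - read_lines(storagefile)
```

Because the newline is removed before the comparison, a last line without a final newline still matches the same entry in the other file.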
