Reputation: 1249
We are archiving large and complex directories for clients into PAX format tarballs. Inspecting an archive with 7-Zip File Manager shows a PaxHeaders.PID entry that stores extended information (we create the archives under GNU/Linux using GNU tar with the environment variable POSIXLY_CORRECT=1 set).
We are considering adding --pax-option=exthdr.name=%d/PaxHeaders/%f,delete=atime,delete=ctime to the tar arguments to make the archives more reproducible, as suggested in this article.
My question is: regardless of whether we use %d/PaxHeaders/%f or the default %d/PaxHeaders.%p/%f, what happens if real files matching that exact naming scheme already exist in the source directory? Will that cause problems in the tarball? Additionally, does deleting atime and ctime violate POSIX 'correctness'? We want the archives to be reproducible but also POSIXly correct, if that is even possible (and if not, as close to both as possible).
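To make the collision question concrete, here is a minimal sketch that uses Python's standard tarfile module rather than GNU tar, so it says nothing definitive about GNU tar's own behaviour. It writes a pax archive containing a real file at dir/PaxHeaders/foo plus a second file that carries an extended header, then walks the raw 512-byte header blocks. As far as I understand the format, extended header records are marked by typeflag 'x' (or 'g' for global ones) and not by their name, which is what the listing is meant to show. The paths, the "comment" record, and the helper names build_archive and list_headers are made up for illustration, and Python names its extended headers differently from GNU tar.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Build a small pax-format archive in memory (illustrative paths only)."""
    buf = io.BytesIO()
    payload = b"real file\n"
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # A real file whose path imitates the extended-header naming scheme.
        decoy = tarfile.TarInfo("dir/PaxHeaders/foo")
        decoy.size = len(payload)
        tar.addfile(decoy, io.BytesIO(payload))

        # A file with an explicit pax record, which forces an extended header.
        member = tarfile.TarInfo("dir/foo")
        member.size = len(payload)
        member.pax_headers = {"comment": "demo"}
        tar.addfile(member, io.BytesIO(payload))
    return buf.getvalue()


def list_headers(data: bytes) -> list:
    """Return (typeflag, name) for every header block in a ustar/pax stream."""
    entries = []
    offset = 0
    while offset + 512 <= len(data):
        block = data[offset:offset + 512]
        if block == b"\0" * 512:  # end-of-archive marker
            break
        name = block[0:100].split(b"\0", 1)[0].decode("utf-8")
        size = int(block[124:136].rstrip(b"\0 ") or b"0", 8)  # octal size field
        typeflag = block[156:157].decode("ascii")
        entries.append((typeflag, name))
        # Skip the header block plus the data, padded to a 512-byte boundary.
        offset += 512 + ((size + 511) // 512) * 512
    return entries


if __name__ == "__main__":
    archive = build_archive()
    for typeflag, name in list_headers(archive):
        print(typeflag, name)
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
        print(tar.getnames())
```

In this sketch the decoy file is stored as an ordinary member (typeflag '0') and the extended header is a separate 'x' record, so a reader that dispatches on typeflag should not confuse the two. Whether every extraction tool a client might use behaves that way is a separate matter that this does not test.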
The most detailed explanation we can find of why the PID is included at all is in the pax manual page on The Open Group's site:
IEEE Std 1003.1-2001/Cor 1-2002, item XCU/TC1/D6/35 is applied, adding the process ID of the pax process into certain fields. This change provides a method for the implementation to ensure that different instances of pax extracting a file named /a/b/foo will not collide when processing the extended header information associated with foo.
However, this makes little sense to me. We only use one process to create the PAX tarball, and we don't know how the client will use it.
Upvotes: 0
Views: 69