Basj

Reputation: 46267

Rewrite a file in-place with Python

This may depend on the OS and on the hardware, but is there a way in Python to ask for a write operation on a file to happen "in place", i.e. at the same location as the original data, if possible on the same sectors on disk?

Example: let's say sensitivedata.raw, a 4 KB file, has to be encrypted:

with open('sensitivedata.raw', 'r+b') as f:  # read/write mode, binary so bytes are not translated
    s = f.read()
    cipher = encryption_function(s)  # same exact length as input s
    f.seek(0)
    f.write(cipher)  # how to ask this write operation to overwrite the original bytes?

Example 2: replace a file's content with null bytes of the same size, so that an undelete tool cannot recover it (of course, to do it properly we would need several passes, with random data and not only null bytes, but this is just to give the idea):

import os

with open('sensitivedata.raw', 'r+b') as f:  # binary mode: the file holds raw bytes
    s = f.read()
    f.seek(0)
    f.write(len(s) * b'\x00')  # totally inefficient but just to get the idea
os.remove('sensitivedata.raw')

PS: if it really depends a lot on the OS, I'm primarily interested in the Windows case.


Side question: if it's not possible on an SSD, does this mean that if you have ever written sensitive data as plaintext to an SSD (for example a plaintext password or a crypto private key), there is no way to be sure that this data is really erased? I.e., is the only solution to wipe 100% of the disk and fill it with random bytes over many passes? Is that correct?

Upvotes: 3

Views: 454

Answers (1)

ShadowRanger

Reputation: 155536

That's an impossible requirement to impose. On most spinning disk drives, this will happen automatically (there's no reason to write the new data elsewhere when it could just overwrite the existing data directly), but SSDs can't do this (when they claim to do so, they're lying to the OS).
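For the spinning-disk case, a minimal sketch of such an overwrite might look like the following. The names `overwrite_in_place` and `chunk_size` are made up for illustration. It opens the file without truncating it, writes zeros over the existing length, and then asks the OS to flush its cache to the device. This only requests the write at the same logical offsets; nothing here can force the filesystem or the drive to reuse the same physical sectors.

```python
import os


def overwrite_in_place(path, chunk_size=64 * 1024):
    """Overwrite every byte of `path` with zeros without changing its size.

    Hypothetical helper for illustration: it cannot guarantee that the
    same physical sectors are reused.
    """
    size = os.path.getsize(path)
    # 'r+b' opens the existing file for update without truncating it.
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\x00" * n)
            remaining -= n
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push its cache to the device
    return size
```

Writing in chunks avoids building one buffer as large as the file, and calling `os.fsync` before any later `os.remove` makes it less likely that the OS drops the pending write once the file is deleted.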

SSDs can't rewrite blocks; they can only erase a block, or write to an empty block. The implementation of a "rewrite" is to write to a new block (reading from the original block to fill out the new one if there isn't enough new data), then (eventually, because it's relatively expensive) erase the old block to make it available for a future write.
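To make that concrete, here is a toy model of the remapping, not how any real controller is implemented. The class `ToyFlash` and its methods are invented for illustration, and it ignores the fact that real flash erases in larger units than it writes. It only shows why the old bytes can still physically exist after a logical "overwrite".

```python
class ToyFlash:
    """Toy model of an SSD's remapping layer (illustrative only)."""

    def __init__(self, n_blocks):
        self.blocks = [None] * n_blocks  # None means erased (writable)
        self.mapping = {}                # logical block -> physical block index
        self.stale = set()               # physical blocks holding old data

    def write(self, logical, data):
        # A written block cannot be rewritten, so pick an erased one.
        new = self.blocks.index(None)    # raises ValueError if none is free
        self.blocks[new] = data
        old = self.mapping.get(logical)
        if old is not None:
            self.stale.add(old)          # old data stays physically present
        self.mapping[logical] = new

    def read(self, logical):
        return self.blocks[self.mapping[logical]]

    def garbage_collect(self):
        # The expensive erase happens later, in the background.
        for index in self.stale:
            self.blocks[index] = None
        self.stale.clear()


flash = ToyFlash(n_blocks=4)
flash.write(0, b"plaintext")
flash.write(0, b"encrypted")  # an "in-place" rewrite from the OS's point of view
leftover = [b for b in flash.blocks if b == b"plaintext"]
```

After the second write, reading logical block 0 returns the new data, yet `leftover` still contains the old bytes until `garbage_collect()` runs.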

Update addressing the side question: The only truly secure solution is to run your drive through a woodchipper, then crush the remains with a millstone. :-) Really, in most cases, the window of vulnerability on an SSD should be relatively short; erasing sectors is expensive, so even SSDs that don't honor TRIM typically do it in the background to ensure future (cheap) write operations aren't held up by (expensive) erase operations. This isn't really so bad when you think about it. Sure, the data is visible for a period of time after you logically erased it, but it was visible for a period of time before you erased it too, so all this does is extend the window of vulnerability by some amount (seconds, minutes, hours, or days, depending on the drive). The mistake was in storing sensitive data to permanent storage in the first place; even with extreme (woodchipper+millstone) solutions, someone else could have snuck in and copied the data before you thought to encrypt or destroy it.

Upvotes: 5
