kyle k

Reputation: 5522

Python securely remove file

How can I securely remove a file using Python? The function os.remove(path) only removes the directory entry, but I want to securely remove the file, similar to the Apple feature called "Secure Empty Trash", which overwrites the file with random data.

What function securely removes a file using this method?

Upvotes: 9

Views: 12191

Answers (8)

anonguest222

Reputation: 1

import os
import random

def secure_delete(file_path, overwrite_method='random', passes=3):
    try:
        if overwrite_method not in ('zeros', 'random'):
            raise ValueError("Invalid overwrite method. Choose 'zeros' or 'random'.")

        # Size of the data to overwrite
        length = os.path.getsize(file_path)

        # Open in 'rb+' so the file is overwritten in place ('wb' would truncate it first)
        with open(file_path, 'rb+') as file_to_overwrite:
            for _ in range(passes):
                # Start every pass from the beginning of the file
                file_to_overwrite.seek(0)
                if overwrite_method == 'zeros':
                    file_to_overwrite.write(bytes(length))
                else:
                    file_to_overwrite.write(os.urandom(length))
                # Flush each pass to disk before starting the next one
                file_to_overwrite.flush()
                os.fsync(file_to_overwrite.fileno())

        # Rename to a random name of the same length, in the same directory
        directory, basename = os.path.split(file_path)
        newname = os.path.join(directory, ''.join(str(random.randint(0, 9)) for _ in basename))
        os.rename(file_path, newname)
        # Delete the renamed file
        os.remove(newname)

        print(f"File {file_path} has been securely deleted.")

    except Exception as e:
        print(f"An error occurred during secure deletion: {e}")

# Example usage
if __name__ == "__main__":
    file_to_delete = input("Enter the path to the file you want to delete:\n>>> ")
    overwrite_method = input("Choose the overwrite method (random / zeros):\n>>> ").lower()
    while overwrite_method not in ['random', 'zeros']:
        overwrite_method = input("Invalid input. Please choose 'random' or 'zeros':\n>>> ").lower()
    passes = int(input("Enter the number of overwrite passes:\n>>> "))

    secure_delete(file_to_delete, overwrite_method=overwrite_method, passes=passes)

Upvotes: 0

ideasman42

Reputation: 48218

This secure delete function uses unbuffered file access and writes random data in fixed-size chunks to prevent memory errors with very large files.

import os

def secure_delete(
    filepath: str,
    passes: int = 1,
    chunk_size: int = 1 << 18,  # 256KB.
) -> None:
    with open(filepath, "ba+", buffering=0) as fh:
        path_size = fh.tell()
    with open(filepath, "br+", buffering=0) as fh:
        for _ in range(passes):
            fh.seek(0)
            offset_next = 0
            while (offset := offset_next) < path_size:
                offset_next = min(offset + chunk_size, path_size)
                fh.write(os.urandom(offset_next - offset))
            assert offset == path_size
    os.remove(filepath)
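
One thing the function above leaves to the operating system is when the overwritten blocks actually reach the device. A minimal sketch of a chunked overwrite that also calls os.fsync() after every pass is shown below. The name overwrite_and_sync is hypothetical, it uses default buffering plus flush() rather than unbuffered access, and it only overwrites (it does not remove the file). It cannot help on copy-on-write or log-structured file systems.

```python
import os


def overwrite_and_sync(path: str, passes: int = 1, chunk_size: int = 1 << 18) -> int:
    """Overwrite `path` in place with random data and return its size in bytes.

    Hypothetical helper: it does not delete the file.
    """
    size = os.path.getsize(path)
    with open(path, "rb+") as fh:
        for _ in range(passes):
            fh.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                fh.write(os.urandom(n))
                remaining -= n
            # Empty Python's buffer, then ask the OS to push the data to the device
            fh.flush()
            os.fsync(fh.fileno())
    return size
```

Calling os.remove(path) afterwards completes the deletion.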

Upvotes: 0

Maxime

Reputation: 1222

If you are looking for performance, shred is the fastest option. srm does 38 passes, whereas 3 passes are considered secure by some standards. A pure-Python implementation is much slower than shred.

In my tests (time in seconds for 50 files, on a slow WSL machine):

shred: 2.66
pure Python: 22.70
srm: 34.54

import subprocess

# 2 random passes (-n 2), a final pass of zeros (-z), then remove the file (-u)
subprocess.check_call(['shred', '-n', '2', '-z', '-u', file_path])

Upvotes: 0

kindall

Reputation: 184365

You can very easily write a function in Python to overwrite a file with random data, even repeatedly, then delete it. Something like this:

import os

def secure_delete(path, passes=1):
    with open(path, "ba+") as delfile:
        length = delfile.tell()
    with open(path, "br+") as delfile:
        for i in range(passes):
            delfile.seek(0)
            delfile.write(os.urandom(length))
    os.remove(path)

Shelling out to srm is likely to be faster, however.

Upvotes: 7

rafagarci

Reputation: 97

The answers implementing a manual solution did not work for me. My solution is as follows; it seems to work okay.

import os

def secure_delete(path, passes=1):
    length = os.path.getsize(path)
    with open(path, "br+", buffering=-1) as f:
        for i in range(passes):
            f.seek(0)
            f.write(os.urandom(length))
    # The with block closes the file; remove it once it has been overwritten
    os.remove(path)

Upvotes: 0

phealy3330

Reputation: 11

At least in Python 3, @kindall's solution only appended for me: the entire contents of the file were still intact, and every pass just added to the overall size of the file. It ended up being [Original Contents][Random Data of that Size][Random Data of that Size][Random Data of that Size], which is obviously not the desired effect.

This trick worked for me, though. I open the file in append mode to find the length, then reopen it in r+ so that I can seek to the beginning (in append mode, the undesired effect seems to be caused by it not actually being possible to seek to 0).

So check this out:

import os

def secure_delete(path, passes=3):
    # Open in append mode only to find the file length
    with open(path, "ba+", buffering=0) as delfile:
        length = delfile.tell()
    # Reopen in r+ so that seeking back to the start actually works
    with open(path, "br+", buffering=0) as delfile:
        #print("Length of file:%s" % length)
        for i in range(passes):
            delfile.seek(0, 0)
            delfile.write(os.urandom(length))
            #wait = input("Pass %s Complete" % i)
        #wait = input("All %s Passes Complete" % passes)
        # Final pass: overwrite everything with zeros in a single write
        delfile.seek(0)
        delfile.write(bytes(length))
        #wait = input("Final Zero Pass Complete")
    # Note: the real shred also renames the file to all zeros (matching the
    # length of the filename) to thwart filename metadata collection; I didn't
    # care to implement that here.
    os.remove(path)

Uncomment the prompts to check the file after each pass. This looked good when I tested it, with the caveat that the filename is not shredded the way the real shred -zu does it.
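
The append-mode behaviour described in this answer can be reproduced with a small, self-contained sketch. The helper names write_at_start and demo are hypothetical, and the sketch assumes the usual platform behaviour in which a file opened in append mode sends every write to the end of the file, whatever the current seek position is.

```python
import os
import tempfile


def write_at_start(path: str, mode: str, data: bytes) -> bytes:
    """Open `path` in `mode`, seek to 0, write `data`, and return the file's contents."""
    with open(path, mode) as fh:
        fh.seek(0)
        fh.write(data)
    with open(path, "rb") as fh:
        return fh.read()


def demo() -> tuple:
    """Compare append mode with r+ mode on a throwaway temporary file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(b"secret")
        # Append mode: the seek is ignored for writing, so the data lands at the end
        appended = write_at_start(path, "ab+", b"XX")
        # r+ mode: the write really starts at offset 0 and overwrites in place
        overwritten = write_at_start(path, "rb+", b"XX")
        return appended, overwritten
    finally:
        os.remove(path)
```

In the first result the original bytes are still intact with the new data after them; in the second, the first two bytes have been replaced.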

Upvotes: 1

Dima Tisnek

Reputation: 11779

You can use srm, but you can also easily implement it in Python. Refer to Wikipedia for the data patterns to overwrite the file content with. Note that, depending on the actual storage technology, suitable data patterns may be quite different. Furthermore, if your file is located on a log-structured file system, or on a file system with copy-on-write optimisation such as btrfs, your goal may be unachievable from user space.

After you are done mashing up the disk area that was used to store the file, remove the file's directory entry with os.remove().

If you also want to erase any trace of the file name, you can try to allocate and reallocate a whole bunch of randomly named files in the same directory, though depending on the directory inode structure (linear, btree, hash, etc.) it may be very tough to guarantee that you actually overwrote the old file name.
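
The file-name idea in the last paragraph could be sketched roughly as follows. The function churn_directory and its parameters are hypothetical, and, as noted above, nothing here guarantees that the old directory entry is actually overwritten; that depends entirely on the file system.

```python
import os
import secrets


def churn_directory(directory: str, count: int = 100, name_length: int = 16) -> int:
    """Create and immediately delete `count` empty, randomly named files.

    Returns the number of files that were created and removed.
    """
    created = 0
    for _ in range(count):
        # token_hex(n) yields 2*n hex characters
        name = secrets.token_hex(name_length // 2)
        path = os.path.join(directory, name)
        try:
            # O_EXCL: never clobber an existing file that happens to share the name
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            continue
        os.close(fd)
        os.remove(path)
        created += 1
    return created
```

You would call it on the directory that held the file, after the file itself has been overwritten and removed.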

Upvotes: 6

jh314

Reputation: 27812

You can use srm to securely remove files. You can use Python's os.system() function to call srm.
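
As a rough sketch of calling srm from Python, the snippet below uses subprocess.run() with an argument list instead of os.system(), so that a path containing spaces or shell metacharacters is not interpreted by a shell. The function name srm_delete is hypothetical, and the sketch assumes an srm executable is installed and on PATH; no srm options are passed because they differ between implementations.

```python
import shutil
import subprocess


def srm_delete(path: str) -> None:
    """Delete `path` with the external srm tool; raise if srm is not installed."""
    srm = shutil.which("srm")
    if srm is None:
        raise RuntimeError("srm was not found on PATH")
    # check=True raises CalledProcessError if srm exits with a non-zero status
    subprocess.run([srm, path], check=True)
```

If srm is missing, the RuntimeError gives the caller a chance to fall back to one of the pure-Python approaches in the other answers.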

Upvotes: 14
