Mozgawa

Reputation: 125

How can I determine the file format from a bytes-like object?

The Oracle database I use stores files in PDF or ZIP format in a BLOB column. I want to save these files to disk, but I do not know how to tell whether a given BLOB holds a PDF or a ZIP. Is it possible to check which file format a BLOB contains?

Below is a simple write_file function for saving a file:

def write_file(data, filename):
    with open(filename, 'wb') as f:
        f.write(data)

Here, I fetch the appropriate BLOB with the cursor and use the write_file function to save the file:

firstRow = cur.fetchone()
write_file(firstRow[0].read(), "blah.zip")

How can I tell whether the BLOB is a ZIP or a PDF?

Upvotes: 1

Views: 2105

Answers (1)

user12195782

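You can check the file signature (the "magic number") by inspecting the first bytes you read.

According to the Wikipedia list of file signatures (https://en.wikipedia.org/wiki/List_of_file_signatures):

1) A ZIP file starts with the hex bytes "50 4B 03 04", or with "50 4B 05 06" (empty archive) or "50 4B 07 08" (spanned archive). The first two bytes are "PK" in ASCII.

2) A PDF file starts with the hex bytes "25 50 44 46 2D", which is "%PDF-" in ASCII.

So you can compare the first few bytes of the BLOB against these signatures and pick the file type, and therefore the file extension, based on which one matches.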
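The signature check could be sketched roughly as follows. This is a minimal example, not a definitive implementation: detect_extension and save_blob are hypothetical helper names introduced here, and write_file is the function from the question. It assumes the whole BLOB has already been read into a bytes object.

```python
# File signatures taken from the list above, written as bytes literals.
ZIP_SIGNATURES = (b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")  # 50 4B ...
PDF_SIGNATURE = b"%PDF-"  # 25 50 44 46 2D


def detect_extension(data):
    """Return 'zip', 'pdf', or None based on the leading bytes of data.

    Hypothetical helper: it only looks at the signature, it does not
    validate the rest of the file.
    """
    # bytes.startswith accepts a tuple and matches any of its members.
    if data.startswith(ZIP_SIGNATURES):
        return "zip"
    if data.startswith(PDF_SIGNATURE):
        return "pdf"
    return None


def write_file(data, filename):
    # Same function as in the question.
    with open(filename, "wb") as f:
        f.write(data)


def save_blob(data, basename):
    """Hypothetical helper: pick the extension from the signature, then save."""
    ext = detect_extension(data)
    if ext is None:
        raise ValueError("unrecognized file signature: %r" % data[:8])
    filename = "%s.%s" % (basename, ext)
    write_file(data, filename)
    return filename
```

With the cursor from the question, the call would then look something like save_blob(firstRow[0].read(), "blah"). Keep in mind that other formats built on ZIP containers (for example .docx or .jar) begin with the same bytes, so this only tells ZIP and PDF apart reliably if those are the only two formats stored.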

Upvotes: 4
