I'm trying to take a .tgz archive and convert it to a byte array that I can later use.
This is my full code:
from argparse import ArgumentParser
import os

def main():
    parser = ArgumentParser()
    parser.add_argument("-p", "--payload", dest="payloadFile", help="Input payload", nargs='?')
    args = parser.parse_args()
    if args.payloadFile is not None:
        payload = args.payloadFile
        print(payload)
    with open("MyPlugin.tgz", "rb") as binary_file:
        # Read the whole file at once
        data = binary_file.read()
        print(data)

if __name__ == "__main__":
    main()
This prints out this data:
MyPlugin2.tgz
b'\x1f\x8b\x08\x00\xf9\xd0,]\x00\x03\xed\xd2AK\xc30\x18\x06\xe0\x9e\xfb+r\x10\xaa\x07\xbbt]\xd7\xc3\xd49\x06\x82\x87\r\x0f\xeau\x84-\xac\x85\xb4)i\n\x16\xf1\xbf\x9b\xb4l\x87A*\xc2D\xc4\xf79,c\xdf\xfb5Y\xbf,t\xc1r1Z\xb5O\xa2\xd9\xe7\xe5\xc8\xfb\x01\xd4H\x93\xa4[\x8d\xd3\xb5\xfb\x1e\xc5I\x14G\x934\xa6\xa9G\xa3q\x12\xc7\x1eI~\xe20\xa7\x9aZ3E\x88\xa7\xa4\xd4C\xb9\xaf\xea\x7f\xd4\xe2d\xfe\xfd\x12VYu\xbe=\xec\x80\xa7\x93\x89k\xfe\x11\x1d\xc7\xc7\xf9\xa7Il\xe6?I\xa8\x99?=\xdf\x11\xdc\xfe\xf9\xfco\xe6f\xd4\xfeV\xb0\xba&\xfd]\xd8\x1c\xee\xc2\xa6_\x08\x7f\xd3\xbc\xdc\x1d\xcbKYj%\x85\xe0\xea\x10x\xf7\x89Q)\xa9\xf9V\xf3\x1d\xb9\xd8T]\xe1\xa1\x11b\xcd\nn\x8a\xb7$8<6\x989\xe2KYT\xaclm\xb6\xdf\xca\x99\\4:\x93\xca\x06\x9fe\xc1\x05\xc9\xd8}\xc6B\xd65\x85[Y\x043\xd3\xe6\xdc\xa5jU\xbe\xcf\xf4P\xbb\xa3\xf7E\t\xdb\xe5\xac\xaf\xcd\x0f\xf5`\xe2\x95\xab:\x97\xa5\xcdD!\r\xa93\xb8\x92\xbbFp\x9b\x1b|\x13\xdd;\xeb\xfe\xca4\x8c\xc2)\xb1\x9f\xa93\xfdXT\x82\x17\xbc\xd4\xf6\x90L)\xd6^\x06\xac\xd4\xf9u]\xb1"\xb8\x9a\xf9\x1f\xfe\xfc\xce\xff\xed+\t\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xdf\xf0\t\x87\x06!E\x00(\x00\x00'
As you can see, the output array is filled with illegal signs, meaning I must be doing something wrong.
What is the most idiomatic way to convert various binary files to hex / bytearray / shellcode that I can later use in my code?
Upvotes: 0
Views: 1578
Reputation: 5073
The array is not filled with "illegal signs", it just contains binary data, and the string representation of binary data does not make much sense. The output you get is quite expected.
To answer your question: the variable data is already a bytes object, and does not need to be converted to another format in order to process (read) it. However, if you want, you can convert the data to a bytearray as follows:
arr = bytearray(data)
However, it will not be printed more nicely than the bytes object.
Also note that, given the '.tgz' filename, your data is probably compressed, so you'll likely need to decompress it. You can use the standard library gzip module for that.
import gzip

with gzip.open('MyPlugin.tgz', 'rb') as file:
    data = bytearray(file.read())
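A .tgz file is normally a gzip-compressed tar archive, so the bytes returned by gzip.open() are still a tar stream rather than the individual files. If the goal is the contents of the files inside the archive, a minimal sketch with the standard library tarfile module could look like this. The helper name read_tgz_members is hypothetical, and the sketch assumes the archive really is a gzip-compressed tar file.

```python
import tarfile


def read_tgz_members(path):
    """Return {member name: contents as bytes} for regular files in a .tgz.

    Hypothetical helper for illustration; assumes `path` points to a
    gzip-compressed tar archive.
    """
    contents = {}
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            # Skip directories, links and other non-regular entries.
            if member.isfile():
                extracted = archive.extractfile(member)
                contents[member.name] = extracted.read()
    return contents
```

Calling read_tgz_members("MyPlugin.tgz") would then give one bytes object per file stored in the archive.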
What is the most idiomatic way to convert various binary files to hex / bytearray / shellcode that I can later use in my code?
This question is difficult to answer, because we don't know what you mean by "use". Generally, hex is not a very useful format for computers; it is mostly used for debug output. How you would want to convert the data really depends on the format of the data and how you want to process it.
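If a hex representation is what you need for inspection or for pasting into source code, a bytes object can produce one directly. The following is only a sketch: the helper names are hypothetical, and the sample value is just the first four bytes of the output shown in the question.

```python
def to_hex(data: bytes) -> str:
    """Return a plain lowercase hex string, two characters per byte."""
    return data.hex()


def from_hex(text: str) -> bytes:
    """Turn a hex string produced by to_hex() back into bytes."""
    return bytes.fromhex(text)


def to_escaped_literal(data: bytes) -> str:
    """Format every byte as \\xNN, including printable ones."""
    return "".join(f"\\x{b:02x}" for b in data)


sample = b"\x1f\x8b\x08\x00"  # first four bytes from the question's output
print(to_hex(sample))              # 1f8b0800
print(to_escaped_literal(sample))  # \x1f\x8b\x08\x00
```

Both forms carry exactly the same information as the original bytes object; they only change how it is displayed.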
I think that the bytes type (like the output of file.read() when file is opened in binary mode) is a good intermediate format that you could use to pass to other functions. The bytes type is a built-in type that is natively supported by Python and already used as input or output format for many existing functions. For example, it can be directly written to any binary output stream (such as a file or socket) using file.write(), and it can be easily converted to, for example:
- str, using data.decode()
- bytearray, using bytearray(data)
- numpy.array, using numpy.frombuffer()
- a base64-encoded bytes object, using base64.b64encode()
- a compressed bytes object, using zlib.compress()
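As a rough illustration of the standard library conversions listed above (the NumPy one is left out to avoid a third-party dependency), here is a small sketch. The value of data is a stand-in for whatever binary_file.read() returned, and decode() only makes sense when the bytes actually hold encoded text.

```python
import base64
import zlib

data = b"hello bytes"  # stand-in for the result of binary_file.read()

text = data.decode("utf-8")        # str; raises UnicodeDecodeError on non-text data
mutable = bytearray(data)          # mutable copy of the same bytes
encoded = base64.b64encode(data)   # ASCII-only bytes, safe to embed in text
compressed = zlib.compress(data)   # zlib-compressed bytes

# Each conversion is reversible, so the original bytes can be recovered.
assert text.encode("utf-8") == data
assert bytes(mutable) == data
assert base64.b64decode(encoded) == data
assert zlib.decompress(compressed) == data
```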
Upvotes: 2