user11527868

How to generate byte array from a file?

I'm trying to take a .tgz archive and convert it to byte array that I can later use.

This is my full code:

from argparse import ArgumentParser
import os

def main():
    parser = ArgumentParser()
    parser.add_argument("-p", "--payload", dest="payloadFile", help="Input payload", nargs='?')
    args = parser.parse_args()

    if args.payloadFile is not None:
        payload = args.payloadFile
        print (payload)
        with open("MyPlugin.tgz", "rb") as binary_file:
            # Read the whole file at once
            data = binary_file.read()
            print(data)
if __name__ == "__main__":
    main()

This prints out this data:

MyPlugin2.tgz
b'\x1f\x8b\x08\x00\xf9\xd0,]\x00\x03\xed\xd2AK\xc30\x18\x06\xe0\x9e\xfb+r\x10\xaa\x07\xbbt]\xd7\xc3\xd49\x06\x82\x87\r\x0f\xeau\x84-\xac\x85\xb4)i\n\x16\xf1\xbf\x9b\xb4l\x87A*\xc2D\xc4\xf79,c\xdf\xfb5Y\xbf,t\xc1r1Z\xb5O\xa2\xd9\xe7\xe5\xc8\xfb\x01\xd4H\x93\xa4[\x8d\xd3\xb5\xfb\x1e\xc5I\x14G\x934\xa6\xa9G\xa3q\x12\xc7\x1eI~\xe20\xa7\x9aZ3E\x88\xa7\xa4\xd4C\xb9\xaf\xea\x7f\xd4\xe2d\xfe\xfd\x12VYu\xbe=\xec\x80\xa7\x93\x89k\xfe\x11\x1d\xc7\xc7\xf9\xa7Il\xe6?I\xa8\x99?=\xdf\x11\xdc\xfe\xf9\xfco\xe6f\xd4\xfeV\xb0\xba&\xfd]\xd8\x1c\xee\xc2\xa6_\x08\x7f\xd3\xbc\xdc\x1d\xcbKYj%\x85\xe0\xea\x10x\xf7\x89Q)\xa9\xf9V\xf3\x1d\xb9\xd8T]\xe1\xa1\x11b\xcd\nn\x8a\xb7$8<6\x989\xe2KYT\xaclm\xb6\xdf\xca\x99\\4:\x93\xca\x06\x9fe\xc1\x05\xc9\xd8}\xc6B\xd65\x85[Y\x043\xd3\xe6\xdc\xa5jU\xbe\xcf\xf4P\xbb\xa3\xf7E\t\xdb\xe5\xac\xaf\xcd\x0f\xf5`\xe2\x95\xab:\x97\xa5\xcdD!\r\xa93\xb8\x92\xbbFp\x9b\x1b|\x13\xdd;\xeb\xfe\xca4\x8c\xc2)\xb1\x9f\xa93\xfdXT\x82\x17\xbc\xd4\xf6\x90L)\xd6^\x06\xac\xd4\xf9u]\xb1"\xb8\x9a\xf9\x1f\xfe\xfc\xce\xff\xed+\t\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xdf\xf0\t\x87\x06!E\x00(\x00\x00'

As you can see, the printed array is full of illegal characters, so I must be doing something wrong.

What is the most idiomatic way to convert various binary files to hex / bytearray / shellcode that I can later use in my code?

Upvotes: 0

Views: 1578

Answers (1)

wovano

Reputation: 5073

The array is not filled with "illegal signs"; it simply contains binary data, and the printed representation of binary data is not meant to be human-readable. The output you get is exactly what is expected.

To answer your question: the variable data already is a bytes object, and does not need to be converted to another format in order to process (read) it. However, if you want, you can convert the data to a bytearray as follows:

arr = bytearray(data)

Note that a bytearray will not be printed any more nicely than the bytes object.
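To illustrate the difference, here is a minimal sketch. The four-byte value is only a stand-in for real file contents (it matches the start of the output in the question), and the variable names are illustrative.

```python
# Stand-in for the file contents: the first four bytes of a gzip stream.
data = b"\x1f\x8b\x08\x00"

# Indexing a bytes object yields plain integers in the range 0..255.
first = data[0]  # 31, i.e. 0x1f

# bytes is immutable; bytearray is its mutable counterpart.
arr = bytearray(data)
arr[0] = 0x00  # fine for a bytearray, would raise TypeError for bytes

# Both are printed in the same escaped form, only the wrapper differs.
print(repr(data))
print(repr(arr))
```

So a bytearray is mainly worth the conversion when the data has to be modified in place.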

Also note that, given the '.tgz' filename, your data is probably compressed (conventionally a gzip-compressed tar archive), so you'll likely need to decompress it before you can use the contents. You can use the standard library module gzip for that:

import gzip
with gzip.open('MyPlugin.tgz', 'rb') as file:
    data = bytearray(file.read())
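One caveat: if the file really is a gzip-compressed tar archive, which is what '.tgz' conventionally denotes, gzip.open returns the raw tar stream rather than the individual files inside it. The following is a sketch under that assumption, using the standard library module tarfile; the helper name read_tgz_members is made up for this example.

```python
import tarfile


def read_tgz_members(path):
    """Return a dict mapping each regular file in the archive to its bytes.

    Hypothetical helper; assumes `path` is a gzip-compressed tar archive.
    """
    contents = {}
    # Mode "r:gz" opens a gzip-compressed tar archive for reading.
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                # extractfile() returns a binary file-like object.
                with archive.extractfile(member) as fh:
                    contents[member.name] = fh.read()
    return contents
```

Calling read_tgz_members("MyPlugin.tgz") would then give you one bytes object per file in the archive, keyed by its name inside the archive.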

"What is the most idiomatic way to convert various binary files to hex / bytearray / shellcode that I can later use in my code?"

This question is difficult to answer, because we don't know what you mean by "use". Generally, hex is not a particularly useful format for programs to work with; it is mostly used for debug output. How you should convert the data really depends on the format of the data and how you want to process it.

I think that the bytes type (like the output of file.read() when file is opened in binary mode) is a good intermediate format that you can pass to other functions. The bytes type is a built-in type that is natively supported by Python and is already used as the input or output format of many existing functions. For example, it can be written directly to any binary output stream (such as a file or socket) using file.write(), and it can easily be converted to other representations when needed.
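As a rough illustration of such conversions, the sketch below turns a bytes object into a few common textual forms and back again. The sample value and the variable names are assumptions made for the example, and the "\x.." form is just one common way of writing shellcode-style literals.

```python
import base64

# Stand-in for the bytes read from a file.
data = b"\x1f\x8b\x08\x00"

hex_text = data.hex()                              # '1f8b0800'
ints = list(data)                                  # [31, 139, 8, 0]
b64_text = base64.b64encode(data).decode("ascii")  # Base64 as a str

# C-style escaped text, as often used for shellcode literals.
escaped = "".join(f"\\x{b:02x}" for b in data)

# The first three conversions are trivially reversible.
assert bytes.fromhex(hex_text) == data
assert bytes(ints) == data
assert base64.b64decode(b64_text) == data
```

Which of these is appropriate depends on the consumer: hex or Base64 for logs and text protocols, a list of integers for numeric processing, and plain bytes for anything that already accepts binary input.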

Upvotes: 2
