Reputation: 329
Is there a Pythonic way to write this Bash command?
hexdump -e '2/1 "%02x"' file.dat
Obviously, without using os.popen, or any such shortcut ;)
It would be great if the code worked in Python 3.x.
Upvotes: 8
Views: 49335
Reputation: 31
You can use the following snippet:
def hexdump(data: bytes):
    def to_printable_ascii(byte):
        # Show printable ASCII as-is; replace everything else with a dot.
        return chr(byte) if 32 <= byte <= 126 else "."

    offset = 0
    while offset < len(data):
        # Format 16 bytes per output line.
        chunk = data[offset : offset + 16]
        hex_values = " ".join(f"{byte:02x}" for byte in chunk)
        ascii_values = "".join(to_printable_ascii(byte) for byte in chunk)
        print(f"{offset:08x} {hex_values:<48} |{ascii_values}|")
        offset += 16
For example:
data = b''
for i in range(256):
    data += i.to_bytes(1, 'big')
hexdump(data)
will print:
00000000 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f |................|
00000010 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f |................|
00000020 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f | !"#$%&'()*+,-./|
00000030 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f |0123456789:;<=>?|
00000040 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f |@ABCDEFGHIJKLMNO|
00000050 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f |PQRSTUVWXYZ[\]^_|
00000060 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f |`abcdefghijklmno|
00000070 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f |pqrstuvwxyz{|}~.|
00000080 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f |................|
00000090 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f |................|
000000a0 a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af |................|
000000b0 b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf |................|
000000c0 c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf |................|
000000d0 d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df |................|
000000e0 e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef |................|
000000f0 f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff |................|
Upvotes: 3
Reputation: 226171
The standard library is your friend. Try binascii.hexlify().
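A minimal sketch of that suggestion, assuming Python 3 and data that fits in memory. The helper name hex_of_bytes is hypothetical. binascii.hexlify() returns a bytes object, so the sketch decodes it to get an ordinary string:

```python
import binascii

def hex_of_bytes(data: bytes) -> str:
    # hexlify returns bytes such as b'00ff'; decode to get a plain str.
    return binascii.hexlify(data).decode("ascii")

print(hex_of_bytes(b"\x00\xffAB"))  # prints 00ff4142
```

For a file, the same call can be applied to the result of reading it in binary mode.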
Upvotes: 16
Reputation: 26022
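Simply read() the whole file and encode('hex'). What could be more pythonic? (This relies on Python 2, where a byte string has an encode('hex') method; in Python 3 the equivalent is f.read().hex().)
with open('file.dat', 'rb') as f:
    hex_content = f.read().encode('hex')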
Upvotes: 5
Reputation: 365577
If you only care about Python 2.x, line.encode('hex') will encode a chunk of binary data into hex. So:
with open('file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(32), b''):
        print chunk.encode('hex')
(IIRC, hexdump by default prints 32 pairs of hex per line; if not, just change that 32 to 16 or whatever it is…)
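If the two-argument iter looks baffling, see its documentation: it calls the function repeatedly and stops as soon as the function returns the sentinel value (here b'', which read returns at end of file). It's not too complicated once you get the idea.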
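To illustrate the two-argument form in isolation, here is a small sketch in which io.BytesIO stands in for an open binary file (an assumption made so the example runs without file.dat):

```python
import io

# io.BytesIO stands in for a file opened in binary mode.
stream = io.BytesIO(b"abcdefghij")

# iter(callable, sentinel) calls the callable until it returns the sentinel.
# read(4) returns b'' at end of stream, which stops the iteration.
chunks = list(iter(lambda: stream.read(4), b""))
print(chunks)  # prints [b'abcd', b'efgh', b'ij']
```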
If you care about Python 3.x, encode only works for codecs that convert Unicode strings to bytes; for any codecs that convert the other way around (or any other combination), you have to use codecs.encode to do it explicitly:
import codecs

with open('file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(32), b''):
        print(codecs.encode(chunk, 'hex'))
Or it may be better to use hexlify:
import binascii

with open('file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(32), b''):
        print(binascii.hexlify(chunk))
If you want to do something besides print them out, rather than read the whole file into memory, you probably want to make an iterator. You could just put this in a function and change that print to a yield, and that function returns exactly the iterator you want. Or use a genexpr or map call:
import binascii

with open('file.dat', 'rb') as f:
    chunks = iter(lambda: f.read(32), b'')
    hexlines = map(binascii.hexlify, chunks)
    # In Python 3, map is lazy: consume hexlines here, while f is still open.
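The function-with-yield variant could look roughly like the sketch below, assuming Python 3. The name iter_hex_lines and its width parameter are hypothetical, and io.BytesIO again stands in for the real file so the example is self-contained:

```python
import binascii
import io

def iter_hex_lines(stream, width=32):
    """Yield one hex string per `width`-byte chunk read from a binary stream."""
    for chunk in iter(lambda: stream.read(width), b""):
        yield binascii.hexlify(chunk).decode("ascii")

# io.BytesIO stands in for open('file.dat', 'rb').
lines = list(iter_hex_lines(io.BytesIO(bytes(range(6))), width=4))
print(lines)  # prints ['00010203', '0405']
```

With a real file, the generator has to be consumed inside the with block, because it reads from the file only as each line is requested.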
Upvotes: 14