Ivan Borshchov
Ivan Borshchov

Reputation: 3555

Efficient way to swap bytes in python

I have some bytearray with length of 2*n:

a1 a2 b1 b2 c1 c2

I need to switch bytes endian in each 2-byte word, and make:

a2 a1 b2 b1 c2 c1

Now I use next approach but it is very slow for my task:

converted = bytearray([])
for i in range(int(len(chunk)/2)):
   converted += bytearray([ chunk[i*2+1], chunk[i*2] ])

Is it possible to switch endian of bytearray by calling some system/libc function?


Ok, thanks to all, I timed some suggestions:

import timeit

test = [
"""
converted = bytearray([])
for i in range(int(len(chunk)/2)):
   converted += bytearray([ chunk[i*2+1], chunk[i*2] ])
""",
"""
for i in range(0, len(chunk), 2):
    chunk[i], chunk[i+1] = chunk[i+1], chunk[i]
""",
"""
byteswapped = bytearray([0]) * len(chunk)
byteswapped[0::2] = chunk[1::2]
byteswapped[1::2] = chunk[0::2]
""",
"""
chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]
"""
]

for t in test:
    print(timeit.timeit(t, setup='chunk = bytearray([1]*10)'))

and result is:

$ python ti.py
11.6219761372
2.61883187294
3.47194099426
1.66421198845

So in-pace slice assignment with a step of 2 now is fastest. Also thanks to Mr. F for detailed explaining but I not yet tried it because of numpy

Upvotes: 6

Views: 21506

Answers (5)

mh001
mh001

Reputation: 51

I can repeat the result of in-place slice method from @user2357112 supports Monica. But if data size is ten time as big, it is twice slower compared with array.byteswap():

>>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=bytearray(10000)')
12.54664899999625

>>> timeit.timeit('a=array.array("H",chunk);a.byteswap();chunk=a.tobytes()', 'import array;chunk=bytearray(10000)')
6.398380300001008

Upvotes: 3

Leonardo
Leonardo

Reputation: 1891

From some answers in How to byte-swap a 32-bit integer in python?

import struct

def htonl_slice(data):
    byteswapped = bytearray(4)
    byteswapped[0::4] = data[3::4]
    byteswapped[1::4] = data[2::4]
    byteswapped[2::4] = data[1::4]
    byteswapped[3::4] = data[0::4]
    return byteswapped

def htonl_slice_2(data):
    byteswapped = bytearray(len(data))
    byteswapped[0::4] = data[3::4]
    byteswapped[1::4] = data[2::4]
    byteswapped[2::4] = data[1::4]
    byteswapped[3::4] = data[0::4]
    return byteswapped

def htonl_struct(data):
    return struct.pack(">L", struct.unpack("<L", data)[0])

def swap32(data):
    return [struct.unpack("<I", struct.pack(">I", i))[0] for i in data]

def htonl_shift(x):
    return (((x << 24) & 0xFF000000) |
            ((x <<  8) & 0x00FF0000) |
            ((x >>  8) & 0x0000FF00) |
            ((x >> 24) & 0x000000FF))    

def test(self):

    data = [struct.pack('<L', i) for i in range(1000000)]

    start = time.time()
    for d in data:
        x = htonl_slice(d)
    end = time.time()
    print("htonl_slice %f" % (end - start))
    
    start = time.time()
    for d in data:
        x = htonl_struct(d)
    end = time.time()
    print("htonl_struct %f" % (end - start))

    data = [i for i in range(1000000)]
    start = time.time()
    for d in data:
        x = htonl_shift(d)
    end = time.time()
    print("htonl_shift %f" % (end - start))

    start = time.time()
    x = swap32(data)
    end = time.time()
    print("swap32 %f" % (end - start))

    data = bytearray()
    for i in range(1000000):
        data += struct.pack('<L', i)

    start = time.time()
    x = htonl_slice_2(data)
    end = time.time()
    print("htonl_slice_2 %f" % (end - start))

And the results are:

htonl_slice   3.041000
htonl_struct  0.626000
htonl_shift   0.864000
swap32        0.533000
htonl_slice_2 0.025000

I understand that htonl_shift works with ints instead of bytearrays.

It is interesting to see that swap32 is pretty fast, but for a 4 MB array htonl_slice_2 is by far the fastest.

Upvotes: 2

user2357112
user2357112

Reputation: 281177

You could use slice assignment with a step of 2:

byteswapped = bytearray(len(original))
byteswapped[0::2] = original[1::2]
byteswapped[1::2] = original[0::2]

Or if you want to do it in-place:

original[0::2], original[1::2] = original[1::2], original[0::2]

Timing shows that slicing massively outperforms a Python-level loop for large arrays:

>>> timeit.timeit('''
... for i in range(0, len(chunk), 2):
...     chunk[i], chunk[i+1] = chunk[i+1], chunk[i]''',
... 'chunk=bytearray(1000)')
81.70195105159564
>>>
>>> timeit.timeit('''
... byteswapped = bytearray(len(original))
... byteswapped[0::2] = original[1::2]
... byteswapped[1::2] = original[0::2]''',
... 'original=bytearray(1000)')
2.1136113323948393
>>>
>>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=
bytearray(1000)')
1.79349659994989

For small arrays, slicing still beats the explicit loop, but the difference isn't as big:

>>> timeit.timeit('''
... for i in range(0, len(chunk), 2):
...     chunk[i], chunk[i+1] = chunk[i+1], chunk[i]''',
... 'chunk=bytearray(10)')
1.2503637694328518
>>>
>>> timeit.timeit('''
... byteswapped = bytearray(len(original))
... byteswapped[0::2] = original[1::2]
... byteswapped[1::2] = original[0::2]''',
... 'original=bytearray(10)')
0.8973060929306484
>>>
>>> timeit.timeit('chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]', 'chunk=
bytearray(10)')
0.6282232971918802

Upvotes: 8

ely
ely

Reputation: 77454

If you use numpy's frombuffer function, you can construct a numpy ndarray that actually shares the physical memory of the bytearray, and then swapping actions could be done in-place rather than in a copy.

You can simply perform the same indexing directly on byt, such as

byt[i] = byt[i+1]

but getting a handle to the buffered array with an explicit type in numpy often lets you do much more, particularly with bytearray which is very limited on its own.

Be careful though. Even though at the Python level, bytearray represents unsigned 8-bit integer values (0-255), the actual underlying C implementation, in bytearrayobject.h uses a plain char for the byte values (see here for more info). If using numpy for this, you probably want to specify the optional dtype argument to frombuffer, with dtype=numpy.ubyte.

import numpy as np
byt = bytearray(range(256))
npbyt = np.frombuffer(byt, dtype=np.ubyte)

for i in range(0, len(npbyt)-1, 2):
    temp = npbyt[i]
    npbyt[i] = npbyt[i+1]
    npbyt[i+1] = temp

print(list(byt))

It prints

[1, 
 0,
 3,
 2,
 5,
 4,
 ...
 255,
 254] 

I ran into some of these issues while working on a small side project called buffersort, which can perform in-place sorting on Python objects that implement the write able buffer protocol, as bytearray does.

If you're interested in grabbing the Cython source from there, there is a simple helper function _swap that would make it easy to do what you want. It's probably highly overkill for your use case though.

Upvotes: 2

SoreDakeNoKoto
SoreDakeNoKoto

Reputation: 1185

You could also swap them in place and use the original array.

chunk = bytearray([1,2,3,4])

for i in range(0, len(chunk), 2):
    chunk[i], chunk[i+1] = chunk[i+1], chunk[i]

Upvotes: 1

Related Questions