Reputation: 5928
When a Python bytearray is created from an integer, it contains that many bytes, all set to zero.
I want to clear an existing bytearray back to zeros. It could be quite large, so iterating over it and setting each byte to zero is pretty poor.
Is there a better way?
(memoryviews and bytearrays are poorly documented IMO)
Best resources so far (but none of them answer my question)
http://docs.python.org/dev/library/stdtypes.html#bytes-methods
http://docs.python.org/dev/library/functions.html#bytearray
Upvotes: 2
Views: 5561
Reputation: 29571
Here are a few different ways of clearing a bytearray without changing the reference (in case other objects refer to it):
Using clear():
>>> a=bytearray(10)
>>> a
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> a.clear()
>>> a
bytearray(b'')
Using slicing:
>>> a=bytearray(10)
>>> a[0:10] = []
>>> a
bytearray(b'')
>>> a=bytearray(10)
>>> del a[0:10]
>>> a
bytearray(b'')
Using del:
>>> a=bytearray(10)
>>> b=a
>>> del a[0:10]
>>> a
bytearray(b'')
You can verify that if another variable, say b, references a, none of the above techniques will break this. The following technique of resetting a, by creating a new bytearray, will break this:
>>> a=bytearray(10)
>>> b=a
>>> b is a
True
>>> a=bytearray(10)
>>> b is a
False
However, all of the above change the array's size to 0. Perhaps you simply want to zero all the items, keeping the size unchanged and any references valid:
>>> a=bytearray(10)
>>> b=a
>>> b is a
True
>>> a[0:10]=bytearray(10)
>>> b is a
True
With this technique you can easily zero any subsection of the array (in fact, of any mutable sequence).
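For a very large bytearray, the slice assignment above builds a temporary of the same size. A minimal Python 3 sketch of one way around that, zeroing in fixed-size chunks through a memoryview, is shown here. The function name zero_in_place and the chunk_size parameter are hypothetical, chosen only for illustration.

```python
def zero_in_place(buf, chunk_size=1 << 20):
    """Overwrite every byte of buf with zero without rebinding or resizing it.

    Hypothetical helper: only one chunk_size block of zeros is allocated,
    instead of a temporary as large as buf itself.
    """
    zeros = bytes(chunk_size)  # reusable block of zero bytes
    with memoryview(buf) as view:
        for start in range(0, len(buf), chunk_size):
            end = min(start + chunk_size, len(buf))
            # Source and destination slices must have the same length.
            view[start:end] = zeros[:end - start]


a = bytearray(b"x" * 10)
b = a                      # second reference to the same object
zero_in_place(a, chunk_size=4)
```

After the call, b still refers to the same object as a, and its length is unchanged.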
Upvotes: 2
Reputation: 880399
Edit: This answer is wrong. s = s.translate('\0'*256) is slower than s = bytearray(256), so there is no point in using translate here. @gnibbler provides a better solution.
Bytearrays have many of the same methods that strings have. You could use the translate method:
In [64]: s = bytearray('Hello World')
In [65]: s
Out[65]: bytearray(b'Hello World')
In [66]: import string
In [67]: zero = string.maketrans(buffer(bytearray(range(256))),buffer(bytearray(256)))
In [68]: s.translate(zero)
Out[68]: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
By the way, Dave Beazley has written a very useful introduction to bytearrays.
Or, slightly modifying millimoose's answer:
In [72]: s.translate('\0'*256)
Out[72]: bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
In [73]: %timeit s.translate('\0'*256)
1000000 loops, best of 3: 282 ns per loop
In [74]: %timeit s.translate(bytearray(256))
1000000 loops, best of 3: 398 ns per loop
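The sessions above are Python 2. As a rough Python 3 sketch of the same idea: buffer no longer exists there, and the translation table can simply be a 256-byte bytes object. Note that translate returns a new bytearray, so the result is assigned back through a full slice to keep the original object; the names alias and zero_table are illustrative only.

```python
s = bytearray(b"Hello World")
alias = s                  # second reference to the same object

zero_table = bytes(256)    # maps every possible byte value to 0

# translate() builds a new bytearray; writing it back through s[:]
# updates the existing object instead of rebinding the name s.
s[:] = s.translate(zero_table)
```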
Upvotes: 2
Reputation: 304375
Why would you assume reallocating the bytearray is so slow? It's more than 10 times faster than using translate for large bytearrays!
I'm deleting the original bytearray first, so you don't have to worry about temporarily using double the memory.
# For a small bytearray, reallocation is a tiny bit faster
$ python -m timeit -s "s=bytearray('Hello World')" "s.translate('\0'*256)"
1000000 loops, best of 3: 0.672 usec per loop
$ python -m timeit -s "s=bytearray('Hello World')" "lens=len(s);del s;s=bytearray(lens)"
1000000 loops, best of 3: 0.522 usec per loop
# For a large bytearray, reallocation is much faster
$ python -m timeit -s "s=bytearray('Hello World'*10000)" "s.translate('\0'*256)"
1000 loops, best of 3: 225 usec per loop
$ python -m timeit -s "s=bytearray('Hello World'*10000)" "lens=len(s);del s;s=bytearray(lens)"
10000 loops, best of 3: 18.5 usec per loop
There's an even better way that allows s to keep the same reference. You simply need to call the __init__ method on the instance.
>>> s=bytearray(b"hello world")
>>> id(s)
3074325152L
>>> s.__init__(len(s))
>>> s
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> id(s)
3074325152L
Testing the timing
$ python -m timeit -s "s=bytearray('Hello World'*10000)" "s.__init__(len(s))"
100000 loops, best of 3: 18.7 usec per loop
I ran these gigabyte tests on a different computer with more RAM
$ python -m timeit -s "s=bytearray('HelloWorld'*100000000)" "s.__init__(len(s))"
10 loops, best of 3: 454 msec per loop
$ python -m timeit -s "s=bytearray('HelloWorld'*100000000)" "s.translate('\0'*256)"
10 loops, best of 3: 1.43 sec per loop
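The timings above were taken on Python 2. To repeat the comparison on Python 3, something like the following sketch could be used. The function name time_zeroing and its parameters are hypothetical, and the numbers will depend on the machine and interpreter, so none are quoted here.

```python
import timeit


def time_zeroing(size=100_000, number=20):
    """Return total seconds per approach for zeroing a bytearray of `size` bytes."""
    setup = f"s = bytearray(b'x' * {size})"
    candidates = {
        # Keeps the same object; allocates a temporary bytes of equal size.
        "slice assignment": "s[:] = bytes(len(s))",
        # Keeps the same object; re-runs the initializer with an integer.
        "__init__": "s.__init__(len(s))",
        # Creates a new object and rebinds the name s.
        "translate": "s = s.translate(bytes(256))",
    }
    return {
        name: timeit.timeit(stmt, setup=setup, number=number)
        for name, stmt in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in time_zeroing().items():
        print(f"{name}: {seconds:.6f} s")
```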
Upvotes: 6
Reputation: 8685
All you need to do is re-declare your bytearray:
b = bytearray(LEN_OF_BYTE_ARRAY)
Upvotes: 0