Reputation: 61

Is there a zero-copy way to create a bytearray from a memoryview?

I ran into what I thought would be a very simple problem (and I hope it is!): taking raw data out of memory and decoding it to a Unicode string.

The obvious approach works:

the_string = mv.tobytes().decode("utf-8")

where mv is the memoryview in question. But that defeats the purpose of zero copy, because the tobytes() method creates a copy. So the next thing I tried was to "cast" the memoryview to a bytearray; in other words, to create a bytearray that uses the memoryview "mv" as its backing data. I thought this would be simple, but I cannot figure out how to do it. Does anyone know how?
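To make the copying concern concrete, here is a minimal sketch (the names source, as_bytes and as_bytearray are made up for illustration). It shows that both mv.tobytes() and bytearray(mv) produce independent copies: later writes to the underlying buffer are visible through the memoryview but not through either copy.

```python
# Hypothetical buffer standing in for the raw data behind the memoryview.
source = bytearray(b"abc")
mv = memoryview(source)

as_bytes = mv.tobytes()        # copies the data into a new bytes object
as_bytearray = bytearray(mv)   # also copies; it does not share memory with mv

# Mutate the original buffer after the copies were taken.
source[0] = ord("x")

print(bytes(mv))       # the view sees the change
print(as_bytes)        # the copies do not
print(as_bytearray)
```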

Upvotes: 6

Answers (2)

zap

Reputation: 31

You can recover the underlying object using memoryview.obj.
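A minimal sketch of that idea, assuming the memoryview was created over a bytearray (the variable names are illustrative only). Note that .obj is the whole underlying object even when the view is a slice, and that the underlying object is not guaranteed to have a decode method (an array.array or mmap would not, for instance).

```python
# Assumption: the view wraps a bytearray, which has a .decode() method.
buf = bytearray("Hello 你好".encode("utf-8"))
mv = memoryview(buf)

owner = mv.obj                 # the object the view was created from
text = owner.decode("utf-8")   # decode straight from the owner, no tobytes()

# A sliced view still reports the full underlying object, not the slice.
sliced = mv[:5]
same_owner = sliced.obj is buf

print(text)
print(same_owner)
```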

The other answer, regarding codecs.decode, also works well for this specific use. Both approaches reduce the number of copies from two to one by skipping the intermediate conversion to bytes; the decoded str itself is still a new object.

If you want to go as far as having a non-owning str, you may need to resort to ctypes, as str is not designed to be non-owning.

Upvotes: 0

GalaxySnail

Reputation: 27

The answer is codecs.decode from the standard library, which accepts a memoryview directly.

For example:

>>> b = "Hello 你好".encode("utf-8")
>>> b
b'Hello \xe4\xbd\xa0\xe5\xa5\xbd'

>>> m = memoryview(b)
>>> m.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'memoryview' object has no attribute 'decode'

>>> import codecs
>>> codecs.decode(m, "utf-8")
'Hello 你好'
>>> codecs.decode(m[:-3], "utf-8")
'Hello 你'
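The same session can be written as a small script. This is only a sketch: decode_view is a hypothetical helper name, not part of the standard library. It also shows two things not covered above: a slice that cuts through a multi-byte character raises UnicodeDecodeError under the default errors="strict", and the built-in str() with an explicit encoding appears to accept the same memoryview.

```python
import codecs

def decode_view(view, encoding="utf-8", errors="strict"):
    """Hypothetical helper: decode a memoryview without calling tobytes()."""
    return codecs.decode(view, encoding, errors)

data = "Hello 你好".encode("utf-8")
m = memoryview(data)

whole = decode_view(m)
trimmed = decode_view(m[:-3])   # slicing a memoryview does not copy the bytes

# Cutting in the middle of a 3-byte character fails with errors="strict".
try:
    decode_view(m[:-1])
    cut_ok = True
except UnicodeDecodeError:
    cut_ok = False

# str() with an explicit encoding takes the same buffer.
via_str = str(m, "utf-8")

print(whole, trimmed, cut_ok, via_str)
```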

Upvotes: 0
