dhyams

Reputation: 61

Is there a zero-copy way to create a bytearray from a memoryview?

I ran into what I thought would be a very simple problem (and I hope it is!): taking raw data out of memory and decoding it to a Unicode string.

The obvious approach works:

the_string = mv.tobytes().decode("utf-8")

where mv is the memoryview in question. But that defeats the purpose of zero copy, because the tobytes() method makes a copy of the data. So the next thing I tried was to "cast" the memoryview to a bytearray, in other words, to create a bytearray that uses the memoryview mv as its backing data. I thought this would be simple, but I cannot figure out how to do it. Does anyone know how?
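
A minimal sketch of the copy concern, using illustrative variable names that are not from the original code: both tobytes() and bytearray(mv) produce independent copies, which shows up when the underlying buffer is mutated afterwards.

```python
# Illustrative buffer; in practice mv would wrap the raw data in question.
buf = bytearray(b"hello")
mv = memoryview(buf)

copied_bytes = mv.tobytes()   # new bytes object holding a copy of the data
copied_array = bytearray(mv)  # new bytearray, also holding a copy

# Mutate the original buffer in place.
buf[0] = ord("J")

# The memoryview reflects the change because it shares memory with buf.
view_first_byte = mv[0]

# The copies still hold the old contents.
copies_unchanged = (copied_bytes == b"hello" and copied_array == b"hello")
```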

Upvotes: 6

Views: 1123

Answers (1)

GalaxySnail

Reputation: 17

Use codecs.decode from the standard library. It accepts a memoryview directly, so there is no need to call tobytes() first.

For example:

>>> b = "Hello 你好".encode("utf-8")
>>> b
b'Hello \xe4\xbd\xa0\xe5\xa5\xbd'

>>> m = memoryview(b)
>>> m.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'memoryview' object has no attribute 'decode'

>>> import codecs
>>> codecs.decode(m, "utf-8")
'Hello 你好'
>>> codecs.decode(m[:-3], "utf-8")
'Hello 你'
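
As a further sketch, the built-in str constructor also accepts a bytes-like object when an encoding is given, so it should work on a contiguous memoryview in the same way. The helper name decode_view is hypothetical and used only for illustration.

```python
def decode_view(view, encoding="utf-8"):
    """Decode a bytes-like object (e.g. a memoryview) without calling tobytes().

    decode_view is a hypothetical helper name, not part of the standard library.
    """
    # str(obj, encoding) reads the object's buffer directly.
    return str(view, encoding)


data = "Hello 你好".encode("utf-8")
m = memoryview(data)

full_text = decode_view(m)
# Slicing a memoryview yields another view of the same memory, not a copy.
# The last character occupies 3 bytes in UTF-8, so m[:-3] drops exactly it.
partial_text = decode_view(m[:-3])
```

Note that the resulting str is necessarily a new object. What this avoids is the intermediate bytes copy made by tobytes().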

Upvotes: -1
