Flavius

Reputation: 13816

split byte string into lines

How can I split a byte string into a list of lines?

In Python 2 I had:

rest = "some\nlines"
for line in rest.split("\n"):
    print line

The code above is simplified for the sake of brevity. Now, after some regex processing, I have a byte array in rest and I need to iterate over its lines.

Upvotes: 72

Views: 123726

Answers (3)

Janus Troelsen

Reputation: 21300

There is no reason to convert to a string. Just pass a bytes argument to split: split strings with strings, and bytes with bytes.

>>> a = b'asdf\nasdf'
>>> a.split(b'\n')
[b'asdf', b'asdf']

Also, since you're splitting on newlines, you could slightly simplify that by using splitlines() (available for both str and bytes):

>>> a = b'asdf\nasdf'
>>> a.splitlines()
[b'asdf', b'asdf']

Upvotes: 132

warvariuc

Reputation: 59604

Decode the bytes into unicode (str) and then use str.split:

Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = b'asdf\nasdf'
>>> a.split('\n')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Type str doesn't support the buffer API
>>> a = a.decode()
>>> a.split('\n')
['asdf', 'asdf']
>>> 

You can also split on b'\n', but I guess you will have to work with strings rather than bytes anyway. So convert all your input data to str as early as possible, work only with unicode inside your code, and convert back to bytes as late as possible, only when needed for output.
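As a rough sketch of that "decode early, encode late" flow: the UTF-8 encoding and the upper-casing step below are assumptions for illustration, so substitute whatever encoding and processing your data actually needs.

```python
# Hypothetical input: UTF-8 encoded bytes (the encoding is an assumption).
raw = b"caf\xc3\xa9\nna\xc3\xafve\n"

# 1. Decode once, at the input boundary.
text = raw.decode("utf-8")

# 2. Work with str everywhere in between.
lines = text.splitlines()
processed = [line.upper() for line in lines]   # placeholder processing step

# 3. Encode once, at the output boundary.
output = "\n".join(processed).encode("utf-8")
print(lines)
print(output.decode("utf-8"))
```

Note that decode() raises UnicodeDecodeError if the bytes are not valid in the chosen encoding, so this only works when you actually know the encoding of the input.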

Upvotes: 25

namit

Reputation: 6957

Try this:

rest = b"some\nlines"
rest = rest.decode("utf-8")

Then you can call rest.split("\n").

Upvotes: 10
