user925567

Reputation: 69

Using Python to break a continuous string into components?

This is similar to what I want to do: breaking a 32-bit number into individual fields

This is my typical "string" 00000000110000000000011000000000

I need to break it up into four equal parts:

00000000

11000000

00000110

00000000

I need to append the list to a new text file with the original string as a header.

I know how to split the string if there were separators such as spaces but my string is continuous.

These could be thought of as 32bit and 8bit binary numbers but they are just text in a text file (for now)!

I am brand new to programming in Python, so please, I need patient details, no generalizations.

Do not assume I know anything.

Thank you,

Ralph

Upvotes: 6

Views: 5160

Answers (5)

Remi

Reputation: 21175

+1 for Robert's answer. As for 'I need to append the list to a new text file with the original string as a header':

s = "00000000110000000000011000000000"
s += '\n' + '\n'.join(s[i:i+8] for i in xrange(0, len(s), 8))

will give

'00000000110000000000011000000000\n00000000\n11000000\n00000110\n00000000'

thus putting each 'byte' on a separate line as I understood from your question...

Edit: some notes to help you understand: A list [] (see here) contains your data, in this case, strings, between its brackets. The first item in a list is retrieved as in:

mylist[0]

In Python, a string is itself also an object, with specific methods that you can call. So '\n' (representing a newline) is an object of type 'string', and you can call its method join() with your list as argument:

'\n'.join(mylist)

The elements in the list are then 'joined' together with the string '\n' in between each element. The result is no longer a list, but a string. Two strings can be added together, thus

s += '\n' + '\n'.join(mylist)

adds to s (which was already a string), the right part which is itself a 'sum' of strings. (I hope that clears some things up?)
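To actually get this into a text file, something along these lines should work. It is only a minimal sketch: the file name bytes_output.txt and the variable names chunks and out_path are placeholders I made up, and I use range rather than xrange so that it runs on both Python 2 and Python 3.

```python
import os
import tempfile

s = "00000000110000000000011000000000"

# Cut the string into 8-character pieces.
chunks = [s[i:i + 8] for i in range(0, len(s), 8)]

# Hypothetical output location; replace it with your own path.
out_path = os.path.join(tempfile.gettempdir(), "bytes_output.txt")

# Mode "a" appends to the file and creates it if it does not exist yet.
with open(out_path, "a") as f:
    f.write(s + "\n")                    # the original string as a header line
    f.write("\n".join(chunks) + "\n")    # one 8-character piece per line
```

Because the file is opened in append mode, running this again adds another header and another four lines at the end instead of overwriting what is already there.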

Upvotes: 3

Andrew Clark

Reputation: 208485

For reference, here are a few alternatives for splitting strings into equal length parts:

>>> import re
>>> re.findall(r'.{1,8}', s, re.S)
['00000000', '11000000', '00000110', '00000000']

>>> map(''.join, zip(*[iter(s)]*8))
['00000000', '11000000', '00000110', '00000000']

The zip method for splitting a sequence into n-length groups is documented here, but it will only work for strings whose length is evenly divisible by n (which won't be an issue for this particular question). If the string length is not evenly divisible by n you could use itertools.izip_longest(*[iter(s)]*8, fillvalue='').
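As a rough illustration of the difference, here is a sketch for Python 3, where izip_longest is named itertools.zip_longest. The 34-character input is made up for the example (the question's string with two extra characters), and the names truncated and groups are just for illustration.

```python
from itertools import zip_longest

# The question's string plus two extra characters, so the length (34)
# is no longer a multiple of 8.
s = "00000000110000000000011000000000" + "11"

# Plain zip stops at the last complete group and silently drops the rest.
truncated = ["".join(g) for g in zip(*[iter(s)] * 8)]

# zip_longest pads the final group with fillvalue; padding with '' keeps
# the leftover characters without adding anything to them.
groups = ["".join(g) for g in zip_longest(*[iter(s)] * 8, fillvalue="")]

print(truncated)  # ['00000000', '11000000', '00000110', '00000000']
print(groups)     # ['00000000', '11000000', '00000110', '00000000', '11']
```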

Upvotes: 3

KevinDTimm

Reputation: 14376

You need a substring:

x = "01234567"
x0 = x[0:2]
x1 = x[2:4]
x2 = x[4:6]
x3 = x[6:8]

So, x0 will hold '01', x1 will hold '23', etc.

Upvotes: 0

immortal

Reputation: 3188

Strings, lists and tuples can be broken up using the indexing operator []. With the : operator inside the brackets you can take a slice, that is, a range of positions: x[a:b] runs from position a up to, but not including, position b. Try something like:

x = "00000000110000000000011000000000"
part1, part2, part3, part4 = x[:8], x[8:16], x[16:24], x[24:]

Upvotes: 1

robert

Reputation: 34398

This should do what you want. See comprehensions for more details.

>>> s = "00000000110000000000011000000000"
>>> [s[i:i+8] for i in xrange(0, len(s), 8)]
['00000000', '11000000', '00000110', '00000000']
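If the one-liner is hard to read, it may help to see it spelled out as an ordinary loop. This is only a sketch of the same idea: the name parts is made up, and range is used in place of xrange so it runs on Python 3 as well.

```python
s = "00000000110000000000011000000000"

parts = []                        # start with an empty list
for i in range(0, len(s), 8):     # i takes the values 0, 8, 16, 24
    piece = s[i:i + 8]            # characters from position i up to i + 8
    parts.append(piece)           # add the piece to the end of the list

print(parts)  # ['00000000', '11000000', '00000110', '00000000']
```

range(0, len(s), 8) counts from 0 up to (but not including) the length of the string in steps of 8, so each pass through the loop picks up the next block of eight characters.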

Upvotes: 10
