Reputation: 875

Splitting a string into a list (but not separating adjacent numbers) in Python

For example, I have:

string = "123ab4 5"

I want to be able to get the following list:

["123","ab","4","5"]

rather than list(string) giving me:

["1","2","3","a","b","4"," ","5"]

Upvotes: 4

Answers (5)

John Kugelman

Reputation: 362197

Find one or more adjacent digits (\d+), or if that fails find non-digit, non-space characters ([^\d\s]+).

>>> string = '123ab4 5'
>>> import re
>>> re.findall(r'\d+|[^\d\s]+', string)
['123', 'ab', '4', '5']

If you don't want the letters joined together, try this:

>>> re.findall(r'\d+|\S', string)
['123', 'a', 'b', '4', '5']

Upvotes: 8

Inbar Rose

Reputation: 43517

You can do a few things here:

1. Iterate over the string and build up groups of digits as you go, appending them to your results list.

Not a great solution.

2. Use regular expressions.

Implementation of 2:

>>> import re
>>> s = "123ab4 5"
>>> re.findall(r'\d+|[^\d]', s)
['123', 'a', 'b', '4', ' ', '5']

The pattern grabs either a run of one or more digits (\d+) or any other single character ([^\d]).

Edit

John beat me to the correct solution, and it's a wonderful one.

I will leave this here, though, because someone else might misread the question the way I did and come looking for this answer. I was under the impression that the OP wanted to group only the numbers and leave every other character on its own.
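For completeness, option 1 (iterating and grouping digits as you go) could look roughly like the sketch below. The function name split_keep_numbers is made up for illustration, and it keeps spaces as separate items so that it mirrors the regex output shown above.

```python
def split_keep_numbers(text):
    """Group adjacent digits; every other character stays on its own."""
    result = []
    digits = ""  # digits collected for the number currently being read
    for ch in text:
        if ch.isdigit():
            digits += ch
            continue
        if digits:
            # A non-digit ends the current number, so flush it first.
            result.append(digits)
            digits = ""
        result.append(ch)
    if digits:
        # Flush a number that runs to the end of the string.
        result.append(digits)
    return result


print(split_keep_numbers("123ab4 5"))  # ['123', 'a', 'b', '4', ' ', '5']
```

It is noticeably longer than the one-line regex, which is why I would not recommend it.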

Upvotes: 0

John Gaines Jr.

Reputation: 11554

This will give the split you want:

re.findall(r'\d+|[a-zA-Z]+', "123ab4 5")

['123', 'ab', '4', '5']
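One caveat: [a-zA-Z]+ matches ASCII letters only, so anything else that is not a digit (punctuation, underscores, accented letters) is silently dropped. A small comparison on a made-up input, assuming you might care about such characters:

```python
import re

text = "12-ab_3 c"  # hypothetical input containing punctuation

# Letters only: '-' and '_' match neither alternative and disappear.
letters_only = re.findall(r'\d+|[a-zA-Z]+', text)

# Any run of non-digit, non-space characters is kept together.
non_digit_runs = re.findall(r'\d+|[^\d\s]+', text)

print(letters_only)    # ['12', 'ab', '3', 'c']
print(non_digit_runs)  # ['12', '-ab_', '3', 'c']
```

For the string in the question both patterns give the same result.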

Upvotes: 0

Jon Clements

Reputation: 142256

You could do:

>>> import re
>>> string = '123ab4 5'
>>> [el for el in re.split(r'(\d+)', string) if el.strip()]
['123', 'ab', '4', '5']
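To see why the filter is needed, here is the same call broken into two steps (the names parts and tokens are just for illustration). Because the pattern is wrapped in a capturing group, re.split keeps the matched digits in the result, along with empty and whitespace-only pieces between them:

```python
import re

string = "123ab4 5"

# The capturing group makes re.split return the digit runs as well.
parts = re.split(r'(\d+)', string)
print(parts)   # ['', '123', 'ab', '4', ' ', '5', '']

# Drop pieces that are empty or contain only whitespace.
tokens = [el for el in parts if el.strip()]
print(tokens)  # ['123', 'ab', '4', '5']
```

Note that whitespace inside a non-digit piece is not removed, so "a b1" would give ['a b', '1'].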

Upvotes: 1

RocketDonkey

Reputation: 37279

The other solutions are definitely easier. If you want something far less straightforward, you could try something like this:

>>> from itertools import groupby
>>> s = "123ab4 5"
>>> result = [''.join(v) for _, v in groupby(s, key=lambda x: x.isdigit())]
>>> result = [x for x in result if not x.isspace()]
>>> result
['123', 'ab', '4', '5']

Upvotes: 1
