aaaaa0a
aaaaa0a

Reputation: 137

How to split a strings with increced number in python

I'd like to split a string by increased number with python.

For example, I have a following string.

"1. aaa aaa aa. 2. bb bbbb bb. 3. cc cccc cc 4. ddd d dddd ... 99. z zzzz zzz"

And I want to get a following list from the above string.

[aaa aaa aa, bb bbbb bb, cc cccc cc, ddd d dddd, ... z zzzz zzz]

I tried it with following code, but I couldn't get what I wanted.

InputString = "1. aaa aaa aa. 2. bb bbbb bb. 3. cc cccc cc 4. ddd d dddd ... 99. z zzzz zzz"
OutputList = InputString.split("[1-99]. ")

Upvotes: 1

Views: 102

Answers (2)

Emma
Emma

Reputation: 27723

This expression might also work:

Test

import re

regex = r"(?<=[0-9]\.)\s*(.*?)(?=[0-9]{1,}\.|$)"
test_str = "1. aaa aaa aa. 2. bb bbbb bb. 3. cc cccc cc 4. ddd d dddd ... 99. z zzzz zzz"

print(re.findall(regex, test_str))

Output

['aaa aaa aa. ', 'bb bbbb bb. ', 'cc cccc cc ', 'ddd d dddd ... ', 'z zzzz zzz']

The expression is explained on the top right panel of regex101.com, if you wish to explore/simplify/modify it, and in this link, you can watch how it would match against some sample inputs, if you like.

Upvotes: 0

Iain Shelvington
Iain Shelvington

Reputation: 32244

You can use the re module to split your string by a regular expression

re.split(r'[0-9]+\.', input)

[0-9]+ matches 1 to many digits and \. matches the literal . character

EDIT:

You can prefix the regex with (\.\s)? to conditionally find leading periods at the end of each list of characters

re.split(r'(\.\s)?[0-9]+\.', input)

Upvotes: 4

Related Questions