Probs

Reputation: 353

Split number from string in Python

I would like to split a sentence into its text and the number at its end, for example: "It was amazing in 2016"

I use:

re.split(r'\s+(?=\d+)', 'It was amazing in 2016')
out: 'It was amazing in', '2016'

Now I would like to do the opposite: when a sentence starts with a number followed by text, like: '2016 was amazing'

I would like the result to be: '2016', 'was amazing'

Upvotes: 2

Views: 4615

Answers (3)

Wiktor Stribiżew

Reputation: 626738

Another easy way to split a string into digit and non-digit chunks is to match with the \d+|\D+ regex. The chunks keep any leading/trailing whitespace, but it can easily be stripped (or kept, if that does not matter):

import re
r = re.compile(r'\d+|\D+')
ss = [ 'It was amazing in 2016', '2016 was amazing']
for s in ss:
    print(r.findall(s))  # chunks with leading/trailing whitespace
    print([x.strip() for x in r.findall(s)])  # no leading/trailing whitespace

See the Python demo.
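
As a minimal sketch of the cleanup step described above, the stripping can be wrapped in one helper that also drops chunks consisting only of whitespace. The names `CHUNK_RE` and `split_digit_runs` are hypothetical, not from the answer, and the sketch assumes the original spacing does not need to be preserved.

```python
import re

# Same pattern as in the answer: a run of digits or a run of non-digits.
CHUNK_RE = re.compile(r'\d+|\D+')

def split_digit_runs(text):
    """Hypothetical helper: return the stripped digit/non-digit chunks of text."""
    chunks = (chunk.strip() for chunk in CHUNK_RE.findall(text))
    # A chunk that was only whitespace (e.g. the gap in '12 34') becomes ''
    # after strip(), so it is dropped here.
    return [chunk for chunk in chunks if chunk]

print(split_digit_runs('It was amazing in 2016'))  # ['It was amazing in', '2016']
print(split_digit_runs('2016 was amazing'))        # ['2016', 'was amazing']
```

Without the final filter, an input such as '12 34' would produce an empty string between the two numbers.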

Upvotes: 0

Dalvenjia

Reputation: 2033

In my opinion regex is overkill for this task, so unless you are already using regex in your program or it is required (by an assignment or otherwise), I recommend plain string methods to get what you want.

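def ends_in_digit(my_string):
    # Split off the last whitespace-separated word.
    separated = my_string.rsplit(maxsplit=1)
    # An empty or all-whitespace string gives [], so check before indexing.
    return separated if separated and separated[-1].isdigit() else False

def starts_with_digit(my_string):
    # Split off the first whitespace-separated word.
    separated = my_string.split(maxsplit=1)
    return separated if separated and separated[0].isdigit() else False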
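
A short usage sketch for the two functions above. They are repeated here so the block runs on its own, with an emptiness check so that an empty string returns False instead of raising IndexError. The wrapper `split_number` is a hypothetical convenience, not part of the answer: it tries the trailing-number case first, then the leading-number case.

```python
def ends_in_digit(my_string):
    separated = my_string.rsplit(maxsplit=1)
    return separated if separated and separated[-1].isdigit() else False

def starts_with_digit(my_string):
    separated = my_string.split(maxsplit=1)
    return separated if separated and separated[0].isdigit() else False

def split_number(my_string):
    """Hypothetical wrapper: try the trailing number, then the leading one."""
    return ends_in_digit(my_string) or starts_with_digit(my_string)

print(split_number('It was amazing in 2016'))  # ['It was amazing in', '2016']
print(split_number('2016 was amazing'))        # ['2016', 'was amazing']
print(split_number('no number here'))          # False
```

Because both functions return False on failure, callers should test the result before unpacking it.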

Upvotes: 0

anubhava

Reputation: 784998

Using lookarounds you can use a single regex for both cases:

\s+(?=\d)|(?<=\d)\s+

Code:

>>> import re
>>> s = "It was amazing in 2016"
>>> re.split(r'\s+(?=\d)|(?<=\d)\s+', s)
['It was amazing in', '2016']

>>> s = "2016 was amazing"
>>> re.split(r'\s+(?=\d)|(?<=\d)\s+', s)
['2016', 'was amazing']

RegEx Breakup:

  • \s+ - Match 1 or more whitespace characters
  • (?=\d) - Lookahead that asserts the next character is a digit
  • | - OR
  • (?<=\d) - Lookbehind that asserts the previous character is a digit
  • \s+ - Match 1 or more whitespace characters
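
To see how the two alternatives behave beyond the two sample sentences, here is a small sketch. The constant name `NUMBER_BOUNDARY` and the third test sentence are assumptions added for illustration. The pattern splits on whitespace next to any digit, so a number in the middle of a sentence yields three parts rather than two.

```python
import re

# Whitespace followed by a digit, or whitespace preceded by a digit.
NUMBER_BOUNDARY = re.compile(r'\s+(?=\d)|(?<=\d)\s+')

for sentence in ('It was amazing in 2016',
                 '2016 was amazing',
                 'In 2016 it was amazing'):
    print(NUMBER_BOUNDARY.split(sentence))
# ['It was amazing in', '2016']
# ['2016', 'was amazing']
# ['In', '2016', 'it was amazing']
```

If exactly two parts are needed, passing maxsplit=1 to split() limits the result to the first boundary.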

Upvotes: 5
