Ivan

Reputation: 20101

Match digits in a string with certain conditions in Python

I have a sequence of strings of the form

s1 = "Schblaum 12324 tunguska 24 234n"
s2 = "jacarta 331 matchika 22 234k"
s3 = "3239 thingolee 80394 234k"

and I need to split each of those strings in two, just after the number in the middle of the string, ignoring any number in the first part of the string. Something like

["Schblaum 12324", "tunguska 24 234n"]
["jacarta 331", "matchika 22 234k"]
["3239 thingolee 80394", "234k"]

I tried to use regex in the form

finder = re.compile(r"\D(\d+)\D")
finder.search(s1)

to no avail. Is there a way to do it, maybe without using regex? Cheers!

EDIT: just found a case where the initial string is just

"jacarta 43453"

with no other parts. This should return

["jacarta 43453"]

Upvotes: 3

Views: 105

Answers (2)

Adam Smith

Reputation: 54183

Even without regex, all you're doing is looking for the number and splitting after it. Try:

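s = "Schblaum 12324 tunguska 24 234n"
words = s.split()
idx = len(words) - 1  # fallback: no number found, so everything stays in `before`
for i, word in enumerate(words[1:], start=1):  # skip the first element
    if word.isdigit():
        idx = i  # index of the first all-digit word after the first one
        break
before = ' '.join(words[:idx + 1])
after = ' '.join(words[idx + 1:])  # empty string if nothing follows the number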
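As a rough sketch, the same idea can be wrapped in a function that returns a list, so that the single-part case from the question's edit comes back as a one-element list. The name split_after_number is made up for illustration, and returning the input unchanged when no number is found is an assumption about what you'd want there:

```python
def split_after_number(s):
    """Split s just after the first all-digit word that is not the first word.

    Hypothetical helper: returns one or two strings.
    """
    words = s.split()
    for idx, word in enumerate(words[1:], start=1):  # skip the first word
        if word.isdigit():
            head = " ".join(words[:idx + 1])
            tail = " ".join(words[idx + 1:])
            # Drop the second part when nothing follows the number.
            return [head, tail] if tail else [head]
    return [s]  # assumption: no number after the first word means no split


for sample in (
    "Schblaum 12324 tunguska 24 234n",
    "jacarta 331 matchika 22 234k",
    "3239 thingolee 80394 234k",
    "jacarta 43453",
):
    print(split_after_number(sample))
```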

You could also use re.split with a lookbehind to split on whitespace that directly follows a digit, but you'll have to post-process the result, since it also splits after a leading number.

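import re

s3 = "3239 thingolee 80394 234k"
parts = re.split(r"(?<=\d)\s", s3, maxsplit=2)  # split at most twice
if parts[0].isdigit() and len(parts) > 1:
    # the first split landed after the leading number, so glue it back on
    parts[:2] = [' '.join(parts[:2])]
before = parts[0]
after = ' '.join(parts[1:])  # empty string if there is no second part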
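To see why the post-processing is needed, here is a small sketch of what the bare split (no limit on the number of splits) returns for two of the sample strings. The samples and raw names are just for illustration:

```python
import re

samples = [
    "Schblaum 12324 tunguska 24 234n",
    "3239 thingolee 80394 234k",
]

# Split on every whitespace character that directly follows a digit.
raw = [re.split(r"(?<=\d)\s", s) for s in samples]
for pieces in raw:
    print(pieces)
# ['Schblaum 12324', 'tunguska 24', '234n']
# ['3239', 'thingolee 80394', '234k']
```

In the first case the split point you want is the first one; in the second, the leading number produces an extra split, so the first two pieces have to be joined back together.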

Upvotes: 0

Avinash Raj

Reputation: 174706

Use re.findall

>>> import re
>>> s1 = "Schblaum 12324 tunguska 24 234n"
>>> re.findall(r'^\S+\D*\d+|\S.*', s1)
['Schblaum 12324', 'tunguska 24 234n']
>>> s2 = "jacarta 331 matchika 22 234k"
>>> s3 = "3239 thingolee 80394 234k"
>>> re.findall(r'^\S+\D*\d+|\S.*', s2)
['jacarta 331', 'matchika 22 234k']
>>> re.findall(r'^\S+\D*\d+|\S.*', s3)
['3239 thingolee 80394', '234k']
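For anyone reading the pattern later, here is the same regex spelled out with re.VERBOSE so each piece can carry a comment. The comments are my reading of the pattern, and PATTERN is just an illustrative name:

```python
import re

PATTERN = re.compile(
    r"""
    ^\S+    # the first word, whatever it is (so a leading number is skipped)
    \D*     # any non-digit characters up to the next number
    \d+     # that number; the first piece ends here
    |       # otherwise...
    \S.*    # ...the rest of the string, starting at its first non-space character
    """,
    re.VERBOSE,
)

for sample in (
    "Schblaum 12324 tunguska 24 234n",
    "3239 thingolee 80394 234k",
    "jacarta 43453",
):
    print(PATTERN.findall(sample))
```

The last sample is the single-part case from the question's edit; since nothing is left after the number, findall returns a one-element list.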

Upvotes: 3
