kshnkvn

Reputation: 966

Split string by chars in python with re

I have a set of strings in which numbers can be separated by various characters, including letters:

12
14:45:09
2;32
04:43
434.34
43M 343ho

I want to get a list of these numbers for each row:

[12]
[14, 45, 9]
[2, 32]
[4, 43]
[434, 34]
[43, 343]

I tried the following, but it does not work:

>>> import re
>>> pattern = r'(\d*)'
>>> re.split(pattern, '12')
['', '12', '', '', '']
>>> re.split(pattern, '14:45:09')
['', '14', '', '', ':', '45', '', '', ':', '09', '', '', '']
>>> pattern = r'([0-9]*)'
>>> re.split(pattern, '14:45:09')
['', '14', '', '', ':', '45', '', '', ':', '09', '', '', '']
>>> re.split(pattern, '43M 343ho')
['', '43', '', '', 'M', '', ' ', '343', '', '', 'h', '', 'o', '', '']
>>>

How can this be done correctly?

Upvotes: 0

Views: 162

Answers (3)

Marek R

Reputation: 38219

from sys import stdin
import re

for line in stdin:
    result = [int(x) for x in re.split(r'\D+', line) if x]
    print(result)

https://ideone.com/izR1BV

or, using re.findall instead of re.split for the line inside the loop:

    result = [int(x) for x in re.findall(r'\d+', line)]

https://ideone.com/NQzQ72
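To illustrate why the re.split variant needs the `if x` filter while the re.findall variant does not, here is a minimal sketch. The function names are hypothetical and chosen only for this illustration; the patterns are the ones used above.

```python
import re

def numbers_via_split(line):
    # Split on runs of non-digits; the "if x" drops empty strings that
    # appear when the line starts or ends with a non-digit.
    return [int(x) for x in re.split(r'\D+', line) if x]

def numbers_via_findall(line):
    # findall returns only the digit runs, so no filtering is needed.
    return [int(x) for x in re.findall(r'\d+', line)]

# Without the filter, the trailing "ho" leaves an empty string behind.
raw_parts = re.split(r'\D+', '43M 343ho')
print(raw_parts)

print(numbers_via_split('43M 343ho'))
print(numbers_via_findall('43M 343ho'))
```

When reading from stdin, each line usually ends with a newline, which is also a non-digit, so the same empty trailing element would show up there as well.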

Upvotes: 1

The fourth bird

Reputation: 163642

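Instead of re.split you might use re.findall with a pattern that matches zero or more leading zeros and then captures one or more digits. Because the pattern contains a capture group, re.findall returns only the captured digits: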

0*(\d+)


For example

import re

regex = r"0*(\d+)"

strings = [
    "12",
    "14:45:09",
    "2;32",
    "04:43",
    "434.34",
    "43M 343ho"
]

for s in strings:
    print(re.findall(regex, s))

Output

['12']
['14', '45', '9']
['2', '32']
['4', '43']
['434', '34']
['43', '343']
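The output above contains strings, while the question asks for integers. As a possible variation (not part of the pattern shown above), converting each match with int() also discards leading zeros, so a plain \d+ would be enough in that case. The helper name `to_ints` is hypothetical.

```python
import re

def to_ints(s):
    # int() ignores leading zeros in a digit string, e.g. int('09') is 9,
    # so the matches do not need the 0* prefix before conversion.
    return [int(m) for m in re.findall(r"\d+", s)]

strings = ["12", "14:45:09", "2;32", "04:43", "434.34", "43M 343ho"]

for s in strings:
    print(to_ints(s))
```

This keeps the regex simpler at the cost of an explicit conversion step; if string results are wanted, the 0*(\d+) pattern remains the way to strip the zeros.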

Upvotes: 3

Brian Makin

Reputation: 907

With str.split:

"14:45:09".split(':')

The argument to split is the separator string on which to split.

With re:

re.split(r':', "14:45:09")
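Splitting on a single separator covers only one of the inputs in the question. A minimal sketch of extending the re.split call to several known separators with a character class follows; the separator set and the names `SEPARATORS` and `split_on_separators` are assumptions made for illustration.

```python
import re

# Hypothetical separator set, assumed for this illustration only.
SEPARATORS = r'[:;.]'

def split_on_separators(s):
    # Inside a character class the dot is literal, so this splits on
    # a colon, a semicolon, or a period.
    return re.split(SEPARATORS, s)

print(split_on_separators('14:45:09'))
print(split_on_separators('434.34'))

# Letters and spaces are not in the class, so this input is not split.
print(split_on_separators('43M 343ho'))
```

This approach returns strings and only works when the separators are known in advance; inputs such as '43M 343ho' need a pattern that treats every non-digit as a separator.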

Upvotes: 0
