Vivek

Reputation: 3613

What's the best way to split a string into integer part and string part?

I have a string like "11547QSD". I would like to split it into two parts, "11547" and "QSD". I got a hint to use the isnumeric() function. An overview of what I want is below. Please suggest the best way to split this.

 str1 = "11547QSD"    # is a valid string (in my context)
 str2 = "ABC98765"    # is a valid string
 str3 = "111ABC111"   # is not a valid string

 if str1.isvalid():
    str1_int = str1.integer_part()
    str1_str = str1.string_part()

Thanks in advance

Upvotes: 3

Views: 174

Answers (4)

Tadeck

Reputation: 137360

You can use regular expressions with named groups.

First, create the regular expressions (I created two, one for each case: digits first or letters first). Then check whether the input matches. If it does, call groupdict() on the resulting match object to get a dictionary like {'digits': '11547', 'letters': 'QSD'}, and use it as needed (I printed it).

Full example following the above advice:

>>> import re
>>> checks = [
    re.compile(r'^(?P<digits>\d+)(?P<letters>\D+)$'),
    re.compile(r'^(?P<letters>\D+)(?P<digits>\d+)$'),
]
>>> inputs = ['11547QSD', 'ABC98765', '111ABC111']
>>> for item in inputs:
    for check in checks:
        if check.match(item):
            print('Digits are {digits}, letters are {letters}'.format(
                **check.search(item).groupdict()
            ))
            break
    else:
        print('%s is incorrect' % (item,))


Digits are 11547, letters are QSD
Digits are 98765, letters are ABC
111ABC111 is incorrect

Shortened version

If you understand the above, you can shorten the code and build the resulting dict (mapping each matching string to its groups) like this:

>>> from itertools import product
>>> {item: check.search(item).groupdict()
     for (item, check) in product(inputs, checks) if check.match(item)}
{'ABC98765': {'digits': '98765', 'letters': 'ABC'},
'11547QSD': {'digits': '11547', 'letters': 'QSD'}}

Note:

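I used the metacharacters \d and \D. The first means "digit" and the second means "non-digit", so the letters group accepts any non-digit character, not only letters; use a class such as [A-Za-z] if only letters should be allowed. Both are described in the documentation for Python's re module.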
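Because \D means "non-digit", the patterns above would also accept punctuation or spaces in the second part. Here is a minimal sketch of a stricter variant, assuming that only ASCII letters should count as the string part. The helper name split_parts is hypothetical and not part of the original answer:

```python
import re

# Assumption: the "string part" may contain only ASCII letters.
STRICT_CHECKS = [
    re.compile(r'(?P<digits>\d+)(?P<letters>[A-Za-z]+)'),
    re.compile(r'(?P<letters>[A-Za-z]+)(?P<digits>\d+)'),
]

def split_parts(text):
    """Return (digits, letters) if text is valid, otherwise None."""
    for check in STRICT_CHECKS:
        # fullmatch() requires the whole string to match, so no ^...$ anchors are needed.
        match = check.fullmatch(text)
        if match:
            return match.group('digits'), match.group('letters')
    return None

print(split_parts('11547QSD'))   # ('11547', 'QSD')
print(split_parts('ABC98765'))   # ('98765', 'ABC')
print(split_parts('111ABC111'))  # None
print(split_parts('11547-+'))    # None (a \D+ pattern would accept this input)
```

Returning None for invalid input keeps the validity check and the split in a single call, which is roughly what the isvalid() / integer_part() / string_part() pseudocode in the question asks for.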

Upvotes: 5

FMc

Reputation: 42411

Mostly for fun:

import re

ss  = ["11547QSD", "ABC98765", "111ABC111"]

fmt = r'\A(?P<full>{0}{1})\Z'
ps  = [r'(?P<digits>\d+)', r'(?P<letters>[A-Z]+)']

fs  = [fmt.format(*sorted(ps, reverse = b)) for b in [False, True]]
rs  = [re.compile(f) for f in fs]
ms  = filter(None, (r.search(s) for s in ss for r in rs))
gds = [m.groupdict() for m in ms]

for gd in gds:
    print(gd)

# Output (key order may vary by Python version):
# {'digits': '11547', 'full': '11547QSD', 'letters': 'QSD'}
# {'digits': '98765', 'full': 'ABC98765', 'letters': 'ABC'}

Upvotes: 0

Elazar

Reputation: 21615

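from string import ascii_letters, digits
s = "11547QSD"  # example input from the question
# Strip letters to get the digits and digits to get the letters; digits sort first.
s_int, s_str = sorted([s.strip(ascii_letters), s.strip(digits)])
is_valid = s in {s_int+s_str, s_str+s_int}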
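The snippet above comes without explanation, so here is a sketch of how it behaves on the three strings from the question. The wrapper name split_by_strip and the non-empty guard are additions for illustration, not part of the original answer:

```python
from string import ascii_letters, digits

def split_by_strip(s):
    """Return (int_part, str_part) for digits+letters or letters+digits input, else None."""
    # Stripping letters from both ends leaves the digit run; stripping digits
    # leaves the letter run. sorted() puts the digit run first because ASCII
    # digits sort before ASCII letters.
    s_int, s_str = sorted([s.strip(ascii_letters), s.strip(digits)])
    # Added guard (not in the original snippet): reject all-digit or
    # all-letter input, where one of the two parts would be empty.
    if s_int and s_str and s in {s_int + s_str, s_str + s_int}:
        return s_int, s_str
    return None

for text in ["11547QSD", "ABC98765", "111ABC111"]:
    print(text, split_by_strip(text))
```

For "111ABC111", stripping letters removes nothing, so the two pieces cannot be concatenated back into the original string and the check fails.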

Upvotes: 0

yee

Reputation: 1985

I think a regex is the best solution. An example:

import re
re.split(r'(\d+|\(|\))', '11547QSD')
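For reference, here is a sketch of what the call above returns, plus one possible way to turn it into the validity check the question asks for. The simplified pattern and the helper name split_with_resplit are assumptions, not part of the original answer:

```python
import re

# The capturing group makes re.split() keep the digit run in the result.
# The leading '' appears because the string starts with a match.
print(re.split(r'(\d+|\(|\))', '11547QSD'))   # ['', '11547', 'QSD']

def split_with_resplit(text):
    """Return (digits, rest) if text is one digit run plus one non-digit run, else None."""
    # Split on digit runs only and drop the empty strings at the edges.
    parts = [p for p in re.split(r'(\d+)', text) if p]
    # Valid input yields exactly two pieces, exactly one of them numeric.
    if len(parts) == 2 and sum(p.isdigit() for p in parts) == 1:
        digits_part, rest = sorted(parts, key=str.isdigit, reverse=True)
        return digits_part, rest
    return None

print(split_with_resplit('ABC98765'))    # ('98765', 'ABC')
print(split_with_resplit('111ABC111'))   # None
```

Note that this sketch accepts any non-digit characters in the second part, not only letters.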

Upvotes: 0
