Reputation: 3613
I have a string like "11547QSD" and would like to split it into two parts, "11547" and "QSD". I was given a hint to use the isnumeric() function. An overview of what I want is below. Please suggest the best way to split it.
str1 = "11547QSD" # is a valid string (in my context)
str2 = "ABC98765" # is a valid string
str3 = "111ABC111" # is not a valid string
if str1.isvalid():
    str1_int = str1.integer_part()
    str1_str = str1.string_part()
Thanks in advance
Upvotes: 3
Views: 174
Reputation: 137360
You can use regular expressions with named groups.
First, create the regular expressions (I created two, one for each case: digits first or letters first). Then check whether the input matches. If it does, call groupdict() on the resulting match object to get a dictionary like {'digits':'11547', 'letters':'QSD'}. Then just use it (I printed it).
Full example following the above advice:
>>> import re
>>> checks = [
...     re.compile(r'^(?P<digits>\d+)(?P<letters>\D+)$'),
...     re.compile(r'^(?P<letters>\D+)(?P<digits>\d+)$'),
... ]
>>> inputs = ['11547QSD', 'ABC98765', '111ABC111']
>>> for item in inputs:
...     for check in checks:
...         if check.match(item):
...             print('Digits are {digits}, letters are {letters}'.format(
...                 **check.search(item).groupdict()
...             ))
...             break
...     else:
...         print('%s is incorrect' % (item,))
...
Digits are 11547, letters are QSD
Digits are 98765, letters are ABC
111ABC111 is incorrect
If you understand the above, you can shorten the code and build the resulting dict (mapping each matching string to its groups) like this:
>>> from itertools import product
>>> {item: check.search(item).groupdict()
...  for (item, check) in product(inputs, checks) if check.match(item)}
{'ABC98765': {'digits': '98765', 'letters': 'ABC'},
 '11547QSD': {'digits': '11547', 'letters': 'QSD'}}
Note:
I used the metacharacters \d and \D. The first matches any digit and the second matches any non-digit character. The details are in the documentation of Python's re module.
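One caveat that may be worth sketching: because \D matches any non-digit, the patterns above also accept punctuation or whitespace as "letters" (for example '123-!'). If only ASCII letters and digits should qualify, a stricter variant might look like the following. The helper name split_code and the character classes [A-Za-z] and [0-9] are my assumptions, not part of the original answer.

```python
import re

# Hypothetical stricter patterns: only ASCII letters and ASCII digits count.
PATTERNS = [
    re.compile(r'(?P<digits>[0-9]+)(?P<letters>[A-Za-z]+)'),
    re.compile(r'(?P<letters>[A-Za-z]+)(?P<digits>[0-9]+)'),
]


def split_code(text):
    """Return (digits, letters) if text is valid, otherwise None."""
    for pattern in PATTERNS:
        # fullmatch requires the whole string to match, so no ^ or $ is needed.
        match = pattern.fullmatch(text)
        if match:
            return match.group('digits'), match.group('letters')
    return None


print(split_code('11547QSD'))   # ('11547', 'QSD')
print(split_code('ABC98765'))   # ('98765', 'ABC')
print(split_code('111ABC111'))  # None
print(split_code('123-!'))      # None
```

Returning None for invalid input keeps the validity check and the split in one call, which is close to the isvalid() / integer_part() / string_part() interface sketched in the question.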
Upvotes: 5
Reputation: 42411
Mostly for fun:
import re
ss = ["11547QSD", "ABC98765", "111ABC111"]
fmt = r'\A(?P<full>{0}{1})\Z'
ps = [r'(?P<digits>\d+)', r'(?P<letters>[A-Z]+)']
# Build both orderings: digits then letters, and letters then digits.
fs = [fmt.format(*sorted(ps, reverse = b)) for b in [False, True]]
rs = [re.compile(f) for f in fs]
ms = filter(None, (r.search(s) for s in ss for r in rs))
gds = [m.groupdict() for m in ms]
for gd in gds:
    print(gd)
# Output (key order may differ depending on the Python version):
# {'digits': '11547', 'full': '11547QSD', 'letters': 'QSD'}
# {'digits': '98765', 'full': 'ABC98765', 'letters': 'ABC'}
Upvotes: 0
Reputation: 21615
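from string import ascii_letters, digits
s = "11547QSD"  # example input
# Digits sort before letters, so the digit part comes first after sorting.
s_int, s_str = sorted([s.strip(ascii_letters), s.strip(digits)])
is_valid = s in {s_int+s_str, s_str+s_int}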
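To see how this behaves on the strings from the question, the snippet above can be wrapped in a function. This is only a sketch: the name split_by_strip is mine, and I added a check that both parts are non-empty, since the expression as written also accepts strings made of only digits or only letters.

```python
from string import ascii_letters, digits


def split_by_strip(s):
    """Return (digit_part, letter_part), or None if s is not valid."""
    # Stripping letters from both ends leaves the digit part, and vice versa.
    # Digits sort before letters in ASCII, so the digit part comes first.
    s_int, s_str = sorted([s.strip(ascii_letters), s.strip(digits)])
    # Added assumption: both parts must be non-empty for the string to be valid.
    is_valid = bool(s_int) and bool(s_str) and s in {s_int + s_str, s_str + s_int}
    return (s_int, s_str) if is_valid else None


for s in ["11547QSD", "ABC98765", "111ABC111"]:
    print(s, split_by_strip(s))
```

For "111ABC111" stripping the letters removes nothing, so the two parts cannot be concatenated back into the original string and the check fails.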
Upvotes: 0
Reputation: 1985
I think a regex is the best solution here. For example:
import re
re.split(r'(\d+|\(|\))', '11547QSD')
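Note that re.split keeps the captured digit run in the result and leaves an empty string where the input starts or ends with it, and it does not validate anything by itself. A minimal sketch that drops the empty pieces follows. The name split_parts is hypothetical, and the pattern is simplified to (\d+) on the assumption that the parenthesis alternatives are not needed for these inputs.

```python
import re


def split_parts(text):
    # The capturing group makes re.split keep the digit runs in the result;
    # the filter drops the empty strings produced at the start or end.
    return [part for part in re.split(r'(\d+)', text) if part]


print(split_parts('11547QSD'))   # ['11547', 'QSD']
print(split_parts('ABC98765'))   # ['ABC', '98765']
print(split_parts('111ABC111'))  # ['111', 'ABC', '111']
```

A string that is valid in the question's sense yields exactly two parts, so a length check on the result would still be needed to reject inputs such as '111ABC111'.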
Upvotes: 0