Tasdik Rahman

Reputation: 2340

Splitting strings into text and number?

Let’s say I have this string 'foo1bar2xyz'

I know the indexes for the digits in it {'1': 3, '2': 7}

I want to form substrings of the parent string which don’t have the numbers. How would I get substrings of a string removing particular indexes?

In the above case, that would be ['foo', 'bar', 'xyz']

I have tried this so far:

def iterate_string(og_string, start, stop):
    if start == 0:
        return og_string[:stop]
    else:
        return og_string[start+1:stop]

def ret_string(S):
    digit_dict = {c:i for i,c in enumerate(S) if c.isdigit()}
    digit_positions = list(digit_dict.values())
    # return digit_positions
    substrings = []
    start_index = 0
    for position in digit_positions:
        p = iterate_string(S, start_index, position)
        substrings.append(p)
        start_index = position

    return substrings


print ret_string('foo1bar2xyz')

But this returns ['foo', 'bar']


Upvotes: 0

Views: 170

Answers (3)

Martijn Pieters

Reputation: 1125408

If you already have the indices and want to use them as the input, you can split on those positions directly:

def split_by_indices(s, indices):
    ends = sorted(indices.values())  # we only need the positions
    ends.append(len(s))
    substrings = []
    start = 0
    for end in ends:
        substrings.append(s[start:end])
        start = end + 1
    return substrings

Demo:

>>> split_by_indices('foo1bar2xyz', {'1': 3, '2': 7})
['foo', 'bar', 'xyz']

This ignores any actual numeric values in the input string and uses the [3, 7] positions from your dictionary only.
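One caveat with a map keyed by the digit character: a dictionary keeps only one position per distinct digit, so a string such as 'a1b1c' would lose an index. Below is a minimal sketch, assuming a plain list of positions is acceptable as input. The names digit_positions and split_at_positions are hypothetical and do not come from the question.

```python
def digit_positions(s):
    """Return the index of every digit in s, in order (repeated digits kept)."""
    return [i for i, c in enumerate(s) if c.isdigit()]


def split_at_positions(s, positions):
    """Return the pieces of s between the given positions, dropping those characters."""
    substrings = []
    start = 0
    # Append len(s) so the tail after the last position is included.
    for end in sorted(positions) + [len(s)]:
        substrings.append(s[start:end])
        start = end + 1
    return substrings


print(split_at_positions('foo1bar2xyz', digit_positions('foo1bar2xyz')))
print(split_at_positions('a1b1c', digit_positions('a1b1c')))
```

With a list, the repeated '1' in 'a1b1c' contributes both of its positions, so no separator is skipped.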

However, if you are currently building the {'1': 3, '2': 7} map just to split your string, it is probably easier to just use a regular expression:

import re

split_by_digits = re.compile(r'\d').split
result = split_by_digits(inputstring)
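For illustration, here is the compiled splitter applied to the string from the question. The second pattern, r'\d+', is an addition that is not part of the answer above: it treats a run of consecutive digits as one separator, which avoids empty strings between adjacent digits. The name split_by_digit_runs and the sample 'foo12bar' are made up for this sketch.

```python
import re

# Split on every single digit.
split_by_digits = re.compile(r'\d').split
# Split on runs of one or more digits (assumed variant, not from the answer).
split_by_digit_runs = re.compile(r'\d+').split

print(split_by_digits('foo1bar2xyz'))    # ['foo', 'bar', 'xyz']
print(split_by_digits('foo12bar'))       # ['foo', '', 'bar']
print(split_by_digit_runs('foo12bar'))   # ['foo', 'bar']
```

Which pattern fits depends on whether an empty piece between two adjacent digits is meaningful for your data.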

Upvotes: 2

khelili miliana

Reputation: 3822

You can do it with a regular expression, using the re module:

import re
h = "foo1bar2xyz"
l = re.compile(r"\d").split(h)  # raw string avoids an invalid-escape warning

Output:

['foo', 'bar', 'xyz']

Upvotes: 4

fafl

Reputation: 7387

Try this:

import re
l = re.compile("[0-9]").split(s)  # s is the input string

Upvotes: 2
