David__

Reputation: 375

Parsing strings and integers (with ranges) from a string into a list using Python

I want to take a string that looks like this: 'Oreo.12.37-40.Apple.78' and turn it into a list that looks like this:

['Oreo', 12, 37, 38, 39, 40, 'Apple', 78]

A '-' denotes an inclusive range of integers, so '37-40' becomes [37, 38, 39, 40].

https://stackoverflow.com/a/5705014/1099935 has a really nice solution, but I don't understand it well enough to adapt it to handle strings. (It works great with numbers only, but int() fails on the string parts.)

Comments are locked for me(?), so here is an additional note: I need the list to contain either int values or strings. Then, using Q objects in the filters, items can be filtered by common name or by the assigned product key (generally short and also commonly used).

Upvotes: 0

Views: 135

Answers (2)

the wolf

Reputation: 35532

This does it:

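#!/usr/bin/env python3

import re

s = 'Oreo.12.37-40.Apple.78'
l = []
# Split on anything that is not a word character or a hyphen (here, the dots).
for e in re.split(r'[^\w-]+', s):
    # A token such as '37-40' is an inclusive integer range.
    m = re.fullmatch(r'(\d+)-(\d+)', e)
    if m:
        x = int(m.group(1))
        y = int(m.group(2))
        l.extend(range(x, y + 1))
    else:
        try:
            l.append(int(e))
        except ValueError:
            # Not a number: keep the token as a string.
            l.append(e)

print(l)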

Output:

['Oreo', 12, 37, 38, 39, 40, 'Apple', 78]

Upvotes: 1

Andrew Clark

Reputation: 208485

Here is an option:

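def custom_split(s):
    def int_range_expand(s):
        try:
            # Plain integer such as '12'
            return [int(s)]
        except ValueError:
            try:
                # Range such as '37-40'; anything else raises ValueError
                start, end = map(int, s.split('-'))
                # list() is needed on Python 3, where range() is lazy
                return list(range(start, end + 1))
            except ValueError:
                pass
        # Neither an integer nor a range: keep the string as-is
        return [s]
    return sum(map(int_range_expand, s.split('.')), [])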

>>> custom_split('Oreo.12.37-40.Apple.78')
['Oreo', 12, 37, 38, 39, 40, 'Apple', 78]

This uses an EAFP approach, with the steps broken down below:

1. custom_split('Oreo.12.37-40.Apple.78')
2. s <= 'Oreo.12.37-40.Apple.78'
3. s.split('.') => ['Oreo', '12', '37-40', 'Apple', '78']
4. map(int_range_expand, ['Oreo', '12', '37-40', 'Apple', '78'])
       => [['Oreo'], [12], [37, 38, 39, 40], ['Apple'], [78]]
5. sum([['Oreo'], [12], [37, 38, 39, 40], ['Apple'], [78]], [])
       => ['Oreo', 12, 37, 38, 39, 40, 'Apple', 78]

The int_range_expand() function from step 4 always returns a list. If the argument is a non-numeric string or a single integer, the result has exactly one element; if it is a range like 37-40, it contains every integer in that range, endpoints included. This allows us to chain all of the resulting lists into a single list easily.

Step 5 does the same job as itertools.chain, which is more efficient (sum() builds a new list at every addition) but requires importing a module. Which one is preferable is up to you.
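For comparison, here is a rough sketch of the itertools.chain variant of step 5. The name custom_split_chain is made up for this illustration, and int_range_expand is pulled out to module level only to keep the example short; the behaviour is meant to match custom_split above.

```python
from itertools import chain


def int_range_expand(token):
    """Return a list: [int], the expanded inclusive range, or [token] unchanged."""
    try:
        return [int(token)]
    except ValueError:
        pass
    try:
        # Raises ValueError for non-numeric parts or for anything
        # other than exactly two parts around a single '-'.
        start, end = map(int, token.split('-'))
    except ValueError:
        return [token]
    return list(range(start, end + 1))


def custom_split_chain(s):
    # Hypothetical name. chain.from_iterable flattens the per-token lists
    # in one pass instead of re-copying the accumulated list as sum() does.
    return list(chain.from_iterable(int_range_expand(t) for t in s.split('.')))


print(custom_split_chain('Oreo.12.37-40.Apple.78'))
# ['Oreo', 12, 37, 38, 39, 40, 'Apple', 78]
```

For short inputs like the one in the question the difference is negligible, so the choice mostly comes down to taste.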

Upvotes: 1
