mattwilsn

Reputation: 188

Find all integers and substrings between two characters

example_string = "name[0]['subName'][0][1]"

I would like a list of the substrings and integers between [ and ].

The result would be:

[0, 'subName', 0, 1]

I believe re.findall() would work, but I'm not sure what the regex would be.

Upvotes: 1

Views: 805

Answers (2)

Kijewski

Reputation: 26043

If you know that the indexes are either single-quoted strings or integers, you can use this list comprehension:

[ g[1:-1] if i else int(g)
  for m in re.finditer(r"\[(?:(\d+)|('.*?'))\]", example_string)
  for i, g in enumerate(m.groups()) if g is not None ]

All matches m in the input:

for m in re.finditer(r"\[(?:(\d+)|('.*?'))\]", example_string) 

For every match: i=0 for the integer alternative, or i=1 for the string alternative:

for i, g in enumerate(m.groups()) if g is not None

Strip quotes or cast to an integer:

g[1:-1] if i else int(g) 
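For readers who find the nested comprehension dense, the same idea can be sketched as an explicit loop. This is only an illustrative rewrite under the same assumption (every index is an integer or a single-quoted string); the names `INDEX_PATTERN` and `parse_indexes` are made up for this example and are not part of the answer above.

```python
import re

example_string = "name[0]['subName'][0][1]"

# Same pattern as above: group 1 is an integer index, group 2 a single-quoted string.
# INDEX_PATTERN and parse_indexes are hypothetical names used only for this sketch.
INDEX_PATTERN = re.compile(r"\[(?:(\d+)|('.*?'))\]")

def parse_indexes(text):
    """Return the bracketed indexes of `text` as ints and unquoted strings."""
    result = []
    for m in INDEX_PATTERN.finditer(text):
        number, quoted = m.groups()  # exactly one of the two is not None
        if number is not None:
            result.append(int(number))   # integer alternative matched
        else:
            result.append(quoted[1:-1])  # string alternative: drop the surrounding quotes
    return result

print(parse_indexes(example_string))  # [0, 'subName', 0, 1]
```

Anything between brackets that is neither a run of digits nor a single-quoted string is simply skipped by this pattern.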

Upvotes: 4

Joran Beasley

Reputation: 114038

re_results = re.findall(r"\[(.*?)\]", example_string) might work for you (assuming you don't have nested brackets). I suspect the key you were missing was the non-greedy match. A greedy pattern, "\[(.*)\]", matches as much as possible, so it captures everything from the first open bracket to the last close bracket as a single match. The non-greedy "\[(.*?)\]" matches as little as possible, so each match covers only one bracket pair.

(You will need to manually cast to integers if that's what you actually want.)
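To see the difference between the greedy and non-greedy quantifiers on the question's string, here is a small sketch. The variable names `greedy` and `lazy` are chosen just for this illustration.

```python
import re

example_string = "name[0]['subName'][0][1]"

# Greedy: .* grabs as much as it can, so one match spans first "[" to last "]".
greedy = re.findall(r"\[(.*)\]", example_string)

# Non-greedy: .*? stops at the nearest "]", giving one match per bracket pair.
lazy = re.findall(r"\[(.*?)\]", example_string)

print(greedy)  # ["0]['subName'][0][1"]
print(lazy)    # ['0', "'subName'", '0', '1']
```

Note that the non-greedy results are all strings, with the quotes still attached to 'subName', which is why a casting step is still needed.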

You could do something simple like:

def castIntOrDont(x):
    try:
        return int(x)
    except ValueError:
        return x.strip("'")  # strip the extra quotes from strings

print(list(map(castIntOrDont, re_results)))  # list() is needed in Python 3, where map() returns an iterator

Upvotes: 7
