arun

Reputation: 138

Python: split a string by commas not inside a matrix expression

I want to split a string on commas that are not inside a matrix expression. For example:

input:

value = 'MA[1,2],MA[1,3],der(x),x,y'

expected output:

['MA[1,2]','MA[1,3]','der(x)','x','y']

I tried value.split(','), but it also splits inside the []. I also tried to extract the text inside [] with this regular expression:

import re
re.split(r'\[(.*?)\]', value)

I am not good with regular expressions. Any suggestions would be helpful.

Upvotes: 2

Views: 117

Answers (1)

Sede

Reputation: 61235

You can use a negative lookbehind:

>>> import re
>>> value1 = 'MA[1,2],MA[1,3],der(x),x,y'
>>> value2 = 'M[a,b],x1,M[1,2],der(x),y1,y2,der(a,b)'
>>> pat = re.compile(r'(?<![[()][\d\w]),')
>>> pat.split(value1)
['MA[1,2]', 'MA[1,3]', 'der(x)', 'x', 'y']
>>> pat.split(value2)
['M[a,b]', 'x1', 'M[1,2]', 'der(x)', 'y1', 'y2', 'der(a,b)']
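
Outside an interactive session, the same idea might look like the sketch below. The function name split_commas is hypothetical. The bracket is written as \[ because recent Python versions can emit a FutureWarning about a possible nested set when a character class begins with [[, and [\d\w] is shortened to \w since \w already covers digits.

```python
import re

# Same lookbehind as in the session above, with the literal "[" escaped
# and [\d\w] shortened to \w (which already includes digits).
PATTERN = re.compile(r'(?<![\[()]\w),')


def split_commas(value):
    """Split on commas unless the two characters before the comma are
    a bracket/parenthesis followed by one word character."""
    return PATTERN.split(value)


if __name__ == "__main__":
    print(split_commas('MA[1,2],MA[1,3],der(x),x,y'))
    print(split_commas('M[a,b],x1,M[1,2],der(x),y1,y2,der(a,b)'))
```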


Explanation:

  • "(?<![[()][\d\w]),"g

    • (?<![[()][\d\w]) Negative lookbehind: asserts that the two characters immediately before the current position do not match the sub-pattern below

      • [[()] matches a single character from the list: [, ( or ) literally
      • [\d\w] matches a single character that is a digit (\d, i.e. [0-9]) or a word character (\w, i.e. [a-zA-Z0-9_])

    • , matches the character , literally

    • g modifier: global; find all matches instead of stopping at the first. This is regex-tester notation; in Python, re.split already splits on every match.
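
Because the lookbehind inspects only the two characters before each comma, it protects a comma only when a single word character follows the opening bracket or parenthesis; an input such as 'MA[10,2]' would still be split at the inner comma. A different technique, not part of the answer above, is to track bracket depth while scanning the string. This is a minimal sketch, assuming brackets and parentheses are balanced; split_top_level is a hypothetical name.

```python
def split_top_level(text, sep=","):
    """Split text on sep, skipping separators nested inside [] or ()."""
    parts = []
    current = []
    depth = 0  # number of currently open [ or ( characters
    for ch in text:
        if ch in "[(":
            depth += 1
        elif ch in "])":
            depth = max(depth - 1, 0)  # tolerate a stray closing bracket
        if ch == sep and depth == 0:
            # Top-level separator: finish the current piece.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


if __name__ == "__main__":
    print(split_top_level('MA[1,2],MA[1,3],der(x),x,y'))
    print(split_top_level('MA[10,2],x'))
```

With this version, split_top_level('MA[10,2],x') returns ['MA[10,2]', 'x'], at the cost of a manual loop instead of a single pattern.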

Upvotes: 4
