vds5344

Reputation: 41

Regex to select every semicolon except the ones enclosed within square brackets []

I want to split a string on semicolons, except for the semicolons inside square brackets.

string="'[Forsyth, Jennifer K.; Asarnow, Robert F.] Univ Calif Los Angeles, Dept Psychol, Los Angeles, CA 90095 USA; [Bachman, Peter] Univ Pittsburgh, Dept Psychiat, Pittsburgh, PA 15213 USA; [Mathalon, Daniel H.] Univ Calif San Francisco, Dept Psychiat, San Francisco, CA 94143 USA; [Mathalon, Daniel H.; Roach, Brian J.] San Francisco VA Med Ctr, San Francisco, CA 94121 USA; [Asarnow, Robert F.] Univ Calif Los Angeles, Dept Psychiat & Biobehav Sci, Los Angeles, CA 90095 USA'"

When I used

strung=filter(None, re.split("[;]", string))

the output was

["'[Forsyth, Jennifer K.",

 ' Asarnow, Robert F.] Univ Calif Los Angeles, Dept Psychol, Los Angeles, CA 90095 USA',

 ' [Bachman, Peter] Univ Pittsburgh, Dept Psychiat, Pittsburgh, PA 15213 USA',

This split on every semicolon, including the ones inside the square brackets. How do I keep the bracketed groups intact, semicolons included, and split only on the other semicolons?

Upvotes: 1

Views: 86

Answers (2)

n1c9

Reputation: 2687

Brackets have a special meaning in regular expressions: they define a character class that matches any single character from the listed set, so [;] is equivalent to a plain ;. To match the brackets literally, escape them:

\[;\]

This escapes the brackets, so the pattern matches the literal text [;].
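
As a small illustration of the difference between the character class and the escaped form, here is a minimal sketch. The sample string and variable names are made up for the demonstration. Note that neither pattern skips semicolons inside brackets.

```python
import re

# Hypothetical sample string, chosen only to show the two behaviours.
sample = "[a; b] c; d [;] e"

# "[;]" is a character class containing only ';', so it splits on every ';'.
class_split = re.split(r"[;]", sample)

# r"\[;\]" matches only the literal three-character text "[;]".
literal_split = re.split(r"\[;\]", sample)

print(class_split)
print(literal_split)
```
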

Upvotes: 2

anubhava

Reputation: 785256

You can use a negative lookahead based regex for splitting:

strung = filter(None, re.split(r';(?![^\[\]]*\])', string))

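(?![^\[\]]*\]) is a negative lookahead asserting that the ; is not followed by a ] without any [ or ] in between, i.e. that the ; is not inside [...]. This assumes the brackets are balanced and not nested.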

RegEx Demo

Output:

'[Forsyth, Jennifer K.; Asarnow, Robert F.] Univ Calif Los Angeles, Dept Psychol, Los Angeles, CA 90095 USA
[Bachman, Peter] Univ Pittsburgh, Dept Psychiat, Pittsburgh, PA 15213 USA
[Mathalon, Daniel H.] Univ Calif San Francisco, Dept Psychiat, San Francisco, CA 94143 USA
[Mathalon, Daniel H.; Roach, Brian J.] San Francisco VA Med Ctr, San Francisco, CA 94121 USA
[Asarnow, Robert F.] Univ Calif Los Angeles, Dept Psychiat & Biobehav Sci, Los Angeles, CA 90095 USA'
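
For anyone who wants to try the lookahead split on a smaller input, here is a minimal self-contained sketch. The shortened sample string and the names text, pattern and parts are placeholders rather than part of the original question. Stripping the whitespace around each piece is an extra step added here, and the sketch assumes the brackets are balanced and not nested.

```python
import re

# Shortened, hypothetical sample in the same shape as the question's string.
text = "[A, B.; C, D.] Univ X, City, CA 90095 USA; [E, F.] Univ Y, City, PA 15213 USA"

# Split on ';' unless a ']' comes next with no '[' or ']' before it,
# i.e. unless the ';' sits inside a [...] group.
pattern = re.compile(r";(?![^\[\]]*\])")

# Strip surrounding whitespace and drop any empty pieces.
parts = [piece.strip() for piece in pattern.split(text) if piece.strip()]

for part in parts:
    print(part)
```

The list comprehension does the same job as filter(None, ...) in the one-liner above, but it always yields a list, whereas filter returns a lazy iterator on Python 3.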

Upvotes: 4
