annie

Reputation: 1347

How can I remove all content in brackets except entirely numerical content?

I want to take a string and remove all occurrences of characters within square brackets:

[foo], [foo123bar], and [123bar] should be removed

But I want to keep intact any brackets consisting of only numbers:

[1] and [123] should remain

I've tried a couple of things, to no avail:

text = text.replace(/\[^[0-9+]\]/gi, "");

text = text.replace(/\[^[\d]\]/gi, "");

Upvotes: 0

Views: 516

Answers (3)

Alan Moore

Reputation: 75242

The tool you're looking for is negative lookahead. Here's how you would use it:

text = text.replace(/\[(?!\d+\])[^\[\]]+\]/g, "");

After \[ locates an opening bracket, the lookahead (?!\d+\]) asserts that what follows is not a run of digits ending in a closing bracket, i.e. that the brackets do not contain only digits.

Then, [^\[\]]+ matches anything that's not square brackets, ensuring (for example) that you don't accidentally match "nested" brackets, like [[123]].

Finally, \] matches the closing bracket.
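To see the pieces working together, here is a small sketch that runs the pattern above on a few inputs. The function name stripNonNumericBrackets and the sample strings are made up for illustration; only the regex comes from this answer.

```javascript
// Hypothetical helper wrapping the regex from this answer.
function stripNonNumericBrackets(text) {
  // \[            opening bracket
  // (?!\d+\])     fail if the content is digits only
  // [^\[\]]+      one or more characters that are not brackets
  // \]            closing bracket
  return text.replace(/\[(?!\d+\])[^\[\]]+\]/g, "");
}

console.log(stripNonNumericBrackets("a[foo]b[1]c[123bar]d[123]"));
// → ab[1]cd[123]

console.log(stripNonNumericBrackets("[[123]]"));
// → [[123]]  (the inner [123] is all digits, the outer pair contains brackets)
```

Because [^\[\]]+ needs at least one character, an empty pair such as [] is left alone; swap + for * if empty brackets should be removed too.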

Upvotes: 2

Bart

Reputation: 1653

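In Python:

import re

text = '[foo] [foo123bar] [123bar] [foo123] [1] [123]'
# Remove every bracket pair whose content includes at least one non-digit.
# [^\]]* keeps each match inside a single pair; a greedy .* here could span
# several pairs and swallow a numeric one such as [1] along the way.
print(re.sub(r'\[[^\]]*[^0-9\]][^\]]*\]', '', text))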

Upvotes: 0

Tamás

Reputation: 48071

You probably need this:

text = text.replace(/\[[^\]]*[^0-9\]][^\]]*\]/gi, "");

Explanation: you want to keep the bracketed sequences that contain only digits. Put another way, you want to delete the sequences that 1) are enclosed in brackets, 2) contain no closing bracket, and 3) contain at least one non-digit character. The regex above matches an opening bracket (\[), followed by any run of characters other than the closing bracket ([^\]]*, where the closing bracket has to be escaped), then one non-digit character that is also not a closing bracket ([^0-9\]]), then another run of characters other than the closing bracket, and finally the closing bracket (\]).
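As a quick check of that description, the sketch below applies the same pattern to a few sample strings. The function name removeBracketsWithNonDigit and the inputs are invented for illustration, and the i flag is dropped because the pattern contains no letters.

```javascript
// Hypothetical helper wrapping the regex from this answer.
function removeBracketsWithNonDigit(text) {
  // \[          opening bracket
  // [^\]]*      anything except a closing bracket
  // [^0-9\]]    one character that is neither a digit nor a closing bracket
  // [^\]]*      anything except a closing bracket
  // \]          closing bracket
  return text.replace(/\[[^\]]*[^0-9\]][^\]]*\]/g, "");
}

console.log(removeBracketsWithNonDigit("a[foo]b[1]c[123bar]d[123]e[1a2]f"));
// → ab[1]cd[123]ef
```

One thing to be aware of: an opening bracket counts as a non-digit here, so nested input such as [[123]] is matched from the outer bracket and reduced to ]. If nested brackets can occur in your data, excluding \[ from the character classes as well may be the safer choice.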

Upvotes: 2
