Tamjid

Reputation: 5566

Regex to find acronyms inside parentheses

I am trying to develop a regex to match acronyms inside parentheses.

regex = r"\(\b[A-Z][a-zA-Z\.]*[A-Z]\b\.?\)"

This is what I have so far. It works for most cases, but not all. An acronym on its own line (not preceded or followed by any other characters) also gets matched by this regex, even when it is not surrounded by parentheses.

Input it should match:
Any acronym in parentheses, e.g.:
(ADF)

Input it shouldn't match, but does:
An acronym on its own line, e.g.:
ADF

Any idea what I have done wrong?

Upvotes: 1

Views: 353

Answers (1)

Mark Moretto

Reputation: 2348

Here's what I have so far. I'm using a named group (the (?P<name>...) "extension notation") to capture the match. Otherwise, the pattern is basically r"\([A-Z]\w*[A-Z]\)".

import re

tst1 = "this is (ADF) a test"
tst2 = "This is is ADF a test, too."

# Newline tests
tst3 = "\n(ADF)\n"
tst4 = "\nADF\n"

# Upper/lowercase test.    
tst5 = "This is (AdF) a test."
tst6 = "This is (Adf) a test."

def retest(testcase):
    res = re.search(r"(?P<acronym>\([A-Z]\w*[A-Z]\))", testcase)
    if res:
        print(res.group("acronym"))

retest(tst1)  # prints (ADF)
retest(tst2)  # prints nothing (no match)

retest(tst3)  # prints (ADF)
retest(tst4)  # prints nothing (no match)

retest(tst5)  # prints (AdF)
retest(tst6)  # prints nothing (no match)
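
If you want only the acronym itself (without the parentheses), and every occurrence rather than just the first, something like the following might work. This is a rough sketch, not a definitive solution: the names ACRONYM_IN_PARENS and find_acronyms are made up for illustration, and I'm assuming you still want to allow periods, as in the character class from your original pattern.

```python
import re

# Hypothetical helper, not part of the code above. The named group sits
# inside the escaped parentheses, so it captures only the acronym text.
# The character class [a-zA-Z.] is borrowed from the question's pattern
# so that dotted forms such as (A.D.F.) are accepted as well.
ACRONYM_IN_PARENS = re.compile(r"\((?P<acronym>[A-Z][a-zA-Z.]*[A-Z]\.?)\)")


def find_acronyms(text):
    """Return every acronym that appears inside parentheses in text."""
    return [m.group("acronym") for m in ACRONYM_IN_PARENS.finditer(text)]


print(find_acronyms("this is (ADF) a test"))               # ['ADF']
print(find_acronyms("\nADF\n"))                            # []
print(find_acronyms("(A.D.F.) and (AdF), but not (Adf)"))  # ['A.D.F.', 'AdF']
```

Since \( and \) are literal parentheses, a bare ADF on its own line cannot match here. re.finditer walks over all non-overlapping matches, whereas re.search stops at the first one.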

Upvotes: 1
