Tambet Tamm

Reputation: 207

Python regex - match names with legal forms

I am trying to match names and legal forms. At the moment I have this:

[a-zA-Z]*\,\s*[[A-Z]\.]*

I need to match with these examples:

At the moment I am able to match only with "Bank, A." and "BANK, A.".

How can I change the regex so that it also matches the following legal term abbreviations?

Upvotes: 1

Views: 92

Answers (2)

The fourth bird

Reputation: 163577

You could repeat the A-Z part followed by a dot 1+ times, and match an optional A-Z at the end to also match A.B:

^[a-zA-Z]+,\s(?:[A-Z]\.)+[A-Z]?$
  • ^ Start of string
  • [a-zA-Z]+ Match 1+ a-zA-Z
  • ,\s Match a comma and a whitespace character
  • (?: Non capturing group
    • [A-Z]\. Match A-Z and a dot
  • )+ Close non capturing group and repeat 1+ times to match A. or A.B.
  • [A-Z]? Match optional A-Z
  • $ End of string
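A minimal Python sketch of how the anchored pattern above could be applied. The helper name `is_name_with_legal_form` and most of the sample strings are assumptions made for illustration; only "Bank, A." and "BANK, A." come from the question.

```python
import re

# Anchored pattern from the answer: the whole string has to match.
PATTERN = re.compile(r"^[a-zA-Z]+,\s(?:[A-Z]\.)+[A-Z]?$")


def is_name_with_legal_form(text):
    """Return True if text looks like 'Name, A.' / 'Name, A.B' / 'Name, A.B.'."""
    return PATTERN.match(text) is not None


# Illustrative inputs (hypothetical apart from the first two).
samples = ["Bank, A.", "BANK, A.", "Bank, A.B", "Bank, A.B.", "Bank, AB", "Bank A."]
results = {s: is_name_with_legal_form(s) for s in samples}

for sample, ok in results.items():
    print(f"{sample!r}: {ok}")
```

"Bank, AB" is rejected because at least one letter-plus-dot pair is required, and "Bank A." is rejected because the comma is missing.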


Or use a word boundary \b at the start and assert with (?!\S) that no non-whitespace character follows at the end:

\b[a-zA-Z]+,\s(?:[A-Z]\.)+[A-Z]?(?!\S)
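As a rough sketch, the unanchored variant can pull matches out of longer text with `re.findall`. The sentence below is made up for illustration and is not from the question.

```python
import re

# Unanchored pattern from the answer: \b at the start, (?!\S) at the end.
PATTERN = re.compile(r"\b[a-zA-Z]+,\s(?:[A-Z]\.)+[A-Z]?(?!\S)")

# Hypothetical input text.
text = "Contracts were signed by Bank, A.B. and BANK, A. yesterday."

# The pattern has no capturing groups, so findall returns the full matches.
matches = PATTERN.findall(text)
print(matches)

# (?!\S) rejects a candidate that runs straight into other characters.
rejected = PATTERN.search("Bank, A.x")
print(rejected)
```

Because (?!\S) only allows whitespace or the end of the string after the match, a candidate directly followed by punctuation such as a comma would not match either.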


Upvotes: 2

Fourier

Reputation: 2993

If the input always follows this pattern, you can try:

\w+\,\s[A-Z\.]+

See this at work at: https://regex101.com/r/cw0KW3/1
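A short sketch of what this looser pattern accepts, using `re.fullmatch`. The sample strings are assumptions chosen to show the edge cases: `[A-Z\.]+` allows any mix of capitals and dots, and `\w` also covers digits and underscores.

```python
import re

# Pattern from this answer; the character class [A-Z\.] accepts capitals and dots in any order.
LOOSE = re.compile(r"\w+\,\s[A-Z\.]+")

# Illustrative inputs (hypothetical apart from "Bank, A.").
samples = ["Bank, A.", "Bank, A.B", "Bank, AB", "Bank, ...", "Bank_1, A.", "bank, a."]
loose = {s: LOOSE.fullmatch(s) is not None for s in samples}

for sample, ok in loose.items():
    print(f"{sample!r}: {ok}")
```

If inputs such as "Bank, AB" or "Bank, ..." should be rejected, a stricter pattern that pairs each letter with a dot is probably the safer choice.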

Upvotes: 2
