amphibient

Reputation: 31212

Python RegEx capturing first word after pattern

Possible strings are:

  1. public class MyClass extends ParentClass {

or

  2. public class MyClass throws SomeException {

or just

  3. public class MyClass {

I am using the following pattern to always capture MyClass:

ptrn = "((public|private|protected)\s+(.*)\s*[class|interface]\s+(\w+))"

But when I do

regex = re.search(ptrn, text)

className = regex.group(4) 

For strings 1 and 2 I get ParentClass and SomeException respectively; only for string 3 do I get MyClass.

What is wrong with my regex pattern and how do I fix it?

Upvotes: 1

Views: 495

Answers (3)

nfazzio

Reputation: 498

This works:

import re

strings = ("public class MyClass extends ParentClass {", "public class MyClass throws SomeException {", "public class MyClass {")
pattern = r"((public|private|protected)\s+(class|interface)\s+(\w+))"

for string in strings:
    # Group 1 is the whole match (outer parentheses); group 4 is the class name.
    print(re.search(pattern, string).group(4))

Upvotes: 1

Mara Ormston

Reputation: 1856

I don't know Python, but I do know regex fairly well. What you are looking for is something more like: (public|private|protected)\s+(class|interface)\s+(\w+)

I don't know which group that would be in Python, but in most other languages it would be group 3 (0 would be the whole match, 1 would be public, private or protected, 2 would be class or interface, and 3 would be your class name).
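For what it's worth, Python's re module numbers groups the same way, so the class name should land in group 3 there as well. A minimal sketch to check this against the three strings from the question; the helper name declared_name and the samples list are made up for illustration:

```python
import re

# Pattern from this answer, written as a raw string so the backslashes
# reach the regex engine unchanged.
PATTERN = re.compile(r"(public|private|protected)\s+(class|interface)\s+(\w+)")


def declared_name(line):
    """Return the class or interface name declared on `line`, or None."""
    match = PATTERN.search(line)
    # group(0) is the whole match, group(1) the modifier,
    # group(2) the keyword, group(3) the declared name.
    return match.group(3) if match else None


samples = [
    "public class MyClass extends ParentClass {",
    "public class MyClass throws SomeException {",
    "public class MyClass {",
]

names = [declared_name(s) for s in samples]
print(names)  # ['MyClass', 'MyClass', 'MyClass']
```

Returning None when nothing matches avoids the AttributeError you would get from calling .group() on a failed re.search.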

Upvotes: 3

Explosion Pills

Reputation: 191729

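[class|interface] is a character class; it matches any single one of the characters between the brackets, not the word class or interface. Combined with the greedy (.*), that lets the match run on to the last such character that is followed by whitespace and a word (for example the s in extends), so group 4 captures the word after it. Instead, you probably want to use a group with alternation: (class|interface).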

http://rubular.com/r/Jc6o3SAhi3
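A small Python sketch of the difference, using the first string from the question. The variable names broken and fixed are only labels for this example; broken is the question's pattern and fixed swaps the character class for a group:

```python
import re

text = "public class MyClass extends ParentClass {"

# A character class matches one character at a time from its set.
single_chars = re.findall(r"[class|interface]", "class")
print(single_chars)  # ['c', 'l', 'a', 's', 's']

# Pattern from the question (character class) and the corrected version.
broken = r"((public|private|protected)\s+(.*)\s*[class|interface]\s+(\w+))"
fixed = r"((public|private|protected)\s+(class|interface)\s+(\w+))"

broken_match = re.search(broken, text)
fixed_match = re.search(fixed, text)

# Greedy (.*) swallows everything up to the last character from the set
# that is followed by whitespace and a word: the final "s" of "extends".
print(broken_match.group(3))  # class MyClass extend
print(broken_match.group(4))  # ParentClass
print(fixed_match.group(4))   # MyClass
```

With the alternation group there is nothing greedy between the modifier and the keyword, so the name right after class or interface is what gets captured.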

Upvotes: 2
