Regex to fetch the correct java class name

Question

I want to scan the content of a Java file (e.g. TagMatchingInterface.java) and fetch the class name (TagMatchingInterface) via regex, but my regex matches the wrong name because the keywords (class/interface/enum) also appear in the comment:

/**
 *
 * @author XXXX
 * Introduction: A common interface that judges all kinds of algorithm tags.
 * some other comment
 */
public class TagMatchingInterface 
{
  // content
  public class InnerClazz{
    // content
  }
}

Here is my pattern:

public Pattern CLASS_PATTERN = Pattern.compile("(?:public\\s)?(?:.*\\s)?(class|interface|enum)\\s+([$_a-zA-Z][$_a-zA-Z0-9]*)");
....
Matcher matcher = CLASS_PATTERN.matcher(content);
if (matcher.find()) {
   System.out.println(matcher.group(2));
}

Any idea what is wrong with my regex?

Ro Yo Mi · Accepted Answer

Description

(?<=\n|\A)(?:public\s)?(class|interface|enum)\s([^\n\s]*)


This regex does the following:

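require the match to begin at the start of a line or of the string, so keywords in the middle of a comment line are skipped
allow the declaration to start with public or not
require the keyword class, interface, or enum
capture the name that follows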

Note: I recommend using the global and case-insensitive flags.
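
As a rough sketch of how this pattern might be used from Java: the class name ClassNameFinder and the method findClassName are made up for illustration, and the backslashes are doubled because the pattern sits inside a Java string literal. Java has no global flag; find() simply returns the first match, which is all that is needed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassNameFinder {
    // The pattern from the description above, escaped for a Java string literal.
    private static final Pattern CLASS_PATTERN =
            Pattern.compile("(?<=\\n|\\A)(?:public\\s)?(class|interface|enum)\\s([^\\n\\s]*)");

    /** Returns the name of the first type declared at the start of a line, or null. */
    static String findClassName(String content) {
        Matcher matcher = CLASS_PATTERN.matcher(content);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        String content =
                "/**\n"
              + " * Introduction: A common interface that judges all kinds of algorithm tags.\n"
              + " */\n"
              + "public class TagMatchingInterface \n"
              + "{\n"
              + "  public class InnerClazz{\n"
              + "  }\n"
              + "}\n";
        // The word "interface" in the comment and the indented inner class are both
        // skipped, because neither sits directly at the start of a line.
        System.out.println(findClassName(content)); // prints TagMatchingInterface
    }
}
```

Because the match is anchored to the start of a line, an indented or otherwise prefixed declaration (for example "final class Foo") would not be found by the pattern as written.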

Example

Live Example

https://regex101.com/r/vR0iK3/1

Sample Text

/**
 *
 * @author XXXX
 * Introduction: A common interface that judges all kinds of algorithm tags.
 * some other comment
 */
public class TagMatchingInterface 
{
  // content
  public class InnerClazz{
    // content
  }
}

Sample Matches

[0][0] = public class TagMatchingInterface
[0][1] = class
[0][2] = TagMatchingInterface

Capture groups:

group 0 gets the entire match
group 1 gets the keyword (class, interface, or enum)
group 2 gets the name
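
To illustrate the capture groups, here is a small sketch that collects groups 1 and 2 for every match in a string. The class TypeDeclarationScanner and its findAll method are hypothetical names chosen for this example; looping over find() plays the role of the global flag in Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TypeDeclarationScanner {
    private static final Pattern CLASS_PATTERN =
            Pattern.compile("(?<=\\n|\\A)(?:public\\s)?(class|interface|enum)\\s([^\\n\\s]*)");

    /** Returns "keyword name" for each declaration that starts at the beginning of a line. */
    static List<String> findAll(String content) {
        List<String> found = new ArrayList<>();
        Matcher matcher = CLASS_PATTERN.matcher(content);
        while (matcher.find()) {
            // group(1) is the keyword, group(2) is the name that follows it.
            found.add(matcher.group(1) + " " + matcher.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String content = "interface Foo {}\nenum Bar { A }\n";
        System.out.println(findAll(content)); // prints [interface Foo, enum Bar]
    }
}
```

Note that group 2 is defined as "anything up to the next whitespace", so a brace written directly after the name (as in "class Foo{") would end up inside the captured name with the pattern as given.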

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  (?<=                     look behind to see if there is:
----------------------------------------------------------------------
    \n                       '\n' (newline)
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    \A                       start of the string
----------------------------------------------------------------------
  )                        end of look-behind
----------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
    public                   'public'
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  )?                       end of grouping
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    class                    'class'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    interface                'interface'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    enum                     'enum'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    [^\n\s]*                 any character except: '\n' (newline),
                             whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
