user2444327

Reputation: 83

Extracting java main class name in python

I have a string in a Python script that contains some Java code.
How can I extract the name of the Java main class from it, so that I can execute it using subprocess?
I think it can be achieved with a regex, but I don't know how.

Sample:

a = """
import java.util.Scanner;
class sample{}
class second
{
    static boolean check_prime(int a)
    {
        int c=0;
        for (int i=1;i<=a; i++) {
            if(a%i==0)
                c++;
        }
        if(c == 2)
            return true;
        else
            return false;
    }
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.println("Enter two numbers");
        int a = in.nextInt();
        int b = in.nextInt();
        if(check_prime(a) && check_prime(b))
        {
            if(b-a==2 || a-b==2)
                System.out.println("They are twin primes");
            else
                System.out.println("They are not twin primes");
        }
        else
            System.out.println("They might not be prime numbers");
    }
}
"""

Upvotes: 2

Views: 2609

Answers (5)

Remi Guan

Reputation: 22292

As I said in the comments, use re.findall() like this:

import re
re.findall(r'class (\w*)', a)

As its name suggests, findall() finds all of the class names. \w matches only word characters (letters, digits and the underscore), so it works better than .* when the brace follows the name directly, as in class MyClass{.
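To illustrate the difference between \w and .* mentioned above, here is a minimal sketch. The one-line input and the variable names are made up for the example.

```python
import re

# Hypothetical one-line input where the brace follows the class name directly.
line = "class MyClass{ int x; }"

with_word = re.findall(r'class (\w*)', line)  # stops at the first non-word character
with_dot = re.findall(r'class (.*)', line)    # runs on to the end of the line

print(with_word)  # ['MyClass']
print(with_dot)   # ['MyClass{ int x; }']
```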


To find the main class, split the source at each class declaration and check each piece for a main method:

for i in re.split(r'\nclass ', a)[1:]:                    # each piece starts with a class name, followed by that class's code
    if re.search(r'\n\s*public static void main', i):     # does this class declare 'public static void main'?
        print(re.search(r'(\w*)', i).group(1))            # print the class name at the start of the piece

A simpler way is a one-line list comprehension:

[re.search(r'(\w*)', i).group(1) for i in re.split(r'\nclass ', a) if re.search(r'\n\s*public static void main', i)]

Upvotes: 0

spalac24

Reputation: 1116

Using only regex is hardly ever going to work. As a basic example of why not, consider this:

public class A {
     public static void ImDoingThisToMessYouUp () {
          String s = "public static void main (String[] args) {}";
     }
}

public class B {
      public static void main (String[] args) {}
}

You get the idea... A regex can always be fooled into matching something that isn't really what you are looking for. You must rely on a proper parsing library instead.
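The problem can be reproduced with a short sketch. The helper naive_main_classes below is hypothetical and is not taken from any of the answers. It simply reports every class whose text contains something that looks like a main signature, which is roughly what a regex-only approach ends up doing.

```python
import re

java_source = '''
public class A {
     public static void ImDoingThisToMessYouUp () {
          String s = "public static void main (String[] args) {}";
     }
}

public class B {
      public static void main (String[] args) {}
}
'''

def naive_main_classes(source):
    """Hypothetical helper: classes whose text mentions a main signature."""
    decls = list(re.finditer(r'\bclass\s+(\w+)', source))
    names = []
    for i, decl in enumerate(decls):
        # The "body" is everything up to the next class declaration (or the end).
        end = decls[i + 1].start() if i + 1 < len(decls) else len(source)
        body = source[decl.end():end]
        if re.search(r'public\s+static\s+void\s+main', body):
            names.append(decl.group(1))
    return names

print(naive_main_classes(java_source))  # ['A', 'B']
```

Only B has a real main method, but A is reported as well because the regex cannot tell a string literal from code.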

I'd go with J.F. Sebastian's answer.

Upvotes: 1

Brent C

Reputation: 833

An approximate solution to the problem is possible with regular expressions, as you guessed. However, there are a few tricks to keep in mind:

  1. A class name is not necessarily followed by whitespace, since MyClass{ is legal and common
  2. A type parameter can be provided after the class name, such as MyClass<T>, and the name of the compiled .class file is not affected by this type parameter
  3. A file may have more than one top-level class, but at most one of them may be declared public, and that public class must have the same name as the file
  4. The public class that has the same name as the file may have inner classes (which may even be public), but these necessarily come after the outer class declaration.

These tips lead to searching for the first occurrence of the phrase public class, capturing the next run of characters, then looking for whitespace, a { or < character.

This is what I came up with (may be a bit ugly): public\s*(?:abstract?)?\s*(?:static?)?\s*(?:final?)?\s*(?:strictfp?)?\s*class\s*(\w.*)\s*,?<.*$
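As a rough illustration of the search described above, here is a minimal Python sketch. It is not the pattern from this answer: the name PUBLIC_CLASS, the helper public_class_name and the modifier list are assumptions, and it assumes that public comes before any other modifier.

```python
import re

# Assumed pattern: "public", optional modifiers, "class", then the name.
# \w+ stops at whitespace, "{" or "<", which covers tips 1 and 2.
PUBLIC_CLASS = re.compile(
    r'\bpublic\s+(?:(?:abstract|final|strictfp)\s+)*class\s+(\w+)'
)

def public_class_name(source):
    """Hypothetical helper: name of the first public class, or None."""
    match = PUBLIC_CLASS.search(source)
    return match.group(1) if match else None

print(public_class_name("public class MyClass{ }"))                # MyClass
print(public_class_name("public final class Box<T> extends B {"))  # Box
print(public_class_name("class second { }"))                       # None
```

Like any regex approach it is approximate: a comment or string literal containing public class would fool it. It also returns None for the sample in the question, which declares no public class.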

Upvotes: 0

jfs

Reputation: 414585

A main class is a class that contains the public static void main method.

If it is possible in your environment, you could use a library that can parse Java source code, such as plyj or javalang:

#!/usr/bin/env python
import javalang # $ pip install javalang

# java_source is the string that holds the Java code (the variable a in the question)
tree = javalang.parse.parse(java_source)
name = next(klass.name for klass in tree.types
            if isinstance(klass, javalang.tree.ClassDeclaration)
            for m in klass.methods
            if m.name == 'main' and m.modifiers.issuperset({'public', 'static'}))
# -> 'second'

If there is a package declaration at the top of the Java source (e.g., package your_package;), so that the full class name is your_package.second, then you could get the package name as tree.package.name.
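Since the goal in the question is to run the class with subprocess, here is a minimal sketch of how the class name and the optional package name could be combined into command lines. The helper build_java_commands, the build directory and the choice of source file name are all hypothetical. The sketch assumes javac and java are on the PATH and only builds the argument lists without executing anything.

```python
import os

def build_java_commands(class_name, package=None, workdir="."):
    """Hypothetical helper: return (compile_cmd, run_cmd) argument lists."""
    # Fully qualified name, e.g. "your_package.second" when a package is declared.
    qualified = f"{package}.{class_name}" if package else class_name
    # Assumed file name; javac requires it to match a public class if there is one.
    source_file = os.path.join(workdir, class_name + ".java")
    # -d tells javac where to put the class files (in package subdirectories).
    compile_cmd = ["javac", "-d", workdir, source_file]
    # -cp points the JVM at the directory that holds the compiled classes.
    run_cmd = ["java", "-cp", workdir, qualified]
    return compile_cmd, run_cmd

compile_cmd, run_cmd = build_java_commands(
    "second", package="your_package", workdir="build")
print(run_cmd)  # ['java', '-cp', 'build', 'your_package.second']
```

After writing the Java string to the source file, each list could be passed to subprocess.run(cmd, check=True).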

Or you could use a parser generator such as grako and specify a subset of the Java grammar that is enough to get the class name in your case. If the input is highly regular, you could try a regex, but expect it to fail whenever your assumptions about the structure of the code are wrong.

Upvotes: 2

ergonaut

Reputation: 7057

Here's a crude way:

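import re

words = a.split()                            # split the source on whitespace
name = words[words.index("class") + 1]       # the word right after the first "class"
javaclass = re.sub(r"\{.*$", "", name)       # drop "{" and anything after it
print(javaclass)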

...which essentially takes all the words and finds the first word after the first occurrence of "class". It also removes "{" and anything after it, in case you have a situation like

class MyClass{

However, you would need to do a lot more if you have multiple classes in a file, since this only ever returns the first class name.

Upvotes: 0
