Kawu

Reputation: 14003

Python regex for Java package names

I'm having trouble validating Java package names in Python. Here's the code:

    import re

    packageName = "com.domain.lala" # valid, not rejected -> correct
    #packageName = ".com.domain.lala" # invalid, rejected -> correct
    #packageName = "com..domain.lala" # invalid, not rejected -> incorrect
    #packageName = "com.domain.lala." # invalid, not rejected -> incorrect

    matchObject = re.match(r"([a-z_]{1}[a-z0-9_]*(\.[a-z_]{1}[a-z0-9_]*)*)",
                           packageName)

    if matchObject is not None:
        print(packageName + " is a package name!")
    else:
        print(packageName + " is *not* a package name!")
        Utilities.show_error("Invalid Package Name", "Invalid package name " + packageName + "!", "Ok", "", "")

Package names must start with a lowercase letter or underscore, and each dot must be followed by at least one lowercase letter or underscore. All other characters can be lowercase letters, digits, or underscores. Runs of dots are not allowed, and the name may not start or end with a dot.

How do I solve this?

Upvotes: 6

Views: 5248

Answers (5)

rishabhmhjn

Reputation: 1079

The following pattern worked well for me:

/^[a-z][a-z0-9_]*(\.[a-z0-9_]+)+[0-9a-z_]$/i;

The results can be found in this gist.

[✔] me.unfollowers.droid
[✔] me_.unfollowers.droid
[✔] me._unfollowers.droid
[✔] me.unfo11llowers.droid
[✔] me11.unfollowers.droid
[✔] m11e.unfollowers.droid
[✗] 1me.unfollowers.droid
[✔] me.unfollowers23.droid
[✔] me.unfollowers.droid23d
[✔] me.unfollowers_.droid
[✔] me.unfollowers._droid
[✔] me.unfollowers_._droid
[✔] me.unfollowers.droid_
[✔] me.unfollowers.droid32
[✗] me.unfollowers.droid/
[✗] me:.unfollowers.droid
[✗] :me.unfollowers.droid
[✗] me.unfollowers.dro;id
[✗] me.unfollowe^rs.droid
[✗] me.unfollowers.droid.
[✗] me.unfollowers..droid
[✗] me.unfollowers.droid._
[✔] me.unfollowers.11212
[✔] me.1.unfollowers.11212
[✗] me..unfollowers.11212
[✗] abc
[✗] abc.
[✗] .abc
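Since the question is about Python, here is a rough port of the pattern above, checked against a handful of entries from the list. This is only a sketch: `PATTERN`, `matches` and `expected` are names made up for the example, and `re.IGNORECASE` is assumed to be an adequate stand-in for the `/i` flag.

```python
import re

# Python port of the pattern above; re.IGNORECASE stands in for the /i flag.
PATTERN = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z0-9_]+)+[0-9a-z_]$", re.IGNORECASE)


def matches(name):
    """Return True if the name matches the ported pattern."""
    return PATTERN.match(name) is not None


# A few entries from the result list, with the verdicts reported there.
expected = {
    "me.unfollowers.droid": True,
    "1me.unfollowers.droid": False,
    "me.unfollowers..droid": False,
    "me.unfollowers.droid.": False,
    "me.1.unfollowers.11212": True,
    "abc": False,
}

for name, verdict in expected.items():
    print(name, matches(name), verdict)
```

Note that this pattern is looser than the rule in the question in one respect (segments after the first may start with a digit, as `me.1.unfollowers.11212` shows) and stricter in others: it requires at least one dot, and the last segment needs at least two characters, so `a.b` is rejected.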

Upvotes: 2

Guillaume Perrot

Reputation: 4308

Uppercase letters are in fact allowed in Java package names. They are discouraged by convention, but they work.

The regex (with the backslash doubled, as it would be inside a Java string literal) should be:

^([a-zA-Z_]{1}[a-zA-Z0-9_]*(\\.[a-zA-Z_]{1}[a-zA-Z0-9_]*)*)?$
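For the Python side of the question, the same pattern might be used like this. It is a minimal sketch: `MIXED_CASE_RE` and `is_package_name_any_case` are hypothetical names, and the pattern is written as a raw string with a single backslash before the dot.

```python
import re

# The pattern from this answer as a Python raw string (single backslash).
MIXED_CASE_RE = re.compile(
    r"^([a-zA-Z_]{1}[a-zA-Z0-9_]*(\.[a-zA-Z_]{1}[a-zA-Z0-9_]*)*)?$"
)


def is_package_name_any_case(name):
    """Return True if the name matches the mixed-case pattern."""
    return MIXED_CASE_RE.match(name) is not None


for name in ["com.domain.lala", "Com.Domain.Lala", "com..domain.lala", ""]:
    print(repr(name), is_package_name_any_case(name))
```

Because the whole group is followed by `?`, the empty string also matches. Drop the `?` if an empty name should be rejected.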

Upvotes: 3

Gopi

Reputation: 10293

You need to add the start-of-line and end-of-line anchors, so the regex should look like this:

^([a-z_]{1}[a-z0-9_]*(\.[a-z_]{1}[a-z0-9_]*)*)$
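Applied to the four cases from the question, the anchored pattern behaves as intended. A small sketch, where `PACKAGE_RE` and `is_package_name` are names chosen for the example:

```python
import re

# The anchored pattern from this answer, written as a Python raw string.
PACKAGE_RE = re.compile(r"^([a-z_]{1}[a-z0-9_]*(\.[a-z_]{1}[a-z0-9_]*)*)$")


def is_package_name(name):
    """Return True if the name matches the anchored pattern."""
    return PACKAGE_RE.match(name) is not None


cases = [
    "com.domain.lala",    # valid
    ".com.domain.lala",   # invalid: leading dot
    "com..domain.lala",   # invalid: run of dots
    "com.domain.lala.",   # invalid: trailing dot
]

for name in cases:
    print(name, is_package_name(name))
```

Only the first case is accepted; the other three are now rejected.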

Upvotes: 2

Humphrey Bogart

Reputation: 7613

You can parse the string instead:

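import re

# Assumed helper (not in the original answer): a segment must start with a
# lowercase letter or underscore, then lowercase letters, digits or underscores.
def valid_java_package_node(node):
    return re.match(r"[a-z_][a-z0-9_]*\Z", node) is not None

def valid_java_package_name(string):
    # split('.') never returns an empty list; empty segments (from leading,
    # trailing or doubled dots) are rejected by the node check instead.
    tree = string.split('.')

    for node in tree:
        if not valid_java_package_node(node):
            return False

    return True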

Upvotes: 0

interjay

Reputation: 110146

Add $ at the end of the regex to force matching the full string. Right now it matches only a prefix of the string, so it incorrectly accepts invalid names that begin with a valid package name and have garbage added at the end.
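To see the partial match directly, it helps to print what `re.match` actually consumed. A minimal sketch, with `UNANCHORED`, `matched_prefix` and `is_package_name` as illustrative names:

```python
import re

# The question's pattern, unchanged: no end anchor.
UNANCHORED = r"([a-z_]{1}[a-z0-9_]*(\.[a-z_]{1}[a-z0-9_]*)*)"


def matched_prefix(name):
    """Return the part of name that re.match consumed, or None if nothing matched."""
    m = re.match(UNANCHORED, name)
    return m.group(0) if m else None


def is_package_name(name):
    """Same pattern with $ appended, so the whole string has to match."""
    return re.match(UNANCHORED + "$", name) is not None


print(matched_prefix("com..domain.lala"))   # com
print(matched_prefix("com.domain.lala."))   # com.domain.lala
print(matched_prefix(".com.domain.lala"))   # None
```

One caveat: in Python, `$` also matches just before a trailing newline, so `"com.domain.lala\n"` would still pass. If that matters, `re.fullmatch` (Python 3.4+) or `\Z` in place of `$` should avoid it.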

Upvotes: 4
