Mubashir Ali

Reputation: 559

Regular expression code in Java

I want to skip any URL that ends with an Office file format or PDF extension. Here is my code:

String Url ="http://chemistry.csudh.edu/faculty/jim/aromaticity.ppt";

        if (!Url.matches(".*(doc|dot|docx|docm|dotx|dotm)")
                || !Url.matches(".*ppt|pot|pps")
                || !Url.matches(".*xls|xlt|xlm")
                || !Url.matches(".*pdf"))
            System.out.print(Url);
        else
            System.out.print("true");

I want to know what is wrong with this code fragment: it prints the URL every time, but I want to skip any URL that ends in one of the formats above.

Upvotes: 2

Views: 125

Answers (3)

Mureinik

Reputation: 311498

Your condition is faulty, since you're using || instead of &&. Consider the following: a URL that ends in .pdf cannot also end in .doc, and vice versa, so at least one of the negated tests is always true and the whole condition always evaluates to true. Logically, you want to test that the URL does not match any of the document formats, which implies using &&:

String Url ="http://chemistry.csudh.edu/faculty/jim/aromaticity.ppt";

// Each alternation is grouped so that ".*" applies to every extension.
if (!Url.matches(".*(doc|dot|docx|docm|dotx|dotm)")
        && !Url.matches(".*(ppt|pot|pps)")
        && !Url.matches(".*(xls|xlt|xlm)")
        && !Url.matches(".*pdf")) {
    System.out.print(Url);
} else {
    System.out.print("true");
}
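
To make the difference concrete, here is a minimal sketch that evaluates both forms of the condition on a few URLs. The class name ConditionDemo, the helper names keepWithOr and keepWithAnd, and the example.com URLs are made up for illustration, and only two of the four patterns are used to keep it short.

```java
public class ConditionDemo {
    // Hypothetical helper: the original (faulty) form, joined with ||.
    static boolean keepWithOr(String url) {
        return !url.matches(".*(ppt|pot|pps)") || !url.matches(".*pdf");
    }

    // Hypothetical helper: the corrected form, joined with &&.
    static boolean keepWithAnd(String url) {
        return !url.matches(".*(ppt|pot|pps)") && !url.matches(".*pdf");
    }

    public static void main(String[] args) {
        String ppt = "http://example.com/slides.ppt";
        String html = "http://example.com/index.html";

        // A .ppt URL does not end in pdf, so the || form still keeps it.
        System.out.println(keepWithOr(ppt));   // true
        // The && form rejects it because the first test fails.
        System.out.println(keepWithAnd(ppt));  // false
        // A URL with no document extension is kept by the && form.
        System.out.println(keepWithAnd(html)); // true
    }
}
```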

Upvotes: 1

betteroutthanin

Reputation: 7556

It's a logic error; you should change it to:

String Url ="http://chemistry.csudh.edu/faculty/jim/aromaticity.ppt";

    // Each alternation is grouped so that ".*" applies to every extension.
    if (!Url.matches(".*(doc|dot|docx|docm|dotx|dotm)")
            && !Url.matches(".*(ppt|pot|pps)")
            && !Url.matches(".*(xls|xlt|xlm)")
            && !Url.matches(".*pdf"))
        System.out.print(Url);
    else
        System.out.print("true");

Upvotes: 1

Rohit Jain

Reputation: 213271

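You are missing the parentheses in the second and third regexes. Without grouping, the alternation in ".*ppt|pot|pps" splits the pattern into three alternatives: ".*ppt", "pot" and "pps". Since matches() must match the entire string, that regex only matches URLs that end with ppt (or the literal strings "pot" and "pps"). A URL like abc.pot is not matched, so !Url.matches(".*ppt|pot|pps") is true for it. You should change it to: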

!Url.matches(".*(ppt|pot|pps)")

as in the first regex. Also, the condition should use && instead of ||.
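
The effect of the missing group can be checked with a small sketch like the one below. The class name AlternationDemo, the helper names, and the example.com URL are made up for illustration.

```java
public class AlternationDemo {
    // Ungrouped: the alternatives are ".*ppt", "pot" and "pps".
    static boolean ungrouped(String s) {
        return s.matches(".*ppt|pot|pps");
    }

    // Grouped: ".*" is followed by any one of the three extensions.
    static boolean grouped(String s) {
        return s.matches(".*(ppt|pot|pps)");
    }

    public static void main(String[] args) {
        String pot = "http://example.com/template.pot";

        System.out.println(ungrouped(pot));   // false: no alternative matches the whole URL
        System.out.println(grouped(pot));     // true
        System.out.println(ungrouped("pot")); // true: the bare alternative "pot" matches
    }
}
```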

By the way, why do you have four separate matches() invocations? Each call compiles its own regex, while you could do it all with a single regex. Just add all the extensions to the first regex's list:

if (!url.matches(".*(doc|dot|docx|docm|dotx|dotm|ppt|pot|pps|xls|xlt|xlm|pdf)"))

P.S.: Kindly follow the Java naming conventions. Variable names should start with a lowercase letter.
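
If the check runs for many URLs, one possible refinement is to compile the single regex once and reuse it. The following is only a sketch: the class name OfficeUrlFilter and the method isDocumentUrl are hypothetical, and the literal dot before the extension and the case-insensitive flag are assumptions that go beyond the original code.

```java
import java.util.regex.Pattern;

public class OfficeUrlFilter {
    // Compiled once and reused for every URL.
    // Assumptions not in the original: "\\." requires a literal dot before
    // the extension, and CASE_INSENSITIVE also accepts e.g. ".PDF".
    private static final Pattern DOCUMENT_URL = Pattern.compile(
            ".*\\.(doc|dot|docx|docm|dotx|dotm|ppt|pot|pps|xls|xlt|xlm|pdf)",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the URL ends in one of the listed extensions.
    static boolean isDocumentUrl(String url) {
        return DOCUMENT_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String url = "http://chemistry.csudh.edu/faculty/jim/aromaticity.ppt";
        if (!isDocumentUrl(url)) {
            System.out.println(url);
        } else {
            System.out.println("skipped");
        }
    }
}
```

With the URL from the question, this takes the else branch, since the URL ends in .ppt.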

Upvotes: 2
