celsowm

Reputation: 404

Java Regex: How to detect a URL with a file extension

How can I create a regex to detect whether a URL string contains a file extension (.pdf, .jpeg, .asp, .cfm, ...)?

Valid (without extensions):

Invalid (with extensions):

Thanks, Celso

Upvotes: 5

Views: 6015

Answers (5)

elias

Reputation: 15480

If the following code returns true, the URL ends with a file extension:

urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$");

This assumes that a file extension is a dot followed by 2, 3, or 4 alphabetic characters.
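To see how that pattern behaves, here is a minimal sketch that wraps it in a method. The class name `TrailingExtensionCheck` and the method name `endsWithExtension` are made up for illustration, and the sample URLs are borrowed from the tests in another answer on this page.

```java
public class TrailingExtensionCheck {

    // Hypothetical helper: true if the whole string is printable characters
    // followed by a dot and 2 to 4 ASCII letters at the very end.
    public static boolean endsWithExtension(String urlString) {
        return urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$");
    }

    public static void main(String[] args) {
        System.out.println(endsWithExtension("http://www.yahoo.com/a.pdf")); // true
        System.out.println(endsWithExtension("http://www.yahoo.com/hello")); // false
        // A bare host name also matches, because ".com" looks like an extension.
        System.out.println(endsWithExtension("http://www.yahoo.com"));       // true
    }
}
```

Note the last case: with this assumption a top-level domain is indistinguishable from a file extension, so you may want to check only the path part of the URL.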

Upvotes: 0

limc

Reputation: 40168

How about this?

// assuming the file extension is either 3 or 4 characters long
public boolean hasFileExtension(String s) {
    return s.matches("^[\\w\\d\\:\\/\\.]+\\.\\w{3,4}(\\?[\\w\\W]*)?$");
}

@Test
public void testHasFileExtension() {
    assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.pdf"));
    assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.htm"));
    assertTrue("4-character extension", hasFileExtension("http://www.yahoo.com/a.html"));
    assertTrue("3-character extension with param", hasFileExtension("http://www.yahoo.com/a.pdf?p=1"));
    assertTrue("4-character extension with param", hasFileExtension("http://www.yahoo.com/a.html?p=1&p=2"));

    assertFalse("2-character extension", hasFileExtension("http://www.yahoo.com/a.co"));
    assertFalse("2-character extension with param", hasFileExtension("http://www.yahoo.com/a.co?p=1&p=2"));
    assertFalse("no extension", hasFileExtension("http://www.yahoo.com/hello"));
    assertFalse("no extension with param", hasFileExtension("http://www.yahoo.com/hello?p=1&p=2"));
    assertFalse("no extension with param ends with .htm", hasFileExtension("http://www.yahoo.com/hello?p=1&p=a.htm"));
}

Upvotes: 3

OscarRyz

Reputation: 199215

An alternative version without a regex, using the URI class:

import java.net.*;

class IsFile { 
  public static void main( String ... args ) throws Exception { 
    URI u = new URI( args[0] );
    for( String ext : new String[] {".png", ".pdf", ".jpg", ".html"  } ) { 
      if( u.getPath().endsWith( ext ) ) { 
        System.out.println("Yeap");
        break;
      }
    }
  }
}

Works with:

java IsFile "http://download.oracle.com/javase/6/docs/api/java/net/URI.html#getPath()"
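The reason this works even with a fragment (or a query string) is that getPath() returns only the path component. A small sketch to illustrate, assuming a hypothetical class `PathDemo` and helper `pathOf`; it uses `URI.create`, which parses like `new URI(...)` but throws an unchecked exception on bad input:

```java
import java.net.URI;

public class PathDemo {

    // Hypothetical helper: returns just the path part of a URL string.
    public static String pathOf(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        // The query string is not part of the path.
        System.out.println(pathOf("http://www.yahoo.com/a.pdf?p=1"));
        // prints /a.pdf

        // Neither is the fragment.
        System.out.println(pathOf(
            "http://download.oracle.com/javase/6/docs/api/java/net/URI.html#getPath()"));
        // prints /javase/6/docs/api/java/net/URI.html
    }
}
```

So an endsWith check on the path is not fooled by something like "?p=a.htm" at the end of the URL.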

Upvotes: 3

Amir Raminfar

Reputation: 34149

In Java, you are better off using String.endsWith(). It is faster and easier to read. Example:

"file.jpg".endsWith(".jpg") == true

Upvotes: 3

Duniyadnd

Reputation: 4043

Not a Java developer anymore, but you could define what you're looking for with the following regex:

"/\.(pdf|jpe{0,1}g|asp|docx{0,1}|xlsx{0,1}|cfm)$/i"

Not certain what the function would look like.
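For what it's worth, one way the function might look in Java is sketched below. Java has no /.../i literal syntax, so the slashes are dropped and the case-insensitive flag is passed to Pattern.compile. The class name `KnownExtensionCheck` and the method name `hasKnownExtension` are invented for this example.

```java
import java.util.regex.Pattern;

public class KnownExtensionCheck {

    // Same extension list as the regex above; CASE_INSENSITIVE replaces the trailing "i".
    private static final Pattern EXTENSION = Pattern.compile(
            "\\.(pdf|jpe{0,1}g|asp|docx{0,1}|xlsx{0,1}|cfm)$",
            Pattern.CASE_INSENSITIVE);

    // find() rather than matches(), because the pattern only describes the end of the string.
    public static boolean hasKnownExtension(String url) {
        return EXTENSION.matcher(url).find();
    }

    public static void main(String[] args) {
        System.out.println(hasKnownExtension("http://www.yahoo.com/a.PDF")); // true
        System.out.println(hasKnownExtension("http://www.yahoo.com/hello")); // false
    }
}
```

As written, this only looks at the end of the string, so a URL with a query string such as "a.pdf?p=1" would not be detected.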

Upvotes: 0
