Reputation: 404
How can I create a regex to detect whether a URL string contains a file extension (.pdf, .jpeg, .asp, .cfm, ...)?
Valid (without extensions):
Invalid (with extensions):
Thanks, Celso
Upvotes: 5
Views: 6015
Reputation: 15480
If the following expression returns true, the string ends with a file extension:
urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$");
This assumes that a file extension is a dot followed by 2, 3 or 4 alphabetic characters.
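To see what this pattern accepts, here is a minimal sketch that wraps it in a method. The class and method names are hypothetical and only for illustration. Note that String.matches() already requires the pattern to cover the whole string, so the trailing $ is redundant, and that a bare host name such as http://www.yahoo.com also matches because ".com" looks like an extension.

```java
// Hypothetical wrapper around the pattern above; names are illustrative only.
class GraphAlphaCheck {
    static boolean endsWithExtension(String urlString) {
        // \p{Graph}+ : one or more visible ASCII characters
        // \.         : a literal dot
        // \p{Alpha}{2,4} : 2 to 4 ASCII letters at the very end
        return urlString.matches("\\p{Graph}+\\.\\p{Alpha}{2,4}$");
    }

    public static void main(String[] args) {
        System.out.println(endsWithExtension("http://www.yahoo.com/a.pdf"));     // true
        System.out.println(endsWithExtension("http://www.yahoo.com/hello"));     // false
        System.out.println(endsWithExtension("http://www.yahoo.com/a.pdf?p=1")); // false: query string follows
        System.out.println(endsWithExtension("http://www.yahoo.com"));           // true: ".com" is taken as an extension
    }
}
```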
Upvotes: 0
Reputation: 40168
How about this?
// assuming the file extension is either 3 or 4 characters long
public boolean hasFileExtension(String s) {
    return s.matches("^[\\w\\d\\:\\/\\.]+\\.\\w{3,4}(\\?[\\w\\W]*)?$");
}

@Test
public void testHasFileExtension() {
    assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.pdf"));
    assertTrue("3-character extension", hasFileExtension("http://www.yahoo.com/a.htm"));
    assertTrue("4-character extension", hasFileExtension("http://www.yahoo.com/a.html"));
    assertTrue("3-character extension with param", hasFileExtension("http://www.yahoo.com/a.pdf?p=1"));
    assertTrue("4-character extension with param", hasFileExtension("http://www.yahoo.com/a.html?p=1&p=2"));
    assertFalse("2-character extension", hasFileExtension("http://www.yahoo.com/a.co"));
    assertFalse("2-character extension with param", hasFileExtension("http://www.yahoo.com/a.co?p=1&p=2"));
    assertFalse("no extension", hasFileExtension("http://www.yahoo.com/hello"));
    assertFalse("no extension with param", hasFileExtension("http://www.yahoo.com/hello?p=1&p=2"));
    assertFalse("no extension with param ends with .htm", hasFileExtension("http://www.yahoo.com/hello?p=1&p=a.htm"));
}
Upvotes: 3
Reputation: 199215
Alternative version without a regexp, using the URI class:
import java.net.*;
class IsFile {
    public static void main( String ... args ) throws Exception {
        URI u = new URI( args[0] );
        for( String ext : new String[] {".png", ".pdf", ".jpg", ".html" } ) {
            if( u.getPath().endsWith( ext ) ) {
                System.out.println("Yeap");
                break;
            }
        }
    }
}
Works with:
java IsFile "http://download.oracle.com/javase/6/docs/api/java/net/URI.html#getPath()"
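If you don't want to maintain a fixed list of extensions, the same URI idea can be generalized. The following is only a sketch under my own assumptions: the class and method names are hypothetical, "has an extension" is taken to mean that the last path segment contains a dot that is neither its first nor its last character, and malformed URLs are treated as having no extension.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; names and the definition of "extension" are assumptions.
class PathExtension {
    static boolean pathHasExtension(String url) {
        String path;
        try {
            // getPath() excludes the query string and the fragment.
            path = new URI(url).getPath();
        } catch (URISyntaxException e) {
            return false; // assumption: a malformed URL has no extension
        }
        if (path == null) {
            return false; // opaque URIs such as mailto: have no path
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // The dot must lie inside the last segment, after at least one
        // character of the file name, and be followed by at least one character.
        return dot > slash + 1 && dot < path.length() - 1;
    }

    public static void main(String[] args) {
        System.out.println(pathHasExtension("http://www.yahoo.com/a.pdf?p=1"));         // true
        System.out.println(pathHasExtension("http://www.yahoo.com/hello?p=1&p=a.htm")); // false
        System.out.println(pathHasExtension("http://www.yahoo.com"));                   // false
    }
}
```

Because only the path is inspected, the dots in the host name and in the query string are ignored.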
Upvotes: 3
Reputation: 34149
In Java, you are better off using String.endsWith(). It is faster and easier to read. Example:
"file.jpg".endsWith(".jpg") == true
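For several extensions, the check can be put in a loop. This is a minimal sketch; the class name, method name, and the extension list (taken from the question) are assumptions. It lower-cases the input so that ".PDF" is also caught, and it does not handle query strings.

```java
import java.util.Locale;

// Hypothetical helper; the extension list is only an example.
class EndsWithCheck {
    private static final String[] EXTENSIONS = {".pdf", ".jpeg", ".asp", ".cfm"};

    static boolean endsWithKnownExtension(String s) {
        // Normalize case so the comparison is case-insensitive.
        String lower = s.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/a.PDF"));     // true
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/hello"));     // false
        System.out.println(endsWithKnownExtension("http://www.yahoo.com/a.pdf?p=1")); // false: ends with the query string
    }
}
```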
Upvotes: 3
Reputation: 4043
I'm not a Java developer anymore, but you could define what you're looking for with the following regex:
"/\.(pdf|jpe{0,1}g|asp|docx{0,1}|xlsx{0,1}|cfm)$/i"
I'm not certain what the Java function would look like.
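One way the regex above might be written in Java is sketched below. The class and method names are hypothetical. The slash delimiters and the /i flag are not Java syntax, so the flag becomes Pattern.CASE_INSENSITIVE, and {0,1} is shortened to the equivalent ?. Matcher.find() is used because String.matches() would require the pattern to cover the entire string.

```java
import java.util.regex.Pattern;

// Hypothetical helper; a sketch of the regex above translated to Java.
class KnownExtensionMatcher {
    // Same alternation as the original regex, escaped for a Java string literal.
    private static final Pattern EXTENSION =
            Pattern.compile("\\.(pdf|jpe?g|asp|docx?|xlsx?|cfm)$", Pattern.CASE_INSENSITIVE);

    static boolean hasKnownExtension(String url) {
        // find() looks for the pattern anywhere; the $ pins it to the end.
        return EXTENSION.matcher(url).find();
    }

    public static void main(String[] args) {
        System.out.println(hasKnownExtension("http://www.yahoo.com/a.PDF")); // true
        System.out.println(hasKnownExtension("http://www.yahoo.com/hello")); // false
    }
}
```

As written, the pattern only looks at the end of the string, so a URL with a query string after the file name is not detected.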
Upvotes: 0