Sziro

Reputation: 1303

How to get (split) filenames from string in java?

I have a string that contains file names like:

"file1.txt file2.jpg tricky file name.txt other tricky filenames containing áéíőéáóó.gif"

How can I get the file names one by one? I am looking for the safest, most thorough method, preferably something from the Java standard library. There must already be a regular expression for this out there; I am counting on your experience.

Edit: expected results: "file1.txt", "file2.jpg", "tricky file name.txt", "other tricky filenames containing áéíőéáóó.gif"

Thanks for the help, Sziro

Upvotes: 1

Views: 152

Answers (4)

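    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String txt = "file1.txt file2.jpg tricky file name.txt other tricky filenames containing áéíőéáóó.gif";
    // Regex taken from enrico.bacis's answer: a non-space character, the shortest
    // possible run up to a dot, then the extension (one or more non-space characters).
    Pattern pattern = Pattern.compile("\\S.*?\\.\\S+");
    Matcher matcher = pattern.matcher(txt);
    while (matcher.find()) {
        System.out.println(matcher.group().trim());
    }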

Upvotes: 0

Miloš Ratković

Reputation: 455

The regular expression that enrico.bacis suggested, (\S.*?\.\S+), will not work if there are filenames with no characters before the ".", such as .project.

A pattern that handles this case would be:

(([^ .]+ +)*\S*\.\S+)

You can try it here.

A Java program that extracts the filenames could look like this:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String patternStr = "([^ .]+ +)*\\S*\\.\\S+";
    String input = "file1.txt .project file2.jpg tricky file name.txt other tricky filenames containing áéíoéáóó.gif";
    Pattern pattern = Pattern.compile(patternStr, Pattern.MULTILINE);
    Matcher matcher = pattern.matcher(input);
    // Print each match, one filename per line
    while (matcher.find()) {
        System.out.println(matcher.group());
    }

Upvotes: 1

user2696372

Reputation: 86

If there are spaces in the file names, it makes it trickier.

If you can assume there are no dots (.) in the file names other than the one before the extension, you can use the dot to find each individual record, as has been suggested.

If you can't assume that (e.g. my file.new something.txt), I'd suggest creating a list of acceptable extensions, e.g. .doc, .jpg, .pdf, etc.

I know the list may be long, so it's not ideal. Once you have it, you can look for these extensions and assume that whatever precedes each one is a valid filename.
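A minimal sketch of the extension-list idea, in case it helps. The class name `ExtensionSplitter`, the method `split`, and the particular extensions in the whitelist are all made up for illustration; you would replace the list with the extensions you actually expect. It also assumes names are separated by whitespace and that no file name begins with a dot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtensionSplitter {
    // Hypothetical whitelist of extensions (an assumption, not a complete list).
    // A name is the shortest run that starts at a non-space character and ends
    // in ".<known extension>" followed by whitespace or the end of the input.
    private static final Pattern FILE_NAME = Pattern.compile(
            "\\S.*?\\.(?:txt|jpg|gif|doc|pdf)(?=\\s|$)",
            Pattern.CASE_INSENSITIVE);

    // Returns the file names found in the input, in order of appearance.
    static List<String> split(String input) {
        List<String> names = new ArrayList<>();
        Matcher matcher = FILE_NAME.matcher(input);
        while (matcher.find()) {
            names.add(matcher.group());
        }
        return names;
    }

    public static void main(String[] args) {
        String input = "file1.txt my file.new something.txt tricky file name.txt";
        for (String name : split(input)) {
            System.out.println(name);
        }
    }
}
```

Because ".new" is not in the list, "my file.new something.txt" stays together as one name. The obvious weakness is the one mentioned above: any file whose extension is missing from the list gets glued onto the next name.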

Upvotes: 0

enrico.bacis

Reputation: 31494

If you want to use regular expressions, you can find all the occurrences of:

(\S.*?\.\S+)

(you can test it here)

Upvotes: 1
