user1167910

Reputation: 145

Sort files in numeric order

I made a program to combine all files in a folder together.

Here's part of my code:

File folder = new File("c:/some directory");
File[] listOfFiles = folder.listFiles();
for (File file : listOfFiles) {

    if (file.isFile()) {
        System.out.println(file.getName());
        // Pass the parent folder separately so that a path separator is inserted
        File f = new File(folder, file.getName());

However, I want the files in numeric order: job1.script, job2.script, ...

but I get job1.script, job10.script, job11.script; that is, 10, 11, 12, ... come before 2.

I am looking for efficient code that avoids this problem.

Upvotes: 2

Views: 2104

Answers (4)

skiwi

Reputation: 69259

Time to get rid of all the clumsy code and use Java 8! This answer also uses the Path class, which has been available since Java 7 but seems to be heavily improved in Java 8.

The code:

private void init() throws IOException {
    Path directory = Paths.get("C:\\Users\\Frank\\Downloads\\testjob");
    Files.list(directory)
            .filter(path -> Files.isRegularFile(path))
            .filter(path -> path.getFileName().toString().startsWith("job"))
            .filter(path -> path.getFileName().toString().endsWith(".script"))
            .sorted(Comparator.comparingInt(this::pathToInt))
            .map(path -> path.getFileName())
            .forEach(System.out::println);
}

private int pathToInt(final Path path) {
    return Integer.parseInt(path.getFileName()
            .toString()
            .replace("job", "")
            .replace(".script", "")
    );
}

The explanation of pathToInt:

  1. From a given Path, obtain the String representation of the file.
  2. Remove "job" and ".script".
  3. Try to parse the String as an Integer.

The explanation of init, the main method:

  1. Obtain a Path to the directory where the files are located.
  2. Obtain a lazily populated Stream of the Paths in the directory. Be aware: these Paths still include the directory!
  3. Keep files that are regular files.
  4. Keep files of which the last part of the Path, thus the filename (for example job1.script) starts with "job". Be aware that you need to first obtain the String representation of the Path before you can check it, else you will be checking if the whole Path starts with a directory called "job".
  5. Do the same for files ending with ".script".
  6. Now comes the fun part. Here we sort the files with a Comparator that compares the integers obtained by calling pathToInt on each Path. I am using a method reference here: the method comparingInt(ToIntFunction<? super T> keyExtractor) expects a function that maps a T, in this case a Path, to an int. That is exactly what pathToInt does, hence it can be used as a method reference.
  7. Then I map every Path to the Path only consisting of the filename.
  8. Lastly, for each element of the Stream<Path>, I call System.out.println(Path.toString()).

It may seem that this code could be written more simply, but I have purposefully kept it verbose. My design is to keep the full Path intact for as long as possible. Only the map step just before forEach violates that principle: it reduces each Path to its file name, so the full Path can no longer be processed at a later point.
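To keep the full Path available for later processing, the same pipeline can collect the sorted Paths into a List instead of mapping them to file names. The following is only a sketch under assumptions: the class name SortedJobPaths and the helper sortedJobs are hypothetical, and pathToInt is made static so the example is self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedJobPaths {
    // Same parsing as pathToInt above, made static for this sketch.
    static int pathToInt(Path path) {
        return Integer.parseInt(path.getFileName().toString()
                .replace("job", "")
                .replace(".script", ""));
    }

    // Hypothetical helper: returns the full Paths, sorted by job number.
    static List<Path> sortedJobs(Path directory) throws IOException {
        // try-with-resources closes the directory handle held by Files.list
        try (Stream<Path> paths = Files.list(directory)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().startsWith("job"))
                    .filter(p -> p.getFileName().toString().endsWith(".script"))
                    .sorted(Comparator.comparingInt(SortedJobPaths::pathToInt))
                    .collect(Collectors.toList());
        }
    }
}
```

Each element of the returned list still points at the actual file, so it can be opened, copied or printed afterwards.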

This code is also designed to be fail-fast: it expects the file names to be of the form job(\d+).script and will throw a NumberFormatException if that is not the case.
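The fail-fast behaviour can be seen in isolation with a small sketch. The class FailFastDemo, the helper jobNumber and the file name jobfinal.script are made up for illustration; the string handling is the same as in pathToInt.

```java
class FailFastDemo {
    // Same string handling as pathToInt, applied to a bare file name.
    static int jobNumber(String fileName) {
        return Integer.parseInt(fileName.replace("job", "").replace(".script", ""));
    }

    public static void main(String[] args) {
        System.out.println(jobNumber("job12.script")); // prints 12
        try {
            jobNumber("jobfinal.script"); // hypothetical non-conforming name
        } catch (NumberFormatException ex) {
            // "final" is not a number, so parsing is rejected immediately
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```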

Example output:

job1.script
job2.script
job10.script
job11.script

An arguably better alternative features the power of regular expressions:

private void init() throws IOException {
    Path directory = Paths.get("C:\\Users\\Frank\\Downloads\\testjob");
    Files.list(directory)
            .filter(path -> Files.isRegularFile(path))
            .filter(path -> path.getFileName().toString().matches("job\\d+.script"))
            .sorted(Comparator.comparingInt(this::pathToInt))
            .map(path -> path.getFileName())
            .forEach(System.out::println);
}

private int pathToInt(final Path path) {
    return Integer.parseInt(path.getFileName()
            .toString()
            .replaceAll("job(\\d+).script", "$1")
    );
}

Here I use the regular expression "job\\d+.script", which matches a string starting with "job", followed by one or more digits, followed by ".script". Strictly speaking, the unescaped . matches any single character; write \\. to match only a literal dot.
I use almost the same expression in the pathToInt method, but there I add a capturing group (the parentheses), and $1 in the replacement refers to the digits captured by that group.
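As a sketch of the same idea with a precompiled Pattern and an escaped dot: the class JobNamePattern and the helper jobNumber below are hypothetical and not part of the code above. Returning an OptionalInt is one possible design choice for names that do not match. Note that a name such as job42xscript would be accepted by the unescaped expression but is rejected here.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JobNamePattern {
    // \\. matches only a literal dot; the parentheses capture the digits as group 1.
    private static final Pattern JOB = Pattern.compile("job(\\d+)\\.script");

    // Hypothetical helper: the job number, or empty if the name does not match.
    static OptionalInt jobNumber(String fileName) {
        Matcher m = JOB.matcher(fileName);
        return m.matches()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(jobNumber("job42.script")); // prints OptionalInt[42]
        System.out.println(jobNumber("job42xscript")); // prints OptionalInt.empty
    }
}
```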

I will also provide a concise way to combine the contents of the files into one big file, as you have also asked in your question:

private void init() throws IOException {
    Path directory = Paths.get("C:\\Users\\Frank\\Downloads\\testjob");
    try (BufferedWriter writer = Files.newBufferedWriter(directory.resolve("masterjob.script"))) {
        Files.list(directory)
                .filter(path -> Files.isRegularFile(path))
                .filter(path -> path.getFileName().toString().matches("job\\d+.script"))
                .sorted(Comparator.comparingInt(this::pathToInt))
                .flatMap(this::wrappedLines)
                .forEach(string -> wrappedWrite(writer, string));
    }
}

private int pathToInt(final Path path) {
    return Integer.parseInt(path.getFileName()
            .toString()
            .replaceAll("job(\\d+).script", "$1")
    );
}

private Stream<String> wrappedLines(final Path path) {
    try {
        return Files.lines(path);
    } catch (IOException ex) {
        //swallow
        return null;
    }
}

private void wrappedWrite(final BufferedWriter writer, final String string) {
    try {
        writer.write(string);
        writer.newLine();
    } catch (IOException ex) {
        //swallow
    }
}

Please note that a lambda cannot throw a checked exception unless the functional interface it implements declares it, hence the necessity to write wrappers around the code that decide what to do with the errors. Swallowing exceptions is rarely a good idea; I am doing it here only for code simplicity.
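If swallowing is not acceptable, one common alternative is to rethrow the IOException wrapped in java.io.UncheckedIOException, which exists since Java 8. This is only a sketch: the class UncheckedWrappers and its method names are hypothetical replacements for wrappedLines and wrappedWrite.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class UncheckedWrappers {
    // Rethrows the checked IOException as an unchecked one instead of returning null.
    static Stream<String> lines(Path path) {
        try {
            return Files.lines(path);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Same idea for writing one line followed by a line separator.
    static void writeLine(BufferedWriter writer, String line) {
        try {
            writer.write(line);
            writer.newLine();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

In the pipeline these would be used as .flatMap(UncheckedWrappers::lines) and .forEach(line -> UncheckedWrappers.writeLine(writer, line)), so that a failing file aborts the whole operation instead of being skipped silently.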

The real big change here is that instead of printing out the names, I map every file to its contents and write those to a file.

Upvotes: 6

Alexis C.

Reputation: 93842

If your file names are always of the form jobNumber.script, you could sort the array by providing a custom comparator:

Arrays.sort(listOfFiles, new Comparator<File>(){
        @Override
        public int compare(File f1, File f2) {
            String s1 = f1.getName().substring(3, f1.getName().indexOf("."));
            String s2 = f2.getName().substring(3, f2.getName().indexOf("."));
            return Integer.valueOf(s1).compareTo(Integer.valueOf(s2));  
        }
});

public static void main(String[] args) throws Exception{
        File folder = new File(".");
        File[] listOfFiles = folder.listFiles(new FilenameFilter() {            
            @Override
            public boolean accept(File arg0, String arg1) {
                return arg1.endsWith(".script");
            }
        });
        System.out.println(Arrays.toString(listOfFiles));
        Arrays.sort(listOfFiles, new Comparator<File>(){
            @Override
            public int compare(File f1, File f2) {
                String s1 = f1.getName().substring(3, f1.getName().indexOf("."));
                String s2 = f2.getName().substring(3, f2.getName().indexOf("."));
                return Integer.valueOf(s1).compareTo(Integer.valueOf(s2));  
            }
        });
        System.out.println(Arrays.toString(listOfFiles));
    }

Prints:

[.\job1.script, .\job1444.script, .\job4.script, .\job452.script, .\job77.script]
[.\job1.script, .\job4.script, .\job77.script, .\job452.script, .\job1444.script]
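On Java 8 the same comparator can be written more compactly with Comparator.comparingInt. The sketch below assumes the same jobNumber.script naming; the class SortByJobNumber and the helper jobNumber are names made up for this example.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

class SortByJobNumber {
    // The digits between the "job" prefix (3 characters) and the first dot.
    static int jobNumber(File f) {
        String name = f.getName();
        return Integer.parseInt(name.substring(3, name.indexOf('.')));
    }

    public static void main(String[] args) {
        File[] files = {
            new File("job10.script"), new File("job2.script"), new File("job1.script")
        };
        Arrays.sort(files, Comparator.comparingInt(SortByJobNumber::jobNumber));
        System.out.println(Arrays.toString(files));
        // prints [job1.script, job2.script, job10.script]
    }
}
```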

Upvotes: 2

aliteralmind

Reputation: 20163

The easiest solution is to zero-pad all numbers lower than 10. Like

job01.script

instead of

job1.script

This assumes no more than 100 files. With more, simply add more zeros.

Otherwise, you'll need to analyze and break down each file name, and then order the names numerically. Currently, they are being ordered character by character.
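The difference between the two orderings can be illustrated with a short sketch. The class OrderingDemo and the helper number are hypothetical, and the helper assumes that the only digits in a name are the job number.

```java
import java.util.Arrays;
import java.util.Comparator;

class OrderingDemo {
    // Hypothetical helper: strips every non-digit and parses what is left.
    static int number(String name) {
        return Integer.parseInt(name.replaceAll("\\D", ""));
    }

    public static void main(String[] args) {
        String[] names = {"job2.script", "job10.script", "job1.script"};

        // Natural String order compares character by character: '1' < '2'.
        String[] byCharacter = names.clone();
        Arrays.sort(byCharacter);
        System.out.println(Arrays.toString(byCharacter));
        // prints [job1.script, job10.script, job2.script]

        // Comparing the parsed numbers gives the intended order.
        String[] byNumber = names.clone();
        Arrays.sort(byNumber, Comparator.comparingInt(OrderingDemo::number));
        System.out.println(Arrays.toString(byNumber));
        // prints [job1.script, job2.script, job10.script]
    }
}
```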

Upvotes: 1

Adam Arold

Reputation: 30528

The simplest way to solve this problem is to prefix the numbers in your names with zeroes. This is what I did when I had the same problem. Basically, you take the biggest number you have (for example 433234) and prefix every number with biggestLength - currentNumLength zeroes.

An example:

Biggest number is 12345: job12345.script.

This way the first job becomes job00001.script.
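If the names are generated by your own code, the padding can be done with String.format. This is a minimal sketch; the class ZeroPad and the helper paddedName are hypothetical, and it assumes the biggest number is known in advance.

```java
class ZeroPad {
    // Hypothetical helper: pads number with zeroes to the digit count of biggest.
    static String paddedName(int number, int biggest) {
        int width = String.valueOf(biggest).length();
        // For width 5 the format string becomes "job%05d.script"
        return String.format("job%0" + width + "d.script", number);
    }

    public static void main(String[] args) {
        System.out.println(paddedName(1, 12345));     // prints job00001.script
        System.out.println(paddedName(12345, 12345)); // prints job12345.script
    }
}
```

With names padded like this, the plain character-by-character order is also the numeric order.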

Upvotes: 0
