Dolan Antenucci

Reputation: 15942

In Java, how can I read data from multiple files using POSIX wildcard syntax?

Currently, I have a script that loops over System.in for data processing. I am passing data to it from several files with cat.

cat myfiles*.txt | java MyDataProcessor 

Assuming that piping through cat adds some overhead compared with Java opening the files directly, I'd like to have Java open the files itself:

java MyDataProcessor myfiles*.txt

Are there any Java libraries that make this fairly easy (i.e., that translate POSIX wildcards into file handles)?

Upvotes: 1

Views: 734

Answers (5)

Dolan Antenucci

Reputation: 15942

In case this isn't obvious to someone, as it wasn't to me at first: if the files are local, you can let the shell expand the wildcard for you, and the matching file names are passed to main(String[] args) as separate arguments. In my case, I had a few other parameters, so I just moved the wildcard argument to the end.

// USAGE: java MyProcessor arg1 arg2 myfiles*.txt

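import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

public static void main(String[] args) throws Exception {
  String arg1 = args[0];
  String arg2 = args[1];

  // The shell has already expanded myfiles*.txt, so every remaining
  // argument is the name of one input file.
  for (int i = 2; i < args.length; i++) {
    File inputFile = new File(args[i]).getCanonicalFile();
    // try-with-resources closes the reader even if processing fails
    try (BufferedReader in = new BufferedReader(new FileReader(inputFile))) {
      // ...
    }
  }
}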

Upvotes: 0

Michael Krussel

Reputation: 2656

Java 7 added the PathMatcher interface, which can be used to test a path against a glob (the matching is similar to what your shell does).

PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:myfiles*.txt");
// filename must be a java.nio.file.Path; pass path.getFileName() to match on the name only
matcher.matches(filename);

An example of walking a file tree and searching for files based on globs can be found in the Oracle Java Tutorials, in the lesson on finding files.
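
As a minimal sketch of how such a matcher might be applied to a single directory (assuming Java 8 or later; the class name GlobLister and the method listMatching are hypothetical names chosen for illustration), the matcher is tested against each entry's file name rather than its full path:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists the regular files in one directory whose names match a glob.
class GlobLister {

    static List<Path> listMatching(Path dir, String glob) throws IOException {
        PathMatcher matcher = dir.getFileSystem().getPathMatcher("glob:" + glob);
        // Files.list is not recursive; closing the stream releases the directory handle.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    // Match the bare file name, not the full path.
                    .filter(p -> matcher.matches(p.getFileName()))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects a directory and a glob pattern as the two arguments.
        for (Path p : listMatching(Paths.get(args[0]), args[1])) {
            System.out.println(p);
        }
    }
}
```

When calling this from a shell, the pattern would need to be quoted (for example java GlobLister . 'myfiles*.txt') so that the shell does not expand it before Java sees it.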

Upvotes: 2

Boris Ivanov

Reputation: 4254

Look at the Java Grep Library. It is close to your task, but it does not support wildcards.

Apache Commons IO provides a class that supports wildcards (WildcardFileFilter): http://cleanjava.wordpress.com/2012/03/21/wildcard-file-filter-in-java/

Upvotes: 1

ccleve

Reputation: 15809

I would use java.io.File to iterate over the entire directory, and then filter the filenames using regular expressions. You can convert a wildcard expression to a regular expression using this code:

/**
 * Converts wildcard expression to regular expression. In wildcard-format,
 * '*' = 0-N characters and ? = any one character.
 * @param wildcardExp wildcard expression string
 * @param buf buffer which receives the regular expression
 */
static public void wildcardToRegexp(FastStringBuffer wildcardExp, FastStringBuffer buf) {
    final int len = wildcardExp.size();
    buf.clear();
    for (int i = 0; i < len; i++) {
        char c = wildcardExp.charAt(i);
        switch (c) {
        case '*':
            buf.append('.');
            buf.append('*');
            break;
        case '?':
            buf.append('.');
            break;
        // escape special regexp-characters

        case '(':
        case ')':
        case '[':
        case ']':
        case '$':
        case '^':
        case '.':
        case '{':
        case '}':
        case '|':
        case '\\':
        case '+':
            buf.append('\\');
            buf.append(c);
            break;
        default:
            buf.append(c);
            break;
        }
    }
}
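
FastStringBuffer is not part of the JDK. A rough equivalent that uses only the standard library might look like the sketch below. It assumes the same two wildcards (* and ?), but instead of escaping a fixed list of characters it quotes every other character with Pattern.quote. The names WildcardFilter, wildcardToRegex, and listMatching are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WildcardFilter {

    // Translates '*' to ".*" and '?' to "."; every other character is quoted,
    // so no regex metacharacter can leak through unescaped.
    static String wildcardToRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    // Returns the sorted names in dir (non-recursive) that match the wildcard.
    static List<String> listMatching(File dir, String wildcard) {
        Pattern pattern = Pattern.compile(wildcardToRegex(wildcard));
        String[] names = dir.list(); // null if dir is not a readable directory
        List<String> result = new ArrayList<>();
        if (names != null) {
            Arrays.sort(names);
            for (String name : names) {
                if (pattern.matcher(name).matches()) {
                    result.add(name);
                }
            }
        }
        return result;
    }
}
```

A call such as WildcardFilter.listMatching(new File("."), "myfiles*.txt") would then return the matching names in the current directory.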

Upvotes: 1

Aniket Inge

Reputation: 25725

It's best to pass the directory name and have Java walk the directory tree instead of relying on shell-specific wildcards.
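
One way this could look, as a sketch only (assuming Java 8 or later; TreeReader and countLines are hypothetical names, and counting lines stands in for the real processing): the program receives a directory and a pattern, walks the tree itself, and reads every matching file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TreeReader {

    // Counts the lines of all regular files under root whose file name matches the glob.
    static long countLines(Path root, String glob) throws IOException {
        PathMatcher matcher = root.getFileSystem().getPathMatcher("glob:" + glob);
        List<Path> files;
        // Files.walk descends into subdirectories; the stream must be closed.
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> matcher.matches(p.getFileName()))
                        .collect(Collectors.toList());
        }
        long total = 0;
        for (Path file : files) {
            // Assumes UTF-8 input; adjust the charset for other encodings.
            try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                while (in.readLine() != null) {
                    total++; // placeholder for real per-line processing
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Expects a root directory and a glob pattern as the two arguments.
        System.out.println(countLines(Paths.get(args[0]), args[1]));
    }
}
```

Because the pattern is interpreted by Java rather than by the shell, the same command line behaves the same way on any platform, provided the pattern is quoted where the shell would otherwise expand it.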

Upvotes: 1
