Reputation: 4592
I'm trying to use the Scanner class in Java to get data from a configuration file. The file's elements are delimited by whitespace. However, if a phrase or element should be interpreted as a string literal (including whitespace), then double or single quotes are placed around the element. This gives files that look like this:
> R 120 Something AWord
> P 160 SomethingElse "A string literal"
When using the Java Scanner class, it delimits by just whitespace by default. The Scanner class has the useDelimiter() function that takes a regular expression to specify a different delimiter for the text. I'm not good with regular expressions, however, so I'm not sure how I'd do this.
How can I delimit by whitespace, unless there are quotes surrounding something?
Upvotes: 3
Views: 3974
Reputation: 33029
Instead of changing the delimiter, you can use the scanner.findInLine(pattern) method to pull out one token at a time, so that string literals are not split. You just need a regular expression that matches either a quote-less token or a token in quotes. This one might work:
"[^\"\\s]+|\"(\\\\.|[^\\\\\"])*\""
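(That regex is extra complicated because it handles backslash escapes inside the string literal. It only recognizes double quotes; single-quoted literals would need another alternative of the same form.)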
Example:
String rx = "[^\"\\s]+|\"(\\\\.|[^\\\\\"])*\"";
Scanner scanner = new Scanner("P 160 SomethingElse \"A string literal\" end");
System.out.println(scanner.findInLine(rx)); // => P
System.out.println(scanner.findInLine(rx)); // => 160
System.out.println(scanner.findInLine(rx)); // => SomethingElse
System.out.println(scanner.findInLine(rx)); // => "A string literal"
System.out.println(scanner.findInLine(rx)); // => end
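Since each line of the configuration file looks like one record, the same call can be looped per line. The following is only a sketch under that assumption: the class name ConfigLines and the method parse are made up for illustration, and it relies on findInLine returning null once nothing more matches on the current line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ConfigLines {
    // Same regex as above: a quote-less token, or a double-quoted literal with escapes.
    static final String RX = "[^\"\\s]+|\"(\\\\.|[^\\\\\"])*\"";

    // Hypothetical helper: returns one list of tokens per non-empty line.
    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        Scanner scanner = new Scanner(text);
        while (scanner.hasNextLine()) {
            List<String> fields = new ArrayList<>();
            String token;
            // findInLine returns null when no further token is found on this line.
            while ((token = scanner.findInLine(RX)) != null) {
                fields.add(token);
            }
            // Skip the rest of the line (trailing whitespace and the line break), if any.
            if (scanner.hasNextLine()) {
                scanner.nextLine();
            }
            if (!fields.isEmpty()) {
                records.add(fields);
            }
        }
        scanner.close();
        return records;
    }

    public static void main(String[] args) {
        String text = "R 120 Something AWord\nP 160 SomethingElse \"A string literal\"";
        for (List<String> record : parse(text)) {
            System.out.println(record);
        }
    }
}
```

The quotes stay part of the returned token, so strip them afterwards if you only want the literal's contents.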
The findInLine method, as the name suggests, only works within the current line. If you want to search the whole input you can use findWithinHorizon instead. You can pass 0 in as the horizon to tell it to use an unlimited horizon:
scanner.findWithinHorizon(rx, 0);
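The question also mentions single-quoted literals, which the regex above does not cover. As a rough sketch, a third alternative of the same shape can be added and the whole input collected with findWithinHorizon. The class name QuotedTokenizer and the method tokenize are hypothetical, and the extended pattern is my own variation rather than something tested against real configuration files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class QuotedTokenizer {
    // A quote-less token, a double-quoted literal, or a single-quoted literal.
    // Both quoted forms allow backslash escapes, as in the regex above.
    private static final Pattern TOKEN = Pattern.compile(
        "[^\"'\\s]+|\"(\\\\.|[^\\\\\"])*\"|'(\\\\.|[^\\\\'])*'");

    // Hypothetical helper: collects every token in the input, across lines.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Scanner scanner = new Scanner(input);
        String token;
        // A horizon of 0 means "search to the end of the input";
        // null is returned once no further token is found.
        while ((token = scanner.findWithinHorizon(TOKEN, 0)) != null) {
            tokens.add(token);
        }
        scanner.close();
        return tokens;
    }

    public static void main(String[] args) {
        String input = "R 120 Something AWord\nP 160 SomethingElse \"A string literal\" 'another one'";
        for (String t : tokenize(input)) {
            System.out.println(t);
        }
    }
}
```

One caveat with this variation: because a single quote now ends a quote-less token, a word containing an apostrophe (such as don't) is split at the apostrophe.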
Upvotes: 5