Deividas Sutkus

Reputation: 55

Java Scanner Delimiter

I'm using a Scanner with a delimiter to tokenize my .txt file (it's homework that I've got to do). The first version of the file looks like this:

5,5,5,6,5,8,9,5,6,8, good, very good, excellent, good
7,7,8,7,6,7,8,8,9,7,very good, Good, excellent, very good
8,7,6,7,8,7,5,6,8,7 ,GOOD, VERY GOOD, GOOD, AVERAGE
9,9,9,8,9,7,9,8,9,9 ,Excellent, very good, very good, excellent
7,8,8,7,8,7,8,9,6,8 ,very good, good, excellent, excellent
6,5,6,4,5,6,5,6,6,6 ,good, average, good, good
7,8,7,7,6,8,7,8,6,6 ,good, very good, good,  very good
5,7,6,7,6,7,6,7,7,7  ,excellent, very good, very good, very good

For that version I used useDelimiter("[ ]*(,)[ ]*"). The second version of the file looks like this:

5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good
7 7 8 7 6 7 8 8 9 7 very good, Good, excellent, very good
8 7 6 7 8 7  5 6 8 7 GOOD, VERY GOOD, GOOD, AVERAGE
9 9 9 8 9 7 9  8 9 9 Excellent, very good, very good, excellent
7 8 8 7 8 7 8 9 6 8 very good, good, excellent, excellent
6 5 6 4 5 6 5 6 6 6 good, average, good, good
7  8 7 7 6 8 7 8 6 6 good, very good, good,  very good
5 7 6 7 6 7 6 7 7 7  excellent, very good, very good, very good

I can't come up with a regex that separates the numbers by spaces and the words by commas. Essentially, I need an array with 14 values ("very good" being a single value).

Note that there are multiple spaces in some places (this is done on purpose to make it harder for us).

Any sort of help would be appreciated.

P.S. We're allowed to use delimiters only (no splits, etc.).

Upvotes: 5

Views: 185

Answers (4)

matts

Reputation: 6887

Note that Scanner allows you to change the delimiter at any time. If you can rely on your input text always having 10 numbers at the beginning and 4 word groups at the end, then you can simply start with a delimiter that just splits on spaces (\s+) and after 10 calls to nextInt(), switch to a delimiter that splits on a comma and spaces (\s*,\s*).

Something like:

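import java.util.Scanner;

String input = "5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good";
// Start with a whitespace delimiter to read the ten numbers.
Scanner scanner = new Scanner(input).useDelimiter("\\s+");
int[] results = new int[14];
for (int i = 0; i < 10; ++i) {
    results[i] = scanner.nextInt();
}
// Switch to a comma delimiter (with optional surrounding whitespace) for the word phrases.
scanner.useDelimiter("\\s*,\\s*");
// Skip the whitespace left between the last number and the first phrase.
scanner.skip("\\s*");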
for (int i = 10; i < 14; ++i) {
    String wordPhrase = scanner.next();
    int wordValue;
    if ("average".equalsIgnoreCase(wordPhrase))
        wordValue = 1;
    else if ("good".equalsIgnoreCase(wordPhrase))
        wordValue = 2;
    else if ("very good".equalsIgnoreCase(wordPhrase))
        wordValue = 3;
    else if ("excellent".equalsIgnoreCase(wordPhrase))
        wordValue = 4;
    else
        wordValue = 0;
    results[i] = wordValue;
}

It's also possible to do this with a single delimiter regex using zero-width lookaround assertions, but this is probably a bit advanced for a simple homework problem.
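
For a file with several lines, the delimiter-switching idea can be applied to each line with its own Scanner, so the last phrase of one line never runs into the first number of the next. This is only a sketch: the class name, the parseLine helper, and the inline sample text are made up for illustration, and it keeps the phrases as strings instead of mapping them to numbers.

```java
import java.util.Scanner;

public class SwitchingDelimiterDemo {
    // Hypothetical helper: parses one line made of ten whitespace-separated
    // integers followed by four comma-separated phrases.
    static String[] parseLine(String line) {
        String[] tokens = new String[14];
        try (Scanner sc = new Scanner(line)) {
            sc.useDelimiter("\\s+");            // numbers: split on runs of whitespace
            for (int i = 0; i < 10; i++) {
                tokens[i] = String.valueOf(sc.nextInt());
            }
            sc.useDelimiter("\\s*,\\s*");       // phrases: split on commas
            sc.skip("\\s*");                    // drop the gap after the last number
            for (int i = 10; i < 14; i++) {
                tokens[i] = sc.next().trim();   // trim guards against trailing spaces
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample text standing in for the .txt file (assumption for the demo).
        String file = "5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good\n"
                    + "7  8 7 7 6 8 7 8 6 6 good, very good, good,  very good\n";
        try (Scanner lines = new Scanner(file)) {
            while (lines.hasNextLine()) {
                String line = lines.nextLine();
                if (line.trim().isEmpty()) {
                    continue;                   // ignore blank lines
                }
                System.out.println(String.join("|", parseLine(line)));
            }
        }
    }
}
```

Reading from the real file would mean passing a File to the outer Scanner instead of the sample string; the per-line parsing stays the same.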

Upvotes: 2

ach

Reputation: 6234

This should work. The key is the positive lookbehind ((?<=...)) and alternation (|):

String input = "9 9 9 8 9 7 9  8 9 9 Excellent, very good, very good, excellent";
Scanner s = new Scanner(input).useDelimiter("(?<=\\d)[\\s,]+|\\s*,\\s*");
while (s.hasNext()) {
    System.out.println("Token: ." + s.next() + ".");
}

Prints:

Token: .9.
Token: .9.
Token: .9.
Token: .8.
Token: .9.
Token: .7.
Token: .9.
Token: .8.
Token: .9.
Token: .9.
Token: .Excellent.
Token: .very good.
Token: .very good.
Token: .excellent.
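
To get the 14 values into a collection instead of printing them, the same delimiter can be wrapped in a small helper. This is a sketch under my own naming (SingleDelimiterDemo and tokenize are not from the question). Because the first alternative also accepts commas after a digit, the pattern appears to handle the comma-separated first version of the file as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class SingleDelimiterDemo {
    // After a digit: any run of whitespace/commas. Otherwise: a comma with optional spaces.
    static final String DELIMITER = "(?<=\\d)[\\s,]+|\\s*,\\s*";

    // Hypothetical helper: collects every token of one line.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        try (Scanner s = new Scanner(line).useDelimiter(DELIMITER)) {
            while (s.hasNext()) {
                tokens.add(s.next().trim());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One line from each version of the file in the question.
        String spaced = "8 7 6 7 8 7  5 6 8 7 GOOD, VERY GOOD, GOOD, AVERAGE";
        String commas = "5,5,5,6,5,8,9,5,6,8, good, very good, excellent, good";
        System.out.println(tokenize(spaced).size() + " " + tokenize(spaced));
        System.out.println(tokenize(commas).size() + " " + tokenize(commas));
    }
}
```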

Upvotes: 4

Alexey A.

Reputation: 1419

You can try this one: (((?<=[0-9]+)\s*(?=[0-9]+))|(,\s*(?=[a-zA-Z]+))|((?<=[0-9]+)\s*(?=[a-zA-Z]+))). It looks awful, but it should work.
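
Here is a sketch of the same three-way idea written as a Java string, with two changes that are my own and not part of the pattern above. The quantifiers inside the lookarounds are dropped, since a lookaround only needs to see one neighbouring character, which also avoids relying on how a given Java version treats an unbounded lookbehind. And \s* becomes \s+ between numbers, because \s* can match the empty string between two adjacent digits and would cut a multi-digit number apart. The class and method names are illustrative.

```java
import java.util.Scanner;

public class LookaroundDelimiterDemo {
    static final String DELIMITER =
          "(?<=[0-9])\\s+(?=[0-9])"        // whitespace between two numbers
        + "|,\\s*(?=[a-zA-Z])"             // comma (and spaces) before a word
        + "|(?<=[0-9])\\s+(?=[a-zA-Z])";   // whitespace between the last number and the first word

    // Hypothetical helper: reads exactly 14 tokens from one line.
    // sc.next() throws NoSuchElementException if the line has fewer tokens.
    static String[] tokenize(String line) {
        String[] out = new String[14];
        try (Scanner sc = new Scanner(line).useDelimiter(DELIMITER)) {
            for (int i = 0; i < out.length; i++) {
                out[i] = sc.next();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "5 7 6 7 6 7 6 7 7 7  excellent, very good, very good, very good";
        System.out.println(String.join("|", tokenize(line)));
    }
}
```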

Upvotes: 2

Achintya Jha

Reputation: 12843

Try this:

String[] str = expression.split("(,\\s+)|(\\s+)");
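
Since the question rules out split, a quick way to check this pattern is to hand it to Scanner.useDelimiter instead. The sketch below does that (the class and method names are mine). Because the second alternative matches any whitespace, "very good" comes back as two tokens, so the sample line yields 15 tokens, not 14.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class WhitespaceAlternationCheck {
    // Hypothetical helper: tokenizes a line with the pattern from this answer,
    // used as a Scanner delimiter rather than with String.split.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(line).useDelimiter("(,\\s+)|(\\s+)")) {
            while (sc.hasNext()) {
                tokens.add(sc.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good";
        List<String> tokens = tokenize(line);
        System.out.println(tokens.size() + " tokens: " + tokens);
    }
}
```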

Upvotes: 0
