Reputation: 735
I want to read a text file and use regex to match its contents and split them into strings for 2 priority queues, as part of a heap-based priority queue task scheduler. First, though, I need to make sure the text file, which I read using a Scanner, is in the right format: each entry starts with a task name made of alphanumeric characters, followed by a non-negative integer (the arrival time) and a natural number (the deadline time). The following is the input in the text file, in the right format:
task1 2 3 task2 2 3 task3 2 3 task4 4 5 task5 4 5
task6 7 9 task7 7 9 task8 7 9 task9 7 9
task10 7 9 task11 7 9 task12 7 9 task13 7 9
task14 7 9 task15 7 9 task16 10 11 task17 10 11
task18 10 11 task19 10 11 task20 10 12
I tried the following regex to check whether the format is right, but it only matches the first task's attributes. Once the format repeats for the following tasks, the regex fails. Any idea what is wrong with my regex?
(^\s*[a-zA-Z0-9]*\s+\d+\s+\d+\s*){1,}
^ is the start of the input
\s* starts off with any whitespace, 0 or more times
[a-zA-Z0-9]* is the alphanumeric task name, 0 or more characters
\s+ is the whitespace between the different task attributes
\d+ is the arrival time and the deadline time
\s* ends with whitespace, 0 or more times, between different tasks
{1,} after the () brackets specifies a minimum of 1 repeat, with no maximum number of repeats
Upvotes: 0
Views: 2721
Reputation: 88707
The problem is ^, which requires the match to be at the start of the input sequence; any repetition of the group after the first can't satisfy that condition.
Try to move the first part out of the group:
^\s*([a-zA-Z0-9]*\s+\d+\s+\d+\s*){1,}
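To see the difference, here is a minimal Java sketch comparing the regex from the question with the one above. The class and method names (AnchorPlacement, originalAccepts, fixedAccepts) are made up for illustration, and the trailing $ is an assumption that nothing may follow the last task:

```java
import java.util.regex.Pattern;

public class AnchorPlacement {
    // Regex from the question: ^ sits inside the repeated group,
    // so the second repetition can never match.
    static final Pattern ORIGINAL =
        Pattern.compile("(^\\s*[a-zA-Z0-9]*\\s+\\d+\\s+\\d+\\s*){1,}");

    // ^\s* moved out of the group, + instead of {1,}, and a closing $.
    static final Pattern FIXED =
        Pattern.compile("^\\s*([a-zA-Z0-9]*\\s+\\d+\\s+\\d+\\s*)+$");

    public static boolean originalAccepts(String input) {
        return ORIGINAL.matcher(input).matches();
    }

    public static boolean fixedAccepts(String input) {
        return FIXED.matcher(input).matches();
    }

    public static void main(String[] args) {
        String oneTask = "task1 2 3";
        String twoTasks = "task1 2 3 task2 2 3\n";

        System.out.println(originalAccepts(oneTask));  // true
        System.out.println(originalAccepts(twoTasks)); // false
        System.out.println(fixedAccepts(twoTasks));    // true
    }
}
```

Without the MULTILINE flag, ^ only matches at the very start of the input, which is why the original pattern stops after the first task.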
Btw, {1,} can be replaced with a single +.
Also note that, depending on how you apply the regex, you either don't need to wrap the expression in ^ and $ (e.g. String.matches() or Matcher.matches(), which do it implicitly) or you might have to (depending on your needs), e.g. add a $ at the end to require that nothing follows the match (if that would violate your file format).
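A small sketch of that difference, using the same pattern with no ^ or $ at all. The class and method names (ImplicitAnchors, wholeInput, prefixOnly) are hypothetical; Matcher.lookingAt() is used here only to show a match that is anchored at the start but not at the end:

```java
import java.util.regex.Pattern;

public class ImplicitAnchors {
    // Deliberately written without ^ and $.
    static final Pattern TASKS =
        Pattern.compile("\\s*([a-zA-Z0-9]*\\s+\\d+\\s+\\d+\\s*)+");

    // matches() requires the whole input to match, as if wrapped in ^...$.
    public static boolean wholeInput(String s) {
        return TASKS.matcher(s).matches();
    }

    // lookingAt() only requires a match starting at the beginning;
    // anything may follow it.
    public static boolean prefixOnly(String s) {
        return TASKS.matcher(s).lookingAt();
    }

    public static void main(String[] args) {
        String trailingGarbage = "task1 2 3 oops";

        System.out.println(wholeInput(trailingGarbage)); // false
        System.out.println(prefixOnly(trailingGarbage)); // true
        System.out.println(wholeInput("task1 2 3"));     // true
    }
}
```

So for pure validation with matches(), the explicit anchors are redundant; they matter once you switch to a method that accepts partial matches.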
If you want to extract the matches as well, you'd need a slightly different approach, i.e. use Matcher.find() and remove the last part ({1,}).
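A rough sketch of the extraction, assuming you want the name, arrival time and deadline of each task as separate strings. The class name TaskExtractor and the String[] layout are made up for illustration, and the name part uses + instead of * here (an assumption) so that an empty task name can't match:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaskExtractor {
    // One task per match, no outer repetition:
    // group 1 = name, group 2 = arrival time, group 3 = deadline.
    private static final Pattern TASK =
        Pattern.compile("([a-zA-Z0-9]+)\\s+(\\d+)\\s+(\\d+)");

    public static List<String[]> extract(String input) {
        List<String[]> tasks = new ArrayList<>();
        Matcher m = TASK.matcher(input);
        // find() moves on to the next match on every call.
        while (m.find()) {
            tasks.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return tasks;
    }

    public static void main(String[] args) {
        String input = "task1 2 3 task2 2 3\ntask6 7 9\n";
        for (String[] t : extract(input)) {
            System.out.println(t[0] + " arrives " + t[1] + ", deadline " + t[2]);
        }
    }
}
```

Keep in mind that find() silently skips anything that doesn't match, so you'd probably still want to validate the whole input first and only then extract.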
Upvotes: 2