Reputation: 819
I've got the following regex:
^[EC]D_V_[a-zA-Z]{5}____([0-9]{8})_[0-9]{3}_[a-zA-Z](_[0-9]{1,7})?\.([^<>:\\”\/\\|\\*\\?]{3,4})(\.gz)?
and this test data:
CD_V_DoSto____00000000_255_A_952086.445
ED_V_DoSto____99999999_255_A_91459._416.gz
Why does the second one fail, while the first one works if I edit it to CD_V_DoSto____00000000_255_A_952086.445.gz?
I think the [0-9]{8} is causing the problem, but I couldn't verify it...
Here you can test it: regex101
Upvotes: 2
Views: 98
Reputation: 627086
There are three things to consider here:
- The `^` anchor only matches the start of the string by default. To make it match at the start of each line, prepend the pattern with `(?m)` or use the `m` (multiline) option.
- The `{3,4}` quantifier is greedy, so the `.` before `gz` gets matched by the character class and `.gz` does not fall into Group 4. Add `.` to the negated character set.
- `$` should be added at the end of the pattern in the regex tester. With Java's `matches` method, you do not need `^` or `$` to match the whole string; the match is anchored by default.

See the fixed regex fiddle:
^[EC]D_V_[a-zA-Z]{5}____([0-9]{8})_[0-9]{3}_[a-zA-Z](_[0-9]{1,7})?\.([^.<>:”\/|*?]{3,4})(\.gz)?$
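To see the first two points outside a regex tester, here is a minimal Java sketch. The class and method names (`AnchorAndGreedDemo`, `countMatches`, `gzGroup`) are made up for illustration, and the question's character class is written with its excluded characters reordered (same set) to keep the Java escaping readable. It uses `find()`, which roughly mimics how a tester scans the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorAndGreedDemo {
    // Pattern from the question: no '$', and '.' is allowed in the extension class.
    // The escape before 201D below is the curly quote (U+201D) from the original pattern.
    static final String ORIGINAL =
        "^[EC]D_V_[a-zA-Z]{5}____([0-9]{8})_[0-9]{3}_[a-zA-Z](_[0-9]{1,7})?"
        + "\\.([^<>:\\\\/|*?\u201D]{3,4})(\\.gz)?";

    // Fixed pattern: '.' excluded from the extension class, '$' appended.
    static final String FIXED =
        "^[EC]D_V_[a-zA-Z]{5}____([0-9]{8})_[0-9]{3}_[a-zA-Z](_[0-9]{1,7})?"
        + "\\.([^.<>:\u201D/|*?]{3,4})(\\.gz)?$";

    // Counts how many times the pattern is found in the input.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Returns what Group 4 (the optional ".gz") captured, or "no match".
    static String gzGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(4) : "no match";
    }

    public static void main(String[] args) {
        String both = "CD_V_DoSto____00000000_255_A_952086.445\n"
                    + "ED_V_DoSto____99999999_255_A_91459._416.gz";
        String gzName = "CD_V_DoSto____00000000_255_A_952086.445.gz";

        // Without the multiline flag, '^' matches only at the very start of the input.
        System.out.println(countMatches(ORIGINAL, 0, both));                 // 1
        System.out.println(countMatches(ORIGINAL, Pattern.MULTILINE, both)); // 2

        // Greedy {3,4} takes "445." so Group 4 stays unset in the original pattern.
        System.out.println(gzGroup(ORIGINAL, gzName)); // null
        System.out.println(gzGroup(FIXED, gzName));    // .gz
    }
}
```

Passing `Pattern.MULTILINE` has the same effect as prepending `(?m)` to the pattern.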
In Java, you may use
Boolean isValid = s.matches("[EC]D_V_[a-zA-Z]{5}____([0-9]{8})_[0-9]{3}_[a-zA-Z](_[0-9]{1,7})?\\.([^.<>:”/|*?]{3,4})(\\.gz)?");
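As a rough, self-contained check of the line above, the sketch below runs the same pattern against the test data from the question. The class name `FileNameCheck` and the helpers `isValid` and `containsValid` are hypothetical. The extra inputs with a prefix or suffix are made up to show that `matches()` needs the whole string to match, while `find()` does not.

```java
import java.util.regex.Pattern;

public class FileNameCheck {
    // Same pattern as in the answer, without '^' or '$'.
    // The escape before 201D is the curly quote (U+201D) from the original pattern.
    static final Pattern NAME = Pattern.compile(
        "[EC]D_V_[a-zA-Z]{5}____([0-9]{8})_[0-9]{3}_[a-zA-Z](_[0-9]{1,7})?"
        + "\\.([^.<>:\u201D/|*?]{3,4})(\\.gz)?");

    // matches() succeeds only if the entire string matches the pattern.
    static boolean isValid(String s) {
        return NAME.matcher(s).matches();
    }

    // find() succeeds if any substring matches; shown only for contrast.
    static boolean containsValid(String s) {
        return NAME.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("CD_V_DoSto____00000000_255_A_952086.445"));    // true
        System.out.println(isValid("ED_V_DoSto____99999999_255_A_91459._416.gz")); // true
        System.out.println(isValid("CD_V_DoSto____00000000_255_A_952086.445.gz")); // true

        // Made-up inputs: extra text before or after the name.
        String prefixed = "prefix_CD_V_DoSto____00000000_255_A_952086.445";
        String suffixed = "CD_V_DoSto____00000000_255_A_952086.445.gz.bak";
        System.out.println(isValid(prefixed));       // false
        System.out.println(isValid(suffixed));       // false
        System.out.println(containsValid(prefixed)); // true
    }
}
```

Compiling the pattern once with `Pattern.compile` is a design choice for repeated checks; `String.matches` recompiles the pattern on every call.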
Upvotes: 3