F_Schmidt

Reputation: 1132

Having problems with Java regex

I have the following regex:

/[-A-Z]{4}\d{2}/[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}.png

Basically I want to check for strings of the basic type

ABCD12/<here_is_a_random_uuid_as_a_string>.png

The UUID (which is in UPPER CASE) checking works fine, but now let's take a look at a special case. I want to accept strings like this

--CD12/...
AB--12/...

but NOT like this:

A--D12/...

But I cannot get the first part of the regex right. Basically, I need to check for either two letters or two - characters in a row, twice.

As I understand it, [-A-Z]{4} means "either - or something between A and Z, four times". So why doesn't my pattern work?

EDIT: This answer was posted within the comments and it works:

(?mi)^(?:--[A-Z]{2}|[A-Z]{2}(?:--|[A-Z]{2}))\d{2}/[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12}\.png$

Can somebody explain to me what (?mi) and (?:...) mean? The normal ? means 0 or 1 time, but what is the : for?

EDIT 2: Just for those who might have a similar problem and do not want to read all of those regexes ;) I slightly modified an answer to also accept patterns like ----12. The end result:

"^/(?:--[A-Z]{2}|-{4}|[A-Z]{2}(?:--|[A-Z]{2}))\\d{2}/[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12}\\.png$"

It works like a charm.
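
For anyone who wants to try the EDIT 2 pattern directly, here is a minimal sketch of how it could be wrapped in Java. The class name ImagePathCheck, the method isValid and the sample UUID are made up for illustration; only the pattern itself comes from the post. Note that this version of the pattern expects a leading /.

```java
import java.util.regex.Pattern;

final class ImagePathCheck {
    // Pattern from EDIT 2, compiled once. It requires a leading "/".
    private static final Pattern IMAGE_PATH = Pattern.compile(
        "^/(?:--[A-Z]{2}|-{4}|[A-Z]{2}(?:--|[A-Z]{2}))\\d{2}/"
        + "[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12}\\.png$");

    // Hypothetical helper: true if the whole string has the expected shape.
    static boolean isValid(String path) {
        return IMAGE_PATH.matcher(path).matches();
    }

    public static void main(String[] args) {
        String uuid = "123E4567-E89B-12D3-A456-426614174000"; // made-up sample value
        for (String prefix : new String[] {"ABCD", "--CD", "AB--", "----", "A--D"}) {
            String path = "/" + prefix + "12/" + uuid + ".png";
            // Only the A--D prefix should be rejected.
            System.out.println(path + " -> " + isValid(path));
        }
    }
}
```

Compiling the pattern once into a static final field avoids recompiling it on every call, which String.matches would do.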

Upvotes: 3

Views: 80

Answers (2)

anubhava

Reputation: 786359

You may use this regex for your cases:

^(?:--[A-Z]{2}|[A-Z]{2}(?:--|[A-Z]{2}))\d{2}/[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12}\.png$

RegEx Demo

Details about first part:

  • ^: Start
  • (?:: Start non-capture group
    • --[A-Z]{2}: Match -- followed by 2 letters
    • |: OR
    • [A-Z]{2}: Match 2 letters
    • (?:--|[A-Z]{2}): Match -- OR 2 letters
  • ): End non-capture group

By the way, (?:...) is a non-capturing group: it groups the alternatives without storing what they matched.
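
To make the difference concrete, and to cover the (?mi) part of the question as well, here is a small sketch. (?mi) is an inline flag group: m turns on multiline mode and i turns on case-insensitive matching. The class and method names below are made up for the demo, and the short patterns are simplified stand-ins rather than the full regex.

```java
import java.util.regex.Pattern;

final class GroupAndFlagDemo {
    // Number of capturing groups in a pattern; (?:...) groups are not counted.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // True if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(groupCount("(--|[A-Z]{2})\\d{2}"));   // 1: plain parentheses capture
        System.out.println(groupCount("(?:--|[A-Z]{2})\\d{2}")); // 0: (?:...) only groups

        System.out.println(found("^[A-Z]{2}$", "ab"));     // false: case-sensitive by default
        System.out.println(found("(?i)^[A-Z]{2}$", "ab")); // true: i ignores case

        System.out.println(found("^CD$", "AB\nCD"));       // false: ^ only matches at input start
        System.out.println(found("(?m)^CD$", "AB\nCD"));   // true: m lets ^ and $ match at line boundaries
    }
}
```

If the input is always a single upper-case path, neither flag should be needed, which is probably why the final pattern in the question drops them.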

Upvotes: 2

Wiktor Stribiżew

Reputation: 627607

Your [-A-Z]{4} matches any four occurrences of an uppercase ASCII letter or -, so it can also match ----, A---, ---B, -B--, etc.

You want to make sure that if there are hyphens, they come after or before two letters:

(?:[A-Z]{2}--|--[A-Z]{2}|[A-Z]{4})

It means:

  • (?: - start of a non-capturing group:
    • [A-Z]{2}-- - two uppercase ASCII letters and then --
    • | - or
    • --[A-Z]{2} - -- and then any two uppercase ASCII letters
    • | - or
    • [A-Z]{4} - any four uppercase ASCII letters
  • ) - end of the non-capturing group.
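
A quick way to see the difference between the character class and the alternation is to run both against the four-character prefix alone. This is only a sketch; the class and method names are made up for the comparison.

```java
final class PrefixDemo {
    // Original idea: any mix of four hyphens or uppercase letters.
    static boolean loose(String prefix) {
        return prefix.matches("[-A-Z]{4}");
    }

    // Alternation from this answer: hyphens only as a pair, before or after two letters.
    static boolean strict(String prefix) {
        return prefix.matches("(?:[A-Z]{2}--|--[A-Z]{2}|[A-Z]{4})");
    }

    public static void main(String[] args) {
        for (String p : new String[] {"ABCD", "AB--", "--CD", "A--D", "-B--", "----"}) {
            System.out.println(p + "  loose=" + loose(p) + "  strict=" + strict(p));
        }
    }
}
```

The loose version accepts all six prefixes, while the strict one accepts only ABCD, AB-- and --CD.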

The full pattern:

(?:[A-Z]{2}--|--[A-Z]{2}|[A-Z]{4})\d{2}/[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}\.png

To force the entire string match, add ^ (start of string) and $ (end of string) anchors:

^(?:[A-Z]{2}--|--[A-Z]{2}|[A-Z]{4})\d{2}/[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}\.png$

See the regex demo
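
Whether the anchors matter in Java depends on which method you call. The sketch below (class and method names are made up, the UUID is a sample value) compares Matcher.find() with and without anchors against Matcher.matches(), which only succeeds when the entire input matches.

```java
import java.util.regex.Pattern;

final class AnchorDemo {
    // The unanchored full pattern from above.
    static final String BODY =
        "(?:[A-Z]{2}--|--[A-Z]{2}|[A-Z]{4})\\d{2}/"
        + "[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}\\.png";

    // find() without anchors: a match anywhere inside the input is enough.
    static boolean findUnanchored(String s) {
        return Pattern.compile(BODY).matcher(s).find();
    }

    // find() with ^ and $ added around the pattern.
    static boolean findAnchored(String s) {
        return Pattern.compile("^" + BODY + "$").matcher(s).find();
    }

    // matches() must consume the entire input, even without anchors.
    static boolean matchesWhole(String s) {
        return Pattern.compile(BODY).matcher(s).matches();
    }

    public static void main(String[] args) {
        String good = "AB--12/123E4567-E89B-12D3-A456-426614174000.png"; // sample value
        String padded = "xx" + good + "yy";
        System.out.println(findUnanchored(padded)); // true
        System.out.println(findAnchored(padded));   // false
        System.out.println(matchesWhole(padded));   // false
        System.out.println(matchesWhole(good));     // true
    }
}
```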

Note that an unescaped . matches any character. To match a literal dot, you should escape it (\. in the regex, written as "\\." in a Java string literal).
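
A tiny sketch of what the unescaped dot lets through. The shortened patterns and the class name are made up for the demo, they are not the full regex.

```java
final class DotDemo {
    // Unescaped dot: any single character is accepted before "png".
    static boolean unescaped(String name) {
        return name.matches("[A-Z]{2}.png");
    }

    // Escaped dot: only a literal "." is accepted before "png".
    static boolean escaped(String name) {
        return name.matches("[A-Z]{2}\\.png");
    }

    public static void main(String[] args) {
        System.out.println(unescaped("AB.png")); // true
        System.out.println(unescaped("ABxpng")); // true, probably not intended
        System.out.println(escaped("AB.png"));   // true
        System.out.println(escaped("ABxpng"));   // false
    }
}
```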

Upvotes: 1
