mahaidery

Reputation: 631

Interesting easy looking Regex

I am rephrasing my question to clear up the confusion.

I want to check whether a string contains certain letters. For this I use the character class:

[ACD]

and it works perfectly.

But I want to match only if the string contains those letters two or more times, either the same letter repeated or two different letters from the set.

For example: [AKL] should match:

ABCVL
AAGHF
KKUI
AKL

But the above should not match the following:

ABCD
KHID
LOVE

because those letters are there, but only once!

That's why I was trying to use:

[ACD]{2,}

But it's not working; it's probably not the right regex. Can a regex guru help me solve this puzzle?

Thanks

PS: I will use it in MySQL. A different approach is also welcome, but I would like to use a regex for a smarter and shorter query.

Upvotes: 0

Views: 244

Answers (8)

Enigmadan

Reputation: 3407

Edit

Overall, MySQL regular expression support is pretty weak.

If you only need to match your capture group a minimum of two times, then you can simply use:

select *  from ... where ... regexp('([ACD].*){2,}') #could be `2,` or just `2`

If you need to match your capture group more than two times, then just change the number:

select *  from ... where ... regexp('([ACD].*){3}')
                                      #This number should match the number of matches you need

For example, if you needed a minimum of 7 matches and you were using your previous character class [ACDF-KM-XZ]:

select *  from ... where ... regexp('([ACDF-KM-XZ].*){7,}')
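A rough Java sketch of the same "repeat the group n times" idea, in case it helps to try the pattern outside MySQL. The names AtLeastN and hasAtLeast are made up for illustration, and Java's regex engine is not the one MySQL uses, so treat this only as a way to sanity-check the pattern:

```java
import java.util.regex.Pattern;

class AtLeastN {
    // Hypothetical helper: charClass is a bracket expression such as "[AKL]".
    // Builds (charClass.*){n,} and reports whether it is found anywhere in s.
    static boolean hasAtLeast(String charClass, int n, String s) {
        Pattern p = Pattern.compile("(" + charClass + ".*){" + n + ",}");
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasAtLeast("[AKL]", 2, "ABCVL")); // A and L are present
        System.out.println(hasAtLeast("[AKL]", 2, "LOVE"));  // only L is present
        System.out.println(hasAtLeast("[AKL]", 3, "AKL"));   // A, K and L are present
    }
}
```

On a backtracking engine like Java's, this nested form may get slow on long strings that do not match, so counting matches in a loop may be the safer choice there.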

Response before edit:

Your regex is trying to find at least two consecutive characters from the set [ACDFGHIJKMNOPQRSTUVWXZ].

([ACDFGHIJKMNOPQRSTUVWXZ]){2,}

The reason A and Z are not being matched in your example string (ABCDEFGHIJKLMNOPQRSTUVWXYZ) is because you are looking for two or more characters that are together that match your set. A is a single character followed by a character that does not match your set. Thus, A is not matched.

Similarly, Z is a single character preceded by a character that does not match your set. Thus, Z is not matched.

In the string ABCDEFGHIJKLMNOPQRSTUVWXYZ, these characters do not match your set:
B, E, L, Y

If you were to do a global search in the string, only these runs of characters would be matched:
CD, FGHIJK, MNOPQRSTUVWX
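To see the "must be adjacent" behaviour for yourself, here is a small Java sketch that lists every match of the {2,} pattern in the alphabet string. The names AdjacentRuns and findAll are just for illustration; Java is used only because it is easy to run, and MySQL itself only tells you whether there is a match, not what was matched:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AdjacentRuns {
    // Collects every non-overlapping match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String set = "[ACDFGHIJKMNOPQRSTUVWXZ]";
        // {2,} only matches runs of two or more adjacent characters from the set,
        // so the isolated A and Z are skipped.
        System.out.println(findAll(set + "{2,}", "ABCDEFGHIJKLMNOPQRSTUVWXYZ"));
        // prints [CD, FGHIJK, MNOPQRSTUVWX]
    }
}
```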

Upvotes: 0

Ondra

Reputation: 1647

Is this what you are looking for?

".*(.*[AKL].*){2,}.*" (without quotes)

It matches if there are at least two occurrences of your characters surrounded by anything. It is a .NET regex, but it should be the same for anything else.

Upvotes: 0

Sam

Reputation: 20486

If I understood you correctly, this is quite simple:

[A-Z].*?[A-Z]

This looks for something in your set, [A-Z], and then lazily matches characters until it (potentially) comes across the set, [A-Z], again.


As @Enigmadan pointed out, a lazy match is not necessary here: [A-Z].*[A-Z]

Upvotes: 2

Vajura

Reputation: 1132

Pretty sure this should work in any case:

(?<l>[^AKL\n]*[AKL]+[^AKL\n]*[AKL]+[^AKL\n]*)[\n\r]

Replace AKL with the letters you need. This can easily be done dynamically; tell me if you need it.

Upvotes: 0

Casimir et Hippolyte

Reputation: 89584

To ensure that a string contains at least two occurrences of letters from a set (let's say A, K, L as in your example), you can write something like this:

[AKL].*[AKL]

Since the MySQL regex engine is a DFA, there is no need to avoid backtracking by using a negated character class like [^AKL] in place of the dot, or by using a lazy quantifier (which is not supported at all).

example:

SELECT 'KKUI' REGEXP '[AKL].*[AKL]';

will return 1
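If you want to check the pattern against all the strings from the question without a database at hand, a quick Java sketch could look like the one below. The names TwoFromSet and hasTwo are made up for illustration. Note that Java matches case-sensitively, while MySQL's REGEXP is normally case-insensitive for non-binary strings, so the two are only comparable for uppercase input like this:

```java
import java.util.regex.Pattern;

class TwoFromSet {
    // Same pattern as above: one letter from the set, anything, another letter from the set.
    static final Pattern TWO_OF_AKL = Pattern.compile("[AKL].*[AKL]");

    // find() looks for the pattern anywhere in the string, as MySQL's REGEXP does.
    static boolean hasTwo(String s) {
        return TWO_OF_AKL.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"ABCVL", "AAGHF", "KKUI", "AKL", "ABCD", "KHID", "LOVE"};
        for (String s : samples) {
            // The first four contain two letters from the set, the last three only one.
            System.out.println(s + " -> " + hasTwo(s));
        }
    }
}
```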

You can follow this link that speaks on the particular subject of the LIKE and the REGEXP features in MySQL.

Upvotes: 5

alpha bravo

Reputation: 7948

Your question is not very clear, but here is my trial pattern:

\b(\S*[AKL]\S*[AKL]\S*)\b  

Demo

Upvotes: 0

Indu Devanath

Reputation: 2188

If you want two or more matches on [AKL], you can use just [AKL] as the pattern and require the number of matches to be >= 2.

I am not good at SQL regex, but maybe something like this?

check (dbo.RegexMatch( ['ABCVL'], '[AKL]' ) >= 2)

To put it in simple English, use [AKL] as your regex, and check that the number of matches in the string is at least 2. Here's how I would do it in Java:

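import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterCounter {
    // Returns true if the string contains at least two characters from [ACD].
    static boolean search2orMore(String string) {
        Matcher matcher = Pattern.compile("[ACD]").matcher(string);
        int counter = 0;
        while (matcher.find()) {  // each find() advances to the next match
            counter++;
        }
        return counter >= 2;
    }
}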

You can't use [ACD]{2,} because it only matches two or more consecutive characters from the set, so it fails when the matching characters are separated by other characters.

Upvotes: 0

Dalorzo

Reputation: 20024

The expression you are using searches for a run of 2 or more consecutive characters from the set ACDFGHIJKMNOPQRSTUVWXZ.

However, your regex excludes Y (UVWXZ]), so Z cannot be found, since it is not adjacent to another character from your set. The same principle applies to A, because B ([ACD) is also excluded from your regex. For example, Z and A would match in a string like ZABCDEFGHIJKLMNOPQRSTUVWXYZA.

If those letters were not excluded on purpose, it would probably be better to use a range like [A-Z].

Upvotes: 0
