Reputation: 2628
I want to match alphanumeric words with a Lucene automaton regex, excluding words that are entirely numeric or entirely alphabetic. I have tried
(([a-zA-Z0-9]{1,10})&(.*[0-9].*))
but this also returns all-numeric words. So I tried to negate all-numeric words as below, but it does not work:
(^[0-9])(([a-zA-Z0-9]{1,10})&(.*[0-9].*))
Input String:
Expected output: DL200 and dal2
but it should not return 700091
Upvotes: 0
Views: 1527
Reputation: 75950
I didn't know much about the Lucene regex flavor, but a little research taught me that it does not support the PCRE library, although some standard operators are supported. I found that it includes neither lookarounds nor word boundaries. Have a look at the docs.
Either way, to overcome the lack of lookarounds, I had a look at this older SO post, which uses the ~ operator instead. Furthermore, I see you can use the & operator to check whether the string matches multiple patterns.
This leads me to assume the following pattern might work for you:
~[0-9]+&~[^0-9]+&[A-Za-z0-9]{2,10}
~[0-9]+ - Negate a string made of numbers only.
&
~[^0-9]+ - Negate a string made of non-numbers only.
&
[A-Za-z0-9]{2,10} - Matches a string that is made out of 2 to 10 alphanumeric characters.
Upvotes: 1
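The three-part pattern in the answer above can be checked outside Lucene with a rough emulation. Python's re module has no & (intersection) or ~ (complement) operator, so this sketch assumes that each Lucene sub-pattern must match the whole term and combines three re.fullmatch checks with boolean logic. The function name matches_answer_pattern and the sample word "hello" are hypothetical, added only for illustration.

```python
import re


def matches_answer_pattern(word: str) -> bool:
    """Emulate ~[0-9]+&~[^0-9]+&[A-Za-z0-9]{2,10} (hypothetical helper)."""
    # ~[0-9]+ : the word must NOT consist of digits only.
    all_digits = re.fullmatch(r"[0-9]+", word) is not None
    # ~[^0-9]+ : the word must NOT consist of non-digits only.
    no_digits = re.fullmatch(r"[^0-9]+", word) is not None
    # [A-Za-z0-9]{2,10} : 2 to 10 alphanumeric characters.
    alnum_2_to_10 = re.fullmatch(r"[A-Za-z0-9]{2,10}", word) is not None
    # & is intersection, so every condition has to hold at once.
    return (not all_digits) and (not no_digits) and alnum_2_to_10


# "hello" is an invented all-alphabetic sample; the others come from the question.
words = ["DL200", "dal2", "700091", "hello"]
matched = [w for w in words if matches_answer_pattern(w)]
print(matched)  # ['DL200', 'dal2']
```

This only mirrors the boolean structure of the pattern; it does not exercise Lucene's automaton implementation, so the result should still be confirmed against a real index.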
Reputation: 2628
With the help of JvdV's answer and https://stackoverflow.com/a/38665819/9758194, I was able to get the desired output:
(([a-zA-Z0-9]{1,10})&(.*[0-9].*))&~([0-9]*)
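To see why the extra &~([0-9]*) term makes the difference, here is a small Python sketch that compares the first attempt from the question with this final pattern. It is an emulation under the assumption that Lucene matches each sub-pattern against the whole term: & is modelled as a logical and, ~ as a logical not, and the function names first_attempt and final_pattern are hypothetical.

```python
import re


def first_attempt(word: str) -> bool:
    """Emulate (([a-zA-Z0-9]{1,10})&(.*[0-9].*)) from the question."""
    is_alnum = re.fullmatch(r"[a-zA-Z0-9]{1,10}", word) is not None
    has_digit = re.fullmatch(r".*[0-9].*", word) is not None
    return is_alnum and has_digit


def final_pattern(word: str) -> bool:
    """Emulate the first attempt intersected with ~([0-9]*)."""
    only_digits = re.fullmatch(r"[0-9]*", word) is not None
    return first_attempt(word) and not only_digits


# Sample words taken from the question.
words = ["DL200", "dal2", "700091"]
print([w for w in words if first_attempt(w)])  # ['DL200', 'dal2', '700091']
print([w for w in words if final_pattern(w)])  # ['DL200', 'dal2']
```

Entirely alphabetic words need no separate complement here, because the .*[0-9].* term already requires at least one digit.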
Upvotes: 1