Regex to recognise a specific language element

Question

I have to create a specific regex, but whatever I've tried doesn't work in the end. I've seen similar posts that use a negative lookbehind to solve this, but apparently it is not supported in my version.

The regex I have to create should:

Recognise identifiers that may start with an underscore, followed by a MANDATORY alphabetical character, followed by one or more alphanumeric or underscore characters. It is essential that the string CANNOT END with an underscore.

I have tried _*[a-zA-Z][a-zA-Z0-9_]*[^_]$ but it doesn't work for all cases.

The solution with the negative lookbehind also causes issues for me: _*[a-zA-Z][a-zA-Z0-9_]*$(?


Some examples of accepted cases:

a5
_a5___v3
a5_v2_2

And not accepted:

5_v_2
a5_v2_
_5_v_2
_5
a_
a5--v-2

Orace · Accepted Answer

may start with underscore

Use _? (zero or one underscore), not _* (any number of underscores)

followed by a MANDATORY alphabetical character

[a-zA-Z] looks good, depending on which alphabets are acceptable (it covers only unaccented ASCII letters)

followed by one or more alphanumeric or underscore characters.

Let's use [a-zA-Z0-9_]* with a *, not a +, because of the next requirement:

It is essential that the string CANNOT END with underscore

So the last chunk is [a-zA-Z0-9], which rules out a trailing underscore and also supplies the required "one or more" character
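
To illustrate why the last chunk should be an explicit class rather than the [^_] from the question, here is a minimal Python sketch. Python is only an assumption for the demonstration (the question does not name its regex engine), and the sample strings "a5-" and "a5 " are made up for illustration. [^_] accepts any character that is not an underscore, including punctuation and spaces:

```python
import re

# The asker's attempt: the final [^_] accepts ANY character except underscore.
attempt = re.compile(r"_*[a-zA-Z][a-zA-Z0-9_]*[^_]$")
# Same pattern, but with the last chunk restricted to alphanumerics.
fixed_tail = re.compile(r"_*[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]$")

# Hypothetical inputs: a trailing hyphen, a trailing space, a trailing underscore.
for s in ["a5-", "a5 ", "a5_"]:
    print(repr(s), bool(attempt.fullmatch(s)), bool(fixed_tail.fullmatch(s)))
```

The first two strings slip through the attempt but are rejected once the tail is [a-zA-Z0-9]; the trailing underscore is rejected by both.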

Final result: _?[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]
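
As a quick sanity check, the final pattern can be run against the examples from the question. This is a sketch in Python, which is an assumption since the question does not say which engine is in use; the helper name is_identifier is hypothetical. The pattern uses no lookbehind, so it should carry over to engines that lack it:

```python
import re

# The final pattern from the answer, unchanged.
IDENTIFIER = re.compile(r"_?[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]")

# Examples copied from the question.
accepted = ["a5", "_a5___v3", "a5_v2_2"]
rejected = ["5_v_2", "a5_v2_", "_5_v_2", "_5", "a_", "a5--v-2"]

def is_identifier(s: str) -> bool:
    # fullmatch requires the WHOLE string to match, like wrapping the pattern in ^...$
    return IDENTIFIER.fullmatch(s) is not None

for s in accepted + rejected:
    print(f"{s!r}: {is_identifier(s)}")
```

Note that the pattern has no anchors of its own. A tool that searches for the pattern anywhere in the input would find a5_v2 inside a5_v2_, so it needs whole-string matching as above, or explicit ^ and $.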
