GENERALE

Reputation: 37

Regex to recognise a specific language element

I need to write a specific regex, but nothing I've tried works in the end. I've seen similar posts that use a negative lookbehind to solve this, but apparently it is not supported in the version I'm using.

The regex I have to create should:

Recognise identifiers that may start with an underscore, which must be followed by an alphabetical character, followed by one or more alphanumeric or underscore characters. It is essential that the string CANNOT END with an underscore.

I have tried this: _*[a-zA-Z][a-zA-Z0-9_]*[^_]$ but it doesn't work for all the cases.

The solution with the negative lookbehind also causes problems for me: _*[a-zA-Z][a-zA-Z0-9_]*$(?<!_)

Some examples of accepted cases:

  1. a5
  2. _a5___v3
  3. a5_v2_2

and non-accepted cases:

  1. 5_v_2
  2. a5_v2_
  3. _5_v_2
  4. _5
  5. a_
  6. a5--v-2

Upvotes: 0

Views: 33

Answers (1)

Orace

Reputation: 8359

may start with underscore

Use _? rather than _*: _? allows at most one leading underscore, while _* allows any number of them.

followed MANDATORY by a alphabetical character

[a-zA-Z] looks good, depending on which alphabets are acceptable (it covers only unaccented ASCII letters).

followed by one or more alphanumeric or underscore characters.

Let's use [a-zA-Z0-9_]* with a * not a + because:

It is essential that the string CANNOT END with underscore

So the last chunk is [a-zA-Z0-9], which forces the final character to be alphanumeric.

Final result: _?[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]
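The pattern above has no anchors, so a plain search could still find a match inside a rejected string (for example the "v_2" part of 5_v_2). Here is a minimal sketch of checking it against the examples from the question, assuming Python's re module; the names IDENTIFIER and is_identifier are made up for illustration, and the question does not say which regex engine is actually in use. In an engine without a full-match function, wrapping the pattern in ^ and $ should have the same effect.

```python
import re

# The answer's pattern; fullmatch() below requires the WHOLE string to match,
# which plays the role of ^ ... $ anchors.
IDENTIFIER = re.compile(r"_?[a-zA-Z][a-zA-Z0-9_]*[a-zA-Z0-9]")


def is_identifier(text: str) -> bool:
    """Return True if text is a valid identifier under the rules in the question."""
    return IDENTIFIER.fullmatch(text) is not None


# Examples taken from the question.
accepted = ["a5", "_a5___v3", "a5_v2_2"]
rejected = ["5_v_2", "a5_v2_", "_5_v_2", "_5", "a_", "a5--v-2"]

for s in accepted + rejected:
    print(f"{s!r}: {is_identifier(s)}")
```

Note that a single letter such as a is rejected as well, which is consistent with the requirement that the letter be followed by one or more further characters.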

Upvotes: 1
