Jeflopo

Reputation: 2262

How to improve this regular expression in Python?

I need a regular expression that matches these examples:

1 member
2 members
10 members
100 members
1,000 members
10,000 members
100,000 members
100,000,000 members
999,999,999,999 members

So I did:

\d+ member|
\d+ members|
\d+,\d+ members|
\d+,\d+,\d+ members|
\d+,\d+,\d+,\d+ members

You can try it interactively here: https://regex101.com/r/oW3bJ6/2

But deep in my heart I know this is very ugly. Could you guys/girls help me find an elegant solution?

Upvotes: 0

Views: 68

Answers (5)

Sede

Reputation: 61225

\d+[,\d\s]+members?
  • \d+ matches one or more digits [0-9]
  • [,\d\s]+ matches one or more characters from the set: a literal comma, \d (a digit [0-9]), or \s (any whitespace character [\r\n\t\f ])
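
A quick way to try this pattern from Python, as a minimal sketch. The sample strings are taken from the question; the variable names are only illustrative.

```python
import re

# Pattern from this answer; the sample strings come from the question.
pattern = re.compile(r"\d+[,\d\s]+members?")

samples = ["1 member", "2 members", "1,000 members", "999,999,999,999 members"]

# fullmatch succeeds only if the entire string fits the pattern.
all_match = all(pattern.fullmatch(s) is not None for s in samples)

# The character class also admits stray separators such as "1, 1 member".
loose_match = pattern.fullmatch("1, 1 member") is not None

print(all_match, loose_match)
```

Both flags come out True: the pattern covers the samples, but because the class mixes commas, digits and whitespace freely, it is quite permissive.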

Upvotes: 1

Julian

Reputation: 1736

I'm not sure how pedantic you need your expression to be, but your accepted answer will give you some false positives with respect to your example. For instance, the following lines, among others, will match; whether that is acceptable is up to you:

1 members       # Plural members for '1'
5 member        # Non-plural member
1000,0 members  # Invalid comma separator
1000000 members # Missing comma separator
00000 members   # Multiple zeros (or any other number)
010 member      # Leading zeros
1, 1 member     # Invalid

The following regular expression will match the exact pattern stated in your example:

^1 member|^[1-9]\d{0,2}(,\d{3})* members

^ ensures we match starting at the beginning of the line.

1 member is a special, non-plural case

[1-9]\d{0,2} matches the numbers 1-999, but not expressions with a leading 0 (such as 0 or 010) ...

(,\d{3})* followed by any number of groups of ',000-999'
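
Here is a small sketch for checking the pattern against the false positives listed above. One caveat worth testing: without an end anchor, "1 members" still matches, because "1 member" is a prefix of it. The strict_pattern variant and the accepts helper are my own assumptions, not part of the answer.

```python
import re

# Pattern exactly as given in this answer.
answer_pattern = re.compile(r"^1 member|^[1-9]\d{0,2}(,\d{3})* members")

# Hypothetical stricter variant (an assumption, not from the answer):
# "$" anchors the end, and the lookahead (?!1 ) keeps the plural branch
# from accepting "1 members".
strict_pattern = re.compile(
    r"^(?:1 member|(?!1 )[1-9]\d{0,2}(?:,\d{3})* members)$"
)

def accepts(pattern, text):
    """Return True if the pattern matches starting at the beginning of text."""
    return pattern.match(text) is not None

# False positives from the list above, minus "1 members".
rejected = ["5 member", "1000,0 members", "1000000 members",
            "00000 members", "010 member", "1, 1 member"]

for text in rejected + ["1 members", "1 member", "1,000 members"]:
    print(f"{text!r}: answer={accepts(answer_pattern, text)}, "
          f"strict={accepts(strict_pattern, text)}")
```

If trailing text after "members" should be allowed, the "$" in the variant could be swapped for a word boundary (\b); which one fits depends on the input.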

Upvotes: 0

heemayl

Reputation: 42007

You can also try this:

(\d|,)+ members?

At first, (\d|,)+ will match any decimal digit or , one or more times, then the regex will match a space, then member or members (? means the s can occur 0 or 1 time).
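
As a side note, the alternation inside the group can also be written as a character class. The sketch below compares the two forms on a few strings; the compact pattern and the test strings are my own illustration, not from the answer.

```python
import re

# Pattern from this answer.
original = re.compile(r"(\d|,)+ members?")

# Assumed-equivalent rewrite (not from the answer): a character class
# instead of a capturing group with alternation.
compact = re.compile(r"[\d,]+ members?")

tests = ["1 member", "10,000 members", ",,, members", "members"]

# True when both patterns agree (match or no match) on every test string.
same = all(
    (original.fullmatch(t) is None) == (compact.fullmatch(t) is None)
    for t in tests
)
print(same)
```

Note that both forms accept a run of commas with no digits at all, such as ",,, members".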

Upvotes: 1

Kevin

Reputation: 30151

Why not just this?

\d+(?:,\d+)* members?

If you prefer to verify the digits are in groups of three:

\d+(?:,\d{3})* members?

(edited to add ? after s per Fredrik in the comments)
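
A short sketch of how the two patterns differ and how the second could be used to pull counts out of free text. The example sentence is made up for illustration. Because the group is non-capturing, findall returns the whole match rather than just the group.

```python
import re

loose = re.compile(r"\d+(?:,\d+)* members?")
grouped = re.compile(r"\d+(?:,\d{3})* members?")

# Only the loose pattern accepts a comma group that is not three digits long.
odd = "1000,0 members"
loose_ok = loose.fullmatch(odd) is not None
grouped_ok = grouped.fullmatch(odd) is not None

# Made-up sentence, used only to show extraction with findall.
text = "Group A has 1,000 members and group B has 1 member."
found = grouped.findall(text)

# Strip the thousands separators to get plain integers.
counts = [int(m.split()[0].replace(",", "")) for m in found]
print(loose_ok, grouped_ok, found, counts)
```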

Upvotes: 2

Seamus

Reputation: 4819

This will match everything in the list:

\d+(,\d{3})* member(s)?

But it will also match: 1 members

Is that acceptable? If not, you could use:

1 member|\d+(,\d{3})* members
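
Another option, offered only as a sketch and not as part of this answer, is to keep the first pattern and check singular/plural agreement in Python after matching. The extra capturing group around the number and the agrees helper are assumptions added for this illustration.

```python
import re

# First pattern from this answer, adapted (assumption) so that the number
# is captured in group 1 and the optional plural "s" in group 2.
pattern = re.compile(r"(\d+(?:,\d{3})*) member(s)?")

def agrees(text):
    """Return True if text fits the pattern and the plural matches the count."""
    m = pattern.fullmatch(text)
    if m is None:
        return False
    count = int(m.group(1).replace(",", ""))
    is_plural = m.group(2) is not None
    # Singular only for exactly 1; plural for everything else.
    return is_plural == (count != 1)

for text in ["1 member", "1 members", "2 member", "1,000 members"]:
    print(text, agrees(text))
```

This keeps the regex short and moves the grammar rule into ordinary code, which may be easier to read than a longer alternation.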

Upvotes: 0
