Vishnu Murthy

Reputation: 51

Make sure regex does not match empty string - but with a few caveats

I need to solve the following problem, but a restriction makes it hard.

Problem: Match all non-empty strings over the alphabet {a, b, c} that contain at most one a.

Examples

a
abc
bbca
bbcabb

Non-examples

aa
bbaa

Caveat: You cannot use a lookahead or lookbehind.

What I have is this:

^[bc]*a?[bc]*$

but it also matches the empty string. Even a hint would help.

(And if it matters, I'm using Python.)
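A minimal reproduction of the issue in Python, as a sketch: the pattern is the one above, while the variable name `pattern` and the list of test strings are only there for illustration.

```python
import re

# The pattern from the question.
pattern = re.compile(r"^[bc]*a?[bc]*$")

# Every part of the pattern is optional, so the empty string matches too.
for s in ["a", "abc", "bbca", "bbcabb", "aa", "bbaa", ""]:
    print(repr(s), bool(pattern.match(s)))
```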

Upvotes: 0

Views: 494

Answers (4)

user557597

You have to positively match at least one character, using only the letters
a, b, and c, and you can't use assertions.

Here is what you do.

The regex: ^(?:[bc]*a[bc]*|[bc]+)$

The explanation:

 ^                        # Beginning of string
 (?:                      # Non-capturing group with two alternatives
      [bc]* a [bc]*       #   exactly one a, surrounded by any number of b's and c's
   |                      #   or
      [bc]+               #   no a at all, so there must be at least one b or c
 )                        # End of group
 $                        # End of string
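Since the question mentions Python, the commented layout above can be kept almost verbatim with `re.VERBOSE`. This is only a sketch: the names `at_most_one_a` and `is_valid` are made up for illustration.

```python
import re

# Same pattern as above; re.VERBOSE ignores whitespace and
# allows comments inside the pattern.
at_most_one_a = re.compile(r"""
    ^
    (?:
        [bc]* a [bc]*   # exactly one a
      |
        [bc]+           # no a, but at least one b or c
    )
    $
""", re.VERBOSE)

def is_valid(s):
    # Hypothetical helper: True if s is non-empty, uses only a/b/c,
    # and contains at most one a.
    return at_most_one_a.match(s) is not None

for s in ["a", "abc", "bbca", "bbcabb", "aa", "bbaa", ""]:
    print(repr(s), is_valid(s))
```

Note that in Python `$` also matches just before a trailing newline, so "a\n" would be accepted; using `re.fullmatch` instead of `match` avoids that if it matters.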

Upvotes: 0

bobble bubble

Reputation: 18535

As I understand your question, the only problem is that your current pattern matches the empty string. To prevent this, you can add a word boundary \b at the start, which requires at least one word character.

^\b[bc]*a?[bc]*$

See demo at regex101

Another option would be to alternate in a group: match either an a surrounded by any number of [bc], or one or more [bc], from start to end. That could look like: ^(?:[bc]*a[bc]*|[bc]+)$
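A quick way to check the word-boundary variant in Python, as a sketch (the name `with_boundary` and the test strings are illustrative). The pattern has to be a raw string, because in a normal Python string literal "\b" is a backspace character rather than a word boundary.

```python
import re

# Raw string: r"\b" is a word boundary, while "\b" would be a backspace.
with_boundary = re.compile(r"^\b[bc]*a?[bc]*$")

# In an empty string there is no word character, so \b cannot match
# at position 0 and the whole pattern fails.
for s in ["a", "abc", "bbca", "bbcabb", "aa", "bbaa", ""]:
    print(repr(s), bool(with_boundary.match(s)))
```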

Upvotes: 1

Jan

Reputation: 43169

You do not even need a regex here; you might as well use .count() and a list comprehension:

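data = "a,abc,bbca,bbcabb,aa,bbaa,something without the bespoken letter,ooo"

def filter_words(string, char):
    # Keep the comma-separated words that contain `char` at most once.
    # Named filter_words so the built-in filter() is not shadowed.
    return [word
            for word in string.split(",")
            if word.count(char) <= 1]

print(filter_words(data, 'a'))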

Yielding

['a', 'abc', 'bbca', 'bbcabb', 'something without the bespoken letter', 'ooo']
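The output above also keeps words with letters outside {a, b, c}, and an empty string would pass as well. If the restrictions from the question are wanted without a regex, one possible sketch is the following; the helper name `at_most_one_a` and its `alphabet` parameter are assumptions for illustration, not part of the original answer.

```python
def at_most_one_a(word, alphabet="abc"):
    # Hypothetical helper: the word must be non-empty, use only letters
    # from `alphabet`, and contain "a" at most once.
    return bool(word) and set(word) <= set(alphabet) and word.count("a") <= 1

words = ["a", "abc", "bbca", "bbcabb", "aa", "bbaa", "ooo", ""]
print([w for w in words if at_most_one_a(w)])
# ['a', 'abc', 'bbca', 'bbcabb']
```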

Upvotes: 0

K.Dᴀᴠɪs

Reputation: 10139

The way I understood the question, any letter of the alphabet may appear in the string, but a may appear at most once.

Match on all non-empty strings over the alphabet... at most one a

^[b-z]*a?[b-z]*$

If spaces can be included:

^([b-z]*\s?)*a?([b-z]*\s?)*$
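One thing to be aware of: like the pattern in the question, the first pattern here still accepts the empty string, because every part of it is optional. The sketch below checks that in Python and shows one possible adaptation that borrows the alternation idea from the other answers. The names `any_letter` and `non_empty` and the test strings are illustrative, and the second pattern is an assumption, not part of this answer.

```python
import re

# First pattern from this answer: lowercase letters, at most one a.
any_letter = re.compile(r"^[b-z]*a?[b-z]*$")

# Assumed adaptation: either exactly one a, or at least one other
# letter, so the empty string no longer matches.
non_empty = re.compile(r"^(?:[b-z]*a[b-z]*|[b-z]+)$")

for s in ["hello", "cat", "banana", ""]:
    print(repr(s), bool(any_letter.match(s)), bool(non_empty.match(s)))
```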

Upvotes: 0
