anon
anon

Reputation:

Is it possible to simplify this regular expression any further?

I'm working on some homework for my compiler class and I have the following problem:

Write a regular expression for all strings of a's and b's that contain an odd number of a's or an odd number of b's (or both).

After a lot of whiteboard work I came up with the following solution:

(aa|bb)* (ab|ba|a|b) ((aa|bb)* (ab|ba) (aa|bb)* (ab|ba) (aa|bb)*)*

However, Is this is the most simplified I can get it? I've considered constructing the DFA trying to minimize the number of states there to see if it would help me simplify but I figured I would ask the regex gurus on SO first.

Upvotes: 6

Views: 1614

Answers (4)

sepp2k
sepp2k

Reputation: 370172

This should work:

b* a b* (a b* a b*)* |  a* b a* (b a* b a*)*

Upvotes: 5

Brad Gilbert
Brad Gilbert

Reputation: 34120

I think you need to approach the problem differently.

You are trying to match anything that doesn't have even numbers of both a and b.

It would probably be easier to start with something that matches even numbers of a and b. All you have to do at that point would be to add something on the end that matches the smallest string that you actually want to match.

Upvotes: 0

Greg D
Greg D

Reputation: 44076

I'm afraid that I don't believe your regex as written is correct. Consider the string:

aba

We have a couple choices for matches, but the fact that it's odd-length means we must match a lone a at the front, so:

(a)(ba)

But, sadly, it's impossible for your second main grouping there to match (ba).

When dealing with a constraint like this, I found it easier to start from the core constraint and go from there. In this case, your constraint is "odd," so start with

a(aa)*

to force an odd number of a's and go from there. :)

Upvotes: 2

Walt W
Walt W

Reputation: 3349

Take Greg D's recommendation of starting with a(aa)*, and going from there. Sepp2k almost has it right, but the real consideration is that you don't care about the other letter. What I mean is, when you are looking at the "odd number of a's" constraint, you don't care at all about what b's are in your string. Thus, stick b*'s anywhere you can :)

Sepp2k's answer is almost right, but this one is correct:

b* a b* (a b* a b* )* | a* b a* (b a* b a* )*

To elaborate, this regex figures out all strings with an odd number of a's (first section), and OR's those strings with any strings containing an odd number of b's.

Upvotes: 8

Related Questions