raywrigh
raywrigh

Reputation: 11

Construct a grammar over {a,b} whose language is

Construct a grammar over {a,b} whose language is

{a^m b^n | 0 <= n <= m <= 3n}

I am not sure how to go about this problem, I started out by doing

n >= 0
m >= n
3n >= m

S -> a S b | a S bb

Upvotes: 1

Views: 1086

Answers (2)

Patrick87
Patrick87

Reputation: 28292

I find it's helpful to start with the shortest strings in the language, and then to try to generalize from there. What is the shortest string in this language? We can choose n = 0 and m = 0, since 0 <= 0 <= 0 <= 3x0, so the empty string is in our language. Because the empty string is in the language, we must have a production in our grammar like S := e.

Now, how do we get longer strings in the language? We could add some productions just for adding a to our strings, or just for adding b; however, such rules could not be used to extend our base case (S := e) since adding a only, or b only, to the empty string will not get a string in the language. Those productions might get us to a string in the language, but they won't keep us on such a path in an obvious or simple way. We want simple productions in our grammar so we can be confident it's correct.

The alternative to adding a and b separately is to add them together. Note that when you have languages where parts of the strings are dependent on each other, you will typically (always except in the case of apparent but not actual dependencies) find that the productions must relate those dependent parts; otherwise, the dependency can be "forgotten" during the derivation of a string. We hypothesize that our productions should only ever add a and b together and proceed on that hypothesis.

First, we might guess that the production S := aSb should be included in our grammar. Why might we guess this? Well, based on our hypothesis, we know we need to add a and b together, which we do here. Also, since we want a rule to produce strings of general length, we understand that the production must involve a non-terminal, of which we currently have just the one (recall also that we are endeavoring to build strings on our base case, which is produced from S; so using this nonterminal is natural). Finally, we know that all a must be to the left of all b, and this is the only ordering of three symbols that produces strings according to that pattern.

What strings does this production allow us to generate? We can get S := e, S := aSb := ab, S := aSb := aaSbb := aabb, …, S := … := (a^n)(b^n). We observe that these strings are in the language - since 0 <= n <= n <= 3n - but that there are strings that this production by itself misses. In such cases, the production is fine and should be kept; however, that we missed some strings indicates we need to find other productions to cover those cases.

Which cases did we miss? We missed the case where m is strictly greater than n. In the production we guessed, we used the same number of a and b, so we end up with strings that have the same number. It stands to reason that if we want strings with more a than b, then we need to have productions that introduce more a than b. By our hypothesis, we still require that we introduce at least one of both; and we already introduced one with one of each. The next logical production to guess is S := aaSb. What strings can we generate now? If we ignore the production S := aSb, this new production allows us to generate strings of the form (a^2n)(b^n) of strings in our language; but what happens when we consider all three productions?

Consider the string (a^k)(b^n) where n <= k <= 2n. If k = n then we can use the production S := aSb n times to get the string; likewise, if k = 2n, we can use the production S := aaSb n times. What if n < k < 2n? Suppose that k = n + j, where 0 < j < n. In that case, we can use the production S := aaSb exactly j times, and the production S := aSb exactly n - j times, to get 2j + n - j = n + j = k instances a and n instances of b. Therefore, these three rules together allow us to generate all strings where the number of a is between the number of b and twice the number of b, inclusive on both ends.

We still cannot generate strings where the number of a is more than twice the number of b; however, based on our above successes, we can guess S := aaaSb and use similar reasoning to show that these four productions together give us exactly the language we set out to generate. The grammar we have arrived at is the following:

S := e
S := aSb
S := aaSb
S := aaaSb

There are lots of other grammars for this language, all correct, and many arrived at using completely other methods than this. Do not think of problems like these as applying a formula to get a predetermined answer. Think of problems like this as what they are: programming problems. You are given the spec, and as long as your stuff works, that's what counts.

Upvotes: 1

rici
rici

Reputation: 241671

That's very close. There are a few problems:

  • You need a base case. Since n could be zero, ε (the empty string) is in the language. So that should tell you where to start.

  • You seem to think that there are more bs than as. But the number of bs (n) is less than or equal to the number of as (m). There cannot be more bs. Instead of adding one or two bs for each a, you need to add one or two as for each b. (But see below.)

  • You only handle the cases where an additional a is paired with one or two bs, which as mentioned above should be reversed so that an additional b is paired with some as. But the language description says m≤3*n*, not m≤2*n*; an additional b could be paired with up to three as.

Upvotes: 1

Related Questions