Reputation: 3209
I read this in a book on computability:
(Kleene's Theorem) A language is regular if and only if it can be obtained from finite languages by applying the three operations union, concatenation, repetition a finite number of times.
I am struggling with "finite languages".
Consider this language: L = a*
It is not finite. It is the set {0, a, aa, aaa, ...}
which is clearly an infinite set (0
= the empty string).
So it is an infinite language, right? That is, "infinite set" means "infinite language", right?
Clearly, a*
is a regular language. And it is an infinite language. Thus, by Kleene's Theorem it cannot be a regular language. Contradiction.
I'm confused. I guess that I don't know what "finite language" means.
Upvotes: 12
Views: 35111
Reputation: 129
an infinite language means a set having infinite equivalence classes. However the a* language has only one equivalence class thus making it a finite language.
Upvotes: -2
Reputation: 25034
A finite language is a language containing a finite number of words. The simplest cases are those containing no words at all, the empty string, and a single string consisting of a single symbol (e.g. a
in your example).
I think your confusion comes from misreading the rule you quote (as are some of those commenting on the question).
(Kleene's Theorem) A language is regular if and only if it can be obtained from finite languages by applying the three operations union, concatenation, repetition a finite number of times.
The passage is not talking about the number of operations on strings needed to create all the strings in a language, but about the number of operations on simpler languages needed to define the language being defined. The language you mention is built by starting with a finite language (the set {"a"}) and applying the repetition operator once.
A different and less direct way of putting the point would appeal not to languages and operations on languages, but to expressions denoting languages and more complex expressions combining them: a language is regular if and only if it can be denoted by a regular expression containing a finite number of operators.
Take an expression like a
, denoting the finite language containing just the single word "a". We can add a single repetition operator to that expression, and we get a*
, the infinite language containing every concatenation of zero or more words from the first language. Every finite expression E we can build by starting from expressions denoting finite languages and by combining one or two smaller expressions F and G using the patterns E = F | G, E = FG, or E = F* will denote a regular language. Expressions denoting finite languages (languages with a finite number of words) are the base case when the rule is stated in terms of expressions; finite languages are the base case when the rule is stated directly in terms of languages, without any detours to expression-land.
If we allow union, concatenation, and repetition to be applied infinitely many times (or equivalently if we allow infinite expressions using the rules for regular exressions), the resulting language is not guaranteed regular. This is the analogue at the regular-language level of the observation that infinitely large context-free grammars can define non-context-free languages, as illustrated by van Wijngaarden grammars.
Upvotes: 3
Reputation: 10452
Very shortly, your statement is saying:
A language is regular if and only if there is a regular expression for it.
The "can be obtained from finite languages by applying the three operations union, concatenation, repetition a finite number of times" part is essentially a quick verbal definition of a regular expression. Usually, a regular expression (RE) is formally defined starting with the following base cases:
These are all finite languages. We then obtain other REs by applying the following three recursive rules a finite number of times:
In the end, you can create infinite languages using finite descriptions (a regular expression).
Upvotes: 4
Reputation: 260
You are on the right track, it could be clearer. Kleen's theorem expresses the equivalence of three statements
A language is regular == A language can be expressed by a Regular Expression == A language can be expressed by a finite automata.
Your example is indeed a regular language. A finite language is what you would expect it to be, a language that can be listed in a finite amount of time.
When they are talking about repetition, they are talking about the Kleen Star operation, which is exactly what a*
represents, the set {empty, a, aa, aaa, aaaa, ...}
EDIT:
I have found this link: Kleenes Theorem which helps quite a bit. It by 'repetition' they mean Kleen Star, then the original statement makes sense. a*
is Kleen_Star(a)
Upvotes: 5