Facundo Ch.
Facundo Ch.

Reputation: 12234

Algorithm to generate a Turing Machine from a Regular Expression

I'm developing a software to generate a Turing Machine from a regular expression. In other words, I want to take a regular expression as input, and programmatically generate a Turing Machine to perform the same task.

This is what I have done:

I've modelled the regular expression as follows:

When it comes to Turing machine (TM) generation, I have different levels of complexity:

Simple: this type of expression is already working. Generates a new state for each letter and moves right. If in any state, the letter read is not the expected, it starts a "rollback circuit" that finishes with the TM head in the initial position and stops in a not final state.

ComplexWithoutOr: here comes my problem. The algorithm generates a TM for each subexpression and concatenates them. This works for some simple cases, but I have problems with the rollback mechanism.

Problem description

Here is an example input for which my algorithm fails:

(ab)*abac: this is a ComplexWithoutOr expression that contains a ComplexWithOr expression (ab)* (that has a Simple expression inside: ab) and a Simple expression abac

My algorithm generates first an TM1 for ab. This TM1 is used by TM2 for (ab)*, so if TM1 succeeds, it enters again in TM1, otherwise TM1 rolls back and TM2 finishes with success. In other words, TM2 cannot fail.

Then, it generates a TM3 for abac. The output of TM2 is the input of TM3. The output of TM3 is the result of the execution.

Now, suppose this input string: abac. As you can see it matches with the regular expression. So let's see what happens when the TM is executed.

TM1 executes with success the first time: ab. TM1 fails the second time for ac and rolls back, putting the TM head in the 3rd position at a. TM2 finishes with success and the input is forwarded to TM3. TM3 fails, because ac doesn't match abac. So the TM does not recognize abac.

This is a problem. How to solve this?

I'm using Java to develop it, but the language it is not important. My question concerns the algorithm.

Upvotes: 3

Views: 4982

Answers (3)

Confusion
Confusion

Reputation: 16861

Michael Sipser, in Introduction to the Theory of Computation, proves in chapter 1 that regular expressions are equivalent to finite automata in their descriptive power. Part of the proof involves constructing a nondeterministic finite automaton (NDFA) that recognizes the language described by a specific regular expression. I'm not about to copy half that chapter, which would be quite hard due to the notation used, so I suggest you borrow or purchase the book (or perhaps a Google search using these terms will turn up a similar proof) and use that proof as the basis for your algorithm.

As Turing machines can simulate an NDFA, I assume an algorithm to produce an NDFA is good enough.

Upvotes: 1

MAK
MAK

Reputation: 26586

It is not entirely clear to me what exactly you are trying to implement. It looks like you want to make a Turing Machine (or any FSM in general) that accepts only those strings that are also accepted by the regular expression. In effect, you want to convert a regular expression to a FSM.

Actually that is exactly what a real regex matcher does under the hood. I think this series of articles by Russ Cox covers a lot of what you want to do.

Upvotes: 2

guest
guest

Reputation: 435

in the chomsky hierarchy a regex is Level3, whereas a TM is Level1. this means, that a TM can produce any regex, but not vice versa.

Upvotes: -1

Related Questions