Reputation: 373
I am trying to find an algorithm to extract a language from a DFA. If that language is infinite then I only need to report up to : maxcount
<= 1000 strings. there are no restrictions on what order the are reported.
I have tried:(I don't know how to write algorithms so I just explain in words what I did)
a recursive algorithm - for each transition from one state to the next state/states save the transition character and then make a recursive call for each transition,if that transition was to an accepting state then report .if we have reached a "dead end" then there is noting to recurse on.
suppose the start state is state 1
and there are two transitions to: state 2
and state 3
with transition characters 'a'
,'b'
respectively. Then there is a transition from state 3
to state 4
with transition character 'c'
, only state 4
is an accepting state.
going from state 1
to state 2
will give save 'a'
as transition character , but since state 2
is a dead end there will be noting to recurse on and since it is not an accepting state then there is noting to report.
Going from state 1
to state 3
will save 'b'
and then go from state 3
to state 4
will save 'c'
so that we have a total of "bc"
and since state 4
does not have any transitions the recursion ends. We report "bc"
because state 4
is an accepting state
If I have DFA with a loop - then in order to not get "stuck" in that loop I have made sure that "if" there is a another possible transition then we will always pick that transition and not the transition we made last time/times we where at that state ( a kind of memory for which transition we made at which states )
The algorithm works for small DFAs but it will give a Stack overflow on bigger DFA:s (imagine 20 transitions from state 1
to state 2
and 30 transitions from state 2
to state 3
and so on ).
Can anybody recommend more efficient algorithm?
Upvotes: 2
Views: 1276
Reputation: 46990
If you use a BFS, then cycles don't matter. The idea is to define search nodes that contain the current state and a pointer to the predecessor node. Therefore when you visit a node for an accepting state, you can trace the previous pointers backward to determine the accepted string (in reverse). It turns out that its elegant if a search node also contains the character that caused the transition from the previous node's state to the current one.
Here's an approach in Java:
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
class Experimental {
// DFA state with its transitions, possibly accepting.
static class State {
final Map<Character, State> transitions = new HashMap<>();
final boolean accept;
State(boolean accept) {
this.accept = accept;
}
}
// A little DFA.
static final State s0 = new State(false);
static final State s1 = new State(false);
static final State s2 = new State(true);
static final State s3 = new State(true);
static {
s0.transitions.put('a', s1);
s0.transitions.put('b', s2);
s0.transitions.put('c', s3);
s1.transitions.put('d', s3);
s2.transitions.put('e', s0);
s2.transitions.put('f', s1);
}
// An enumerator of strings accepted by the DFA in order of length.
static class Enumerator {
static class Node {
final Node prev;
final char prevCh;
final State state;
Node(State start) {
this(null, Character.MIN_VALUE, start);
}
Node(Node prev, char ch, State state) {
this.prev = prev;
this.prevCh = ch;
this.state = state;
}
}
final Deque<Node> queue = new ArrayDeque<>();
final List<String> output = new ArrayList<>();
final State start;
Enumerator(State start) {
this.start = start;
}
Enumerator enumerate(int outputLimit) {
queue.clear();
output.clear();
// Enqueue a search node for the start state.
queue.add(new Node(start));
while (!queue.isEmpty() && output.size() < outputLimit) {
Node current = queue.pollFirst();
if (current.state.accept) {
// Follow prev pointers to build the accepted string.
StringBuilder sb = new StringBuilder();
for (Node p = current; p.prev != null; p = p.prev) {
sb.append(p.prevCh);
}
output.add(sb.reverse().toString());
}
// Enqueue nodes for the successors of current state.
for (Entry<Character, State> transition : current.state.transitions.entrySet()) {
queue.addLast(new Node(current, transition.getKey(), transition.getValue()));
}
}
return this;
}
}
public static void main(String[] args) {
System.out.println(new Enumerator(s0).enumerate(20).output);
}
}
Output:
[b, c, ad, beb, bec, bfd, bead, bebeb, bebec, bebfd, bebead, bebebeb, bebebec, bebebfd, bebebead, bebebebeb, bebebebec, bebebebfd, bebebebead, bebebebebeb]
Upvotes: 1
Reputation: 241891
I'd do this with a breadth-first search over the DFA, which will produce strings ordered by length.
Define an object consisting of a state and a string (there is a more memory-efficient solution here, but I think that if you only need to produce 1000 strings, this will be fine). Then create a work-queue of such objects. Initialize the workqueue with a single object whose state is the start state and whose string is empty.
Now repeat the following three steps until you have found maxcount
strings or the queue becomes empty:
Remove the first object in the queue.
If its state is an accepting state, output its string.
For each possible out transition (which consists of a character and a new state), add to the end of the queue a new object with the transition's state and the concatenation of the object's string with the transition's character.
Upvotes: 1
Reputation: 60808
Alternately:
Upvotes: 1
Reputation: 37741
Note: I am guessing you can model this problem as finding all paths between start state and all accepting states. So, i believe this can be accomplished with a depth-first search of the graph. The depth-first search will find all non-cyclical paths between two nodes.
If you are familiar with Depth First Search (DFS) algorithm which is a graph search/traversal algorithm can be helpful in your problem. Let me give you a very simple example.
Given a directed graph, a source vertex ‘s’ and a destination vertex ‘d’, print all paths from given ‘s’ to ‘d’. Consider the following directed graph. Let the s be 2 and d be 3. There are 4 different paths from 2 to 3.
The idea is to do Depth First Traversal
of given directed graph. Start the traversal from source. Keep storing the visited vertices in an array say path[]
. If we reach the destination vertex, print contents of path[]
. The important thing is to mark current vertices in path[]
as visited also, so that the traversal doesn’t go in a cycle.
Sample Code: You can find a very simple code in Java
here to find all paths between two nodes in a graph.
Upvotes: 0