Reputation: 2297
So if I have simple regex such as:
"g{1,3}(a|e|i|o|u)"
I want my program to generate strings of
ga
ge
gi
go
gu
gga
gge
ggi
ggo
ggu
ggga
ggge
gggi
gggo
gggu
I would not use "g*(a|e|i|o|u)" for regex, as there can be infinite number of 'g's and there will be infinite number of strings.
Any recommendation on simple efficient algorithm to make this? I think I will be able to make these strings in a brute force way by using for/while loops, but I'm wondering if there is any methods I could use to make this algorithm work.
I googled how to create strings from regex and many people seemed to redirect to: https://code.google.com/p/xeger/ to use the library that is built, but I was wondering if I could get some suggestions to make my own for these simple regex.
Upvotes: 1
Views: 212
Reputation: 2759
I created Debuggex, which generates random strings to give you an idea of what a regex does.
If you already have a parse tree for your regex, you can use the following logic to generate random matches:
OrTree.random:
Choose a child randomly, return its random()
ConcatTree.random:
For every child, call random()
Return the concatenation of all the results
RepeatTree.random:
Choose a valid random number of repetitions within min and max
Call random() on your child that many times
Return the concatenation of all the results
Literal.random:
Return the literal
You can generate random strings even if you use the *
operator. This is done by choosing a distribution from 0 to infinity from which to generate numbers, just like you use a uniform distribution for finite sets. For example:
InfiniteRepeatTree.random:
Flip a coin until I get tails
Call random on child() the number of times that the coin landed heads
Return concatenation of the results
Hope that helps :)
Upvotes: 1
Reputation: 1373
char[] vowels = new char[] {'a','e','i','o','u'};
for (int i = 1; i <= 3; i++) {
for (int j = 0; j < vowels.length; j++) {
for (int k = 0; k < i; k++) {
System.out.print("g");
}
System.out.println(vowels[j]);
}
}
Not generic solution, but it works for your specific example
Upvotes: 0
Reputation: 2653
Xeger is open source. You could browse their code base for ideas.
EDIT:
Their code base looks very small, so shouldn't be too hard. It only generates random strings that will match, not all strings. It could still be a good starting point though.
Upvotes: 1