jasoares

Reputation: 1821

Match a word and any number of its leading characters

Is there a simpler way to write the following regular expression, specifically one that avoids all the nested groups with the '?' quantifier?

/^w(o(r(d)?)?)?$/

It should match the following: w, wo, wor, word

and should not match anything else.

In this particular case it's a very short word, but the next example shows how quickly things become ugly.

Regex to match vertical or horizontal, or any number of leading characters of either word:

/^(h(o(r(i(z(o(n(t(a(l)?)?)?)?)?)?)?)?)?|v(e(r(t(i(c(a(l)?)?)?)?)?)?)?)$/

I'm using Ruby, but I think this question applies to any language that uses regular expressions, so answers in any language are welcome. I don't know much about Perl, though...

I found only one question similar to mine, but it doesn't show any better solution. Anyway, here is the link.

Upvotes: 0

Views: 381

Answers (3)

Bohemian

Reputation: 425003

You could simplify it with an OR expression:

/^(w|wo|wor|word)$/

or reverse the test by making a regex from the input text, anchored at the start and with any special characters in the input escaped (in pseudocode):

"word" matches /"^" + input + ".*"/
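Both ideas above can be sketched in Python. This is only an illustration under a few assumptions: the helper names `prefix_alternation` and `is_prefix_via_regex` are made up for this example, an empty input is treated as a non-match, and `re.fullmatch` is used in place of explicit `^` and `$` anchors.

```python
import re

def prefix_alternation(word):
    """Build an alternation of every non-empty prefix of `word`."""
    prefixes = [re.escape(word[:i]) for i in range(1, len(word) + 1)]
    return "(?:" + "|".join(prefixes) + ")"

def is_prefix_via_regex(text, word):
    """Reversed test: turn the input into a regex and match it against `word`."""
    if not text:
        return False  # assumption: empty input should not count as a match
    # re.match only matches at the start of `word`; re.escape keeps the input literal
    return re.match(re.escape(text), word) is not None

print(prefix_alternation("word"))                             # (?:w|wo|wor|word)
print(bool(re.fullmatch(prefix_alternation("word"), "wo")))   # True
print(is_prefix_via_regex("wor", "word"))                     # True
print(is_prefix_via_regex("ord", "word"))                     # False
```

The alternation grows quadratically with the word length, so the reversed test is likely the more practical of the two for long words.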

Upvotes: 3

inhan

Reputation: 7470

Although the result is ugly and hard to read, I would write a function that generates the regex for each word. In PHP, for example, I would formulate it like this:

function rx_from_word($word = '', $escapeNeeded = true) {
    $rx = '';
    // Build from the last character backwards so each suffix nests inside an optional group.
    for ($i = strlen($word) - 1; $i >= 0; $i--) {
        // preg_quote() escapes every regex metacharacter, plus the '/' delimiter.
        $char = $escapeNeeded ? preg_quote($word[$i], '/') : $word[$i];
        // Every character except the first is optional.
        $rx = ($i > 0) ? '(' . $char . $rx . ')?' : $char . $rx;
    }
    return $rx;
}

function rx_from_words($words = array(), $matchFull = false) {
    $parts = array();
    foreach ($words as $word) $parts[] = rx_from_word($word);
    $rx = implode('|', $parts);
    // Group the alternatives so that ^ and $ apply to all of them, not just the first and last.
    return $matchFull ? '^(?:' . $rx . ')$' : $rx;
}

$words = array('horizontal', 'vertical', '$10');
$rx = rx_from_words($words, true);
echo "<pre>$rx</pre>";

which would output

^(?:h(o(r(i(z(o(n(t(a(l)?)?)?)?)?)?)?)?)?|v(e(r(t(i(c(a(l)?)?)?)?)?)?)?|\$(1(0)?)?)$
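For readers who would rather follow along in Python, here is a rough sketch of the same generator. It is not a line-for-line port: the function names `nested_optional` and `prefix_regex` are hypothetical, non-capturing groups are used, and `fullmatch` takes the place of the `^` and `$` anchors.

```python
import re

def nested_optional(word):
    """Return e.g. 'w(?:o(?:r(?:d)?)?)?' for 'word'."""
    pattern = ""
    # Walk backwards so each remaining suffix is wrapped in an optional group.
    for i in range(len(word) - 1, -1, -1):
        char = re.escape(word[i])
        pattern = f"(?:{char}{pattern})?" if i > 0 else char + pattern
    return pattern

def prefix_regex(words):
    """Compile one pattern for all words; call .fullmatch() to anchor both ends."""
    return re.compile("|".join(nested_optional(w) for w in words))

rx = prefix_regex(["horizontal", "vertical", "$10"])
print(nested_optional("word"))      # w(?:o(?:r(?:d)?)?)?
print(bool(rx.fullmatch("hori")))   # True
print(bool(rx.fullmatch("$1")))     # True
print(bool(rx.fullmatch("hrz")))    # False
```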

Upvotes: 0

lxop

Reputation: 8595

What if you did it a different way? For example (I'm not familiar with Ruby, so I'll use Python):

s = "hor"

# str.startswith returns a bool, so h and v are always defined
h = "horizontal".startswith(s)
v = "vertical".startswith(s)

Or something along those lines.
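The same idea extends to any number of candidate words. A minimal sketch, assuming a hypothetical helper `matching_words` and assuming that an empty input should match nothing (`str.startswith("")` is always true, so it needs an explicit check):

```python
def matching_words(s, words):
    """Return the candidate words that begin with `s`."""
    if not s:
        return []  # assumption: an empty input should match nothing
    return [w for w in words if w.startswith(s)]

print(matching_words("hor", ["horizontal", "vertical"]))  # ['horizontal']
print(matching_words("ver", ["horizontal", "vertical"]))  # ['vertical']
print(matching_words("x", ["horizontal", "vertical"]))    # []
```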

Upvotes: 1
