Chris

Reputation: 1505

Use regex to filter out words that don't meet a criterion

I have a text of anywhere from 0 to 10,000+ words in a single string. I also have a second string, input. How can I remove all words in the text that do not begin with input?

Ex:

"This is a string containing thirty-trillion thirsty thespians."

input = "th"

I'd like "This thirty thirsty thespians" returned. I have little knowledge of regex so I'm not sure how to approach this.

Upvotes: 0

Views: 357

Answers (1)

mr-

Reputation: 320

Here is a Perl solution; I hope it is of some help.

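use strict;
use warnings;

my $beginning = "th";
my $s = "This is a string containing thirty-trillion thirsty thespians.";
# In list context, a /g match returns every captured group.
my @results = $s =~ /\b($beginning\w*)/ig;
print join(" ", @results), "\n";  # join with spaces; a bare "print for" would run the words together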

and it will print

This thirty thirsty thespians

The regular expression works as follows:
It starts its match at a word boundary, \b. ($beginning\w*) then matches $beginning followed by zero or more word characters (letters, digits, and underscore). Because a hyphen is not a word character, thirty-trillion yields just thirty. The parentheses form a capture group, so the match returns whatever was matched inside them.

The i modifier makes the match case-insensitive, which is why This is included. The g modifier makes the match go through the whole string, and in list context it returns everything that matched as a list (@results here).
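
If input comes from a user, it may contain characters that mean something in a regex. Here is a minimal sketch of the same idea wrapped in a sub, with the prefix quoted by \Q...\E so it is matched literally. The name words_starting_with is my own invention for illustration, and the sketch assumes the prefix begins with a word character (otherwise \b will not line up the way you expect). The last two lines show what happens when the i or the g modifier is dropped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): returns every word
# in $text that starts with $prefix, ignoring case.
sub words_starting_with {
    my ($prefix, $text) = @_;
    # \Q...\E quotes regex metacharacters in $prefix, so input such as
    # "a.b" is matched literally instead of being read as a pattern.
    return $text =~ /\b(\Q$prefix\E\w*)/ig;
}

my $s = "This is a string containing thirty-trillion thirsty thespians.";

my @words = words_starting_with("th", $s);
print join(" ", @words), "\n";           # This thirty thirsty thespians

# Without i the match is case-sensitive, so "This" is skipped.
my @case_sensitive = $s =~ /\b(th\w*)/g;
print join(" ", @case_sensitive), "\n";  # thirty thirsty thespians

# Without g only the first match is returned.
my @first_only = $s =~ /\b(th\w*)/i;
print join(" ", @first_only), "\n";      # This
```

Call the sub in list context (assign to an array, as above); in scalar context a match only tells you whether it succeeded.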

Upvotes: 2
