Walker

Reputation: 5189

VB.Net Regular Expressions - Extracting Wildcard Value

I need help extracting the value matched by a wildcard in a regular expression match. For example:

Regex: "I like *"

Input: "I like chocolate"

I would like to be able to extract the string "chocolate" (or whatever else is there) from the Regex match. If possible, I also want to be able to retrieve several wildcard values from a single match. For example:

Regex: "I play the * and the *"

Input: "I play the guitar and the bass"

I want to be able to extract both "guitar" and "bass". Is there a way to do it?

Upvotes: 1

Views: 6227

Answers (2)

Gaijinhunter

Reputation: 14685

Here is my RegexExtract function in VBA. It returns just the submatch you specify (only the part in parentheses). So in your case, you'd write:

 =RegexExtract(A1, "I like (.*)")

Here is the code.

Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String) As String

    Dim allMatches As Object
    Dim RE As Object
    ' Late-bound VBScript regex engine; no project reference is required.
    Set RE = CreateObject("vbscript.regexp")

    RE.Pattern = extract_what
    RE.Global = True
    Set allMatches = RE.Execute(text)

    ' Return the first capture group of the first match, or "" if nothing matched.
    If allMatches.Count > 0 Then
        RegexExtract = allMatches.Item(0).SubMatches.Item(0)
    End If

End Function

Here is a version that lets you use multiple groups to extract several parts at once. Note that it concatenates the captured parts with no separator:

Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String) As String

    Dim allMatches As Object
    Dim RE As Object
    Dim i As Long
    Dim result As String

    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = extract_what
    RE.Global = True
    Set allMatches = RE.Execute(text)

    ' Concatenate every capture group of the first match; "" if nothing matched.
    If allMatches.Count > 0 Then
        For i = 0 To allMatches.Item(0).SubMatches.Count - 1
            result = result & allMatches.Item(0).SubMatches.Item(i)
        Next
    End If

    RegexExtract = result

End Function
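
Since the question asks about VB.Net rather than VBA, here is a rough VB.NET counterpart of the multi-group version, offered as a sketch rather than a drop-in replacement. The module name, the optional separator parameter, and the choice to return an empty string when nothing matches are my own assumptions, not part of the code above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexExtractSketch
    ' Hypothetical VB.NET port: joins every capture group of the first match.
    ' Returns "" when the pattern does not match (an assumption of this sketch).
    Function RegexExtract(text As String, pattern As String,
                          Optional separator As String = ", ") As String
        Dim m As Match = Regex.Match(text, pattern)
        If Not m.Success Then Return String.Empty

        Dim parts As New List(Of String)
        ' Groups(0) is the whole match, so start at 1.
        For i As Integer = 1 To m.Groups.Count - 1
            parts.Add(m.Groups(i).Value)
        Next
        Return String.Join(separator, parts)
    End Function

    Sub Main()
        ' prints: guitar, bass
        Console.WriteLine(RegexExtract("I play the guitar and the bass",
                                       "I play the (\w+) and the (\w+)"))
    End Sub
End Module
```

Using a separator keeps the captured values distinguishable, which plain concatenation does not.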

Upvotes: 0

rerun

Reputation: 25495

In general, regular expressions use the concept of groups. Groups are indicated by parentheses.

So I like * would be I like (.*)
. = any character; * = zero or more of the preceding character

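Imports System
Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        Dim s As String = "I Like hats"
        Dim rxstr As String = "I Like(.*)"
        Dim m As Match = Regex.Match(s, rxstr)
        ' Groups(0) is the whole match; Groups(1) is the first parenthesized group.
        Console.WriteLine(m.Groups(1))
    End Sub
End Module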

The above code will work for any string that contains I Like and will print all the characters after it, including the leading ' ', because . matches even whitespace.
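
If you do not want the leading space, and want to handle input that does not match at all, something like the following sketch may help. The helper name AfterILike and the decision to return Nothing on failure are assumptions made for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupSketch
    ' Hypothetical helper: returns the text after "I Like ",
    ' or Nothing when the input does not match.
    Function AfterILike(input As String) As String
        ' The space sits outside the parentheses, so it is not captured.
        Dim m As Match = Regex.Match(input, "I Like (.*)")
        If Not m.Success Then Return Nothing
        Return m.Groups(1).Value
    End Function

    Sub Main()
        Console.WriteLine("[{0}]", AfterILike("I Like hats"))      ' prints [hats]
        Console.WriteLine(AfterILike("You like hats") Is Nothing)   ' prints True
    End Sub
End Module
```

Checking m.Success first avoids silently treating an empty group as a real value.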

Your second case is more interesting: because the first regex will match everything up to the end of the string, you need something more restrictive.

I Like (\w+) and (\w+) : this matches I Like, then a space and one or more word characters, then " and ", then one or more word characters.

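Imports System
Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        Dim s2 As String = "I Like hats and dogs"
        Dim rxstr2 As String = "I Like (\w+) and (\w+)"
        Dim m As Match = Regex.Match(s2, rxstr2)
        ' Groups(1) and Groups(2) hold the two captured words.
        Console.WriteLine("{0} : {1}", m.Groups(1), m.Groups(2))
    End Sub
End Module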
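
Note that \w+ stops at whitespace, so it will not capture a multi-word value. If you want to keep writing patterns with * as in the question, one possible approach is to translate the wildcard pattern into a regex yourself. The sketch below is a rough illustration of that idea: ExtractWildcards is a hypothetical helper, and the use of anchors and lazy (.*?) groups is my assumption about the behaviour you want.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module WildcardSketch
    ' Hypothetical helper: returns the text matched by each * in wildcardPattern,
    ' or an empty list if the input does not match the whole pattern.
    Function ExtractWildcards(input As String, wildcardPattern As String) As List(Of String)
        ' Regex.Escape turns "*" into "\*"; swap each one for a lazy capture group.
        Dim rx As String = "^" & Regex.Escape(wildcardPattern).Replace("\*", "(.*?)") & "$"
        Dim m As Match = Regex.Match(input, rx)
        Dim values As New List(Of String)
        If m.Success Then
            ' Groups(0) is the whole match, so start at 1.
            For i As Integer = 1 To m.Groups.Count - 1
                values.Add(m.Groups(i).Value)
            Next
        End If
        Return values
    End Function

    Sub Main()
        ' prints: chocolate
        Console.WriteLine(String.Join(", ", ExtractWildcards("I like chocolate", "I like *")))
        ' prints: bass guitar, drums
        Console.WriteLine(String.Join(", ", ExtractWildcards("I play the bass guitar and the drums",
                                                             "I play the * and the *")))
    End Sub
End Module
```

Escaping the pattern first means characters such as . or ? in the wildcard text are matched literally. A pattern that already contains a literal backslash before * is not handled by this sketch.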

For a more complete treatment of regex, take a look at this site, which has a great tutorial.

Upvotes: 4
