Reputation: 14615
I am trying to use a regex in VB.NET (the language probably shouldn't matter, though) to extract something reasonable out of a very long file name: "\\path\path\path.path.path\path\some_more_stuff_from a name.item_123_456.html"
From that whole mess, I would like to extract "item_123_456".
It seems to make sense that I could get everything before a pattern like ".html" and, from that, everything after the last dot?
I have tried to get at least the last part (the entire string before .html) and I still get no matches:
Dim matches As MatchCollection
Dim regexStuff As New Regex(".*\\.html")
matches = regexStuff.Matches(strINeed)
Dim successfulMatch As Match
For Each successfulMatch In matches
strFound = successfulMatch.Value
Next
The other pattern I experimented with, hoping I might even get everything between a dot and .html:
Regex("\\..*\\.html")
returned Nothing as well.
I just can't get regular expressions to work...
Upvotes: 0
Views: 463
Reputation:
It could probably be generalized into this
[^.\\]+\.html
Edit: or, if the initial dot is required:
\.[^.\\]+\.html
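As a minimal sketch of how the second pattern could be used from VB.NET: the parentheses around the middle part are an addition that is not in the pattern above (they capture just the name, without the dots or .html), and the Module and variable names are only illustrative. VB.NET string literals do not treat the backslash as an escape character, so the pattern is typed exactly as shown.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractItemName
    Sub Main()
        ' Sample input copied from the question.
        Dim strINeed As String = "\\path\path\path.path.path\path\some_more_stuff_from a name.item_123_456.html"

        ' \.          a literal dot
        ' ([^.\\]+)   one or more characters that are neither a dot nor a backslash (captured)
        ' \.html      the literal extension
        Dim m As Match = Regex.Match(strINeed, "\.([^.\\]+)\.html")

        ' Only read the group if the pattern actually matched.
        If m.Success Then
            Console.WriteLine(m.Groups(1).Value)
        End If
    End Sub
End Module
```

With the file name from the question, this should print item_123_456.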
Upvotes: 1
Reputation: 2226
.*\.(.*?)\.html
The leading .* matches as many characters as possible, until it reaches a dot followed by as few characters as possible followed by .html, which is the \.(.*?)\.html part.
The text between .html and the dot preceding it is placed into a capturing group, which .NET exposes as Groups(1) on the match ($1 in replacement strings). If you need the VB.NET code for that I can likely get that as well, but your code looked okay.
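Your VB code should look something like this:
Dim regexStuff As New Regex(".*\.(.*?)\.html")
Dim firstMatch As Match = regexStuff.Match(strINeed)
' Groups(1) holds the text captured by the parentheses; check Success first
' so a non-matching input does not raise an exception.
If firstMatch.Success Then
    strFound = firstMatch.Groups(1).Value
End If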
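A likely reason the patterns in the question found nothing is the doubled backslash. VB.NET string literals have no backslash escapes, so "\\." reaches the regex engine as "a literal backslash followed by any character" rather than "a literal dot". The following sketch compares the two spellings on the sample file name; the Module name is illustrative only.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EscapeCheck
    Sub Main()
        ' Sample input copied from the question.
        Dim strINeed As String = "\\path\path\path.path.path\path\some_more_stuff_from a name.item_123_456.html"

        ' Pattern from the question: requires a backslash, then any one character, then "html".
        Console.WriteLine(Regex.IsMatch(strINeed, ".*\\.html"))

        ' Single backslash: requires a literal dot followed by "html".
        Console.WriteLine(Regex.IsMatch(strINeed, ".*\.html"))
    End Sub
End Module
```

On this input the first check should report False and the second True, which is consistent with the empty match collection described in the question.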
Upvotes: 1