ElektroStudios

Reputation: 20464

RegEx is not doing what I want

I want to retrieve part of a string. My question is whether I can get the correct string without post-processing the match (with Substring or Split). I know a split/substring is easy, but I want to know whether I can improve the RegEx so that it does the job itself.

This is the String:

<h3 class="btl"><a href="http://post-hardcore.ru/music/2689-drifter-in-search-of-something-more-ep-2013.html">Drifter - In Search of Something More [EP] (2013)</a></h3>

This is what I get:

>Drifter - In Search of Something More [EP] (2013)</a><

This is what I want to get:

Drifter - In Search of Something More [EP] (2013)

This is my RegEx:

Dim RegEx_AlbumName As New Regex(">[^<].*<")

MsgBox(RegEx_AlbumName.Match(Line).Groups(0).ToString)

I don't want to do this:

AlbumName = RegEx_AlbumName.Match(Line).Groups(0).ToString.Substring(1).Replace("</a><", "")

EDIT: Note that the parenthesized part "(2013)" may be absent from other strings that I need to match.

Upvotes: 0

Views: 59

Answers (1)

ilomambo

Reputation: 8350

Your regex was close. The [^<] applies only to the first character after the >. The .* then eats up every character until the last <, because it is greedy. Try this:

Dim RegEx_AlbumName As New Regex(">([^<]+?)<")

MsgBox(RegEx_AlbumName.Match(Line).Groups(1).ToString)

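The ? tells the regex to match the shortest possible string. In this case I think it will work without it too, because [^<]+ can never run past a <. Note that the text you want is now in Groups(1), the part inside the parentheses, not in Groups(0), which is still the whole match including > and <.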
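To make the difference concrete, here is a minimal console sketch that runs the original pattern and the capture-group pattern against the line from the question. The module name AlbumNameDemo is hypothetical. The third pattern, which uses lookarounds, is an extra variant that is not part of the original answer: it is one way to get the bare text straight from the match value, assuming the text you want always sits between a > and the next <.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical demo module, not code from the question.
Module AlbumNameDemo
    Sub Main()
        ' Sample line taken from the question.
        Dim Line As String = "<h3 class=""btl""><a href=""http://post-hardcore.ru/music/2689-drifter-in-search-of-something-more-ep-2013.html"">Drifter - In Search of Something More [EP] (2013)</a></h3>"

        ' Original pattern: the greedy .* runs on to the last "<" in the line,
        ' so the match still contains ">" at the start and "</a><" at the end.
        Dim greedy As New Regex(">[^<].*<")
        Console.WriteLine(greedy.Match(Line).Value)

        ' Capture group: [^<]+ stops at the first "<", and Groups(1)
        ' holds only the text inside the parentheses.
        Dim captured As New Regex(">([^<]+)<")
        Console.WriteLine(captured.Match(Line).Groups(1).Value)

        ' Lookaround variant (an assumption, not from the answer): the ">" and "<"
        ' are checked but not consumed, so the whole match is already the bare text.
        Dim lookaround As New Regex("(?<=>)[^<]+(?=<)")
        Console.WriteLine(lookaround.Match(Line).Value)
    End Sub
End Module
```

The first line printed should be the unwanted string from the question. The second and third should both be the plain album title. With the lookaround version, Groups(0) (or .Value) can be used directly, so no Substring or Replace is needed.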

Upvotes: 1
