Reputation: 125
I am very weak with Regex and need help. Input looks like the following:
<span> 10/28 Currency: USD
Desired output is 10/28.
I need to get all text between the <span> and "Currency:" that consists of numbers, "/" characters, or ":" characters. No spaces.
Can you help? Thanks.
Upvotes: 0
Views: 205
Reputation: 238276
Try this regular expression:
<span>(?>.*?([\d/:]+)).*?Currency
The .*? matches the least amount of anything (a non-greedy match). It should work for your example <span> 10/28 Currency: USD.
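For anyone who wants to check the pattern locally rather than on a test site, here is a minimal sketch in C#. It assumes .NET 6 or later (top-level statements); the variable names are mine, and the value is read from capture group 1, since the (?>...) atomic group does not capture.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern copied from the answer above. The atomic group (?>...) is
// non-capturing, so ([\d/:]+) is capture group 1.
string spanPattern = @"<span>(?>.*?([\d/:]+)).*?Currency";
string sampleInput = "<span> 10/28 Currency: USD";

Match spanMatch = Regex.Match(sampleInput, spanPattern);

// Guard against a failed match before reading the group.
string firstResult = spanMatch.Success ? spanMatch.Groups[1].Value : "(no match)";
Console.WriteLine(firstResult); // 10/28
```

Checking Success first is a deliberate choice here: on a failed match the group value is just an empty string, which is easy to mistake for a real result.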
This is a nice site to test regular expressions.
Upvotes: 1
Reputation: 144172
Updated: What you're describing has three parts.
The part we do want is a run of characters that are digits, forward slashes, or colons: [0-9/:]* (the asterisk means "zero or more instances"). It is surrounded by:
<span>(optional stuff we don't want), which is represented as: <span>[^0-9/:]*
(optional stuff we don't want)Currency, which is: [^0-9/:]*Currency
(The ^ means "not") - so this will essentially match any number of characters which are not the bits we want, including things like
In C#:
string pattern = @"<span>[^0-9/:]*(?<value>[0-9/:]*)[^0-9/:]*Currency";
Match match = Regex.Match(input, pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
string output = match.Groups["value"].Value;
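One thing worth seeing in practice is that the "zero or more" quantifier lets the match succeed with an empty value when there is nothing between <span> and Currency. The sketch below is only an illustration, assuming .NET 6 or later (top-level statements); TryExtractValue and its requireValue flag are hypothetical names of mine, not part of the answer. The flag swaps the capture's * for + so you can compare the two behaviours.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answer): wraps the pattern and reports
// whether it matched. requireValue swaps the capture's "*" for "+".
static bool TryExtractValue(string input, bool requireValue, out string value)
{
    string quantifier = requireValue ? "+" : "*";
    string pattern = "<span>[^0-9/:]*(?<value>[0-9/:]" + quantifier + ")[^0-9/:]*Currency";

    // Note the spelling: the enum member is RegexOptions.Singleline.
    Match match = Regex.Match(input, pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);

    value = match.Groups["value"].Value; // empty string when there is no match
    return match.Success;
}

bool ok1 = TryExtractValue("<span> 10/28 Currency: USD", false, out string v1);
Console.WriteLine($"{ok1} '{v1}'"); // True '10/28'

bool ok2 = TryExtractValue("<span> Currency: USD", false, out string v2);
Console.WriteLine($"{ok2} '{v2}'"); // True '' (matched, but nothing captured)

bool ok3 = TryExtractValue("<span> Currency: USD", true, out string v3);
Console.WriteLine($"{ok3} '{v3}'"); // False ''
```

Whether an empty capture should count as a match depends on your data; if a value is always expected, the + variant fails loudly instead of returning an empty string.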
Upvotes: 3
Reputation: 29742
Here's a good place to start. Using others' code is fine at first, but if you don't learn this stuff you're going to be eternally doomed to asking questions every time you need a new regex.
Spend some time, learn the basics, and pretty soon you'll be helping us with our regex problems.
Upvotes: 1