Bob

Reputation: 125

Regex Help Simple Pattern

I am very weak with Regex and need help. Input looks like the following:

<span> 10/28 &nbsp;&nbsp;Currency:&nbsp;USD

Desired output is 10/28.

I need to capture the text between <span> and "Currency:" that consists of digits, "/" characters, or ":" characters. No spaces.

Can you help? Thanks.

Upvotes: 0

Views: 205

Answers (3)

Andomar

Reputation: 238276

Try this regular expression:

<span>(?>.*?([\d/:]+)).*?Currency

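The .*? matches as few characters as possible (a non-greedy, or lazy, quantifier), and the parenthesized [\d/:]+ captures the run of digits, slashes, and colons as group 1. It should work for your example <span> 10/28 &nbsp;&nbsp;Currency:&nbsp;USD.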
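As a quick check of the pattern above, here is a minimal sketch in C#. It assumes a project that supports top-level statements (C# 9 or later); the variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input from the question.
string input = "<span> 10/28 &nbsp;&nbsp;Currency:&nbsp;USD";

// Pattern from this answer; the wanted text ends up in capture group 1.
// The atomic group (?>...) itself does not create a capture group.
string pattern = @"<span>(?>.*?([\d/:]+)).*?Currency";

Match match = Regex.Match(input, pattern);

// Read group 1 only when the pattern matched; otherwise use an empty string.
string result = match.Success ? match.Groups[1].Value : string.Empty;

Console.WriteLine(result); // prints "10/28"
```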

This is a nice site to test regular expressions.

Upvotes: 1

Rex M

Reputation: 144172

Updated: What you're describing has three parts.

What we do want is a run of characters that are digits, forward slashes, or colons: [0-9/:]* (the asterisk means "zero or more instances"; use + instead if at least one such character must be present). Surrounded by:

  • <span>(optional stuff we don't want) is represented as: <span>[^0-9/:]*
  • (optional stuff we don't want)Currency is: [^0-9/:]*Currency

Inside square brackets, a leading ^ means "not", so [^0-9/:]* matches any number of characters that are not the ones we want, including things like &nbsp;.

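In C#:

// Requires: using System.Text.RegularExpressions;
string pattern = @"<span>[^0-9/:]*(?<value>[0-9/:]*)[^0-9/:]*Currency";
// Singleline only changes how . matches, so it has no effect here; IgnoreCase lets <SPAN> match too.
Match match = Regex.Match(input, pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
// Value is an empty string if nothing matched.
string output = match.Groups["value"].Value;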
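Because the value group uses *, the pattern can succeed while capturing nothing. The sketch below wraps the pattern in a helper that reports that case explicitly. TryExtractValue and the second and third sample strings are made up for illustration and are not from the question; the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answer: reports whether a non-empty
// value was found between <span> and Currency.
static bool TryExtractValue(string input, out string value)
{
    const string pattern = @"<span>[^0-9/:]*(?<value>[0-9/:]*)[^0-9/:]*Currency";
    Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);

    // With *, the group may match an empty string, so check the length as well.
    value = match.Success ? match.Groups["value"].Value : string.Empty;
    return value.Length > 0;
}

string[] samples =
{
    "<span> 10/28 &nbsp;&nbsp;Currency:&nbsp;USD", // sample from the question
    "<SPAN>12:30 Currency: EUR",                    // made-up: upper-case tag, time value
    "<span> &nbsp;Currency:&nbsp;USD",              // made-up: nothing to capture
};

foreach (string sample in samples)
{
    bool found = TryExtractValue(sample, out string extracted);
    Console.WriteLine(found ? extracted : "(no value)");
}
```

For these three inputs the loop prints 10/28, 12:30, and (no value). Swapping the * inside the value group for + would make the third input fail to match instead of matching with an empty capture.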

Upvotes: 3

Robert Greiner

Reputation: 29742

Here are some good places to start. Using others' code is fine at first, but if you don't learn this stuff you're going to be eternally doomed to asking questions every time you need a new regex.

Mastering Regular Expressions

Regular Expressions Cookbook

Online tutorial

Spend some time, learn the basics, and pretty soon you'll be helping us with our regex problems.

Upvotes: 1
