dban

C# .net Regular Expression to match text delimited by spaces and colon

I have three sentences as follows:

000000-00000 Date First text: something1 
200000-00000 Time Second text: something2
234222-34332 struc Third text: somthing3

How do I write a regex to match the text between (Date|Time|struc) and the colon (:), not including (Date|Time|struc) itself?

Upvotes: 0

Views: 1036

Answers (4)

BenAlabaster

Reputation: 39846

If, from your example, you're expecting the output to be:

First text Second text Third text

You would use the regular expression

(?i:(?<=(DATE|TIME|STRUC)\s)[^:]*)

Looking at your example, though, I can't imagine that would be very useful. The descriptive text appears to come after the colon, which suggests you really want everything to the end of the line. That would be:

(?i:(?<=(DATE|TIME|STRUC)\s).*)

[Checked using RegexBuddy - so if I interpreted your question correctly, this works]
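As a rough sketch of how the lookbehind version might be called from C#, assuming case-insensitive matching is wanted (the class and method names below are made up for illustration and are not part of the question):

```csharp
using System;
using System.Text.RegularExpressions;

static class LookbehindDemo
{
    // The lookbehind requires a keyword plus one whitespace character
    // immediately before the match, but keeps them out of the matched text.
    static readonly Regex Pattern = new Regex(
        @"(?<=(?:Date|Time|struc)\s)[^:]*", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the text between the keyword and the
    // colon, or null when no keyword is found.
    public static string ExtractLabel(string line)
    {
        Match match = Pattern.Match(line);
        return match.Success ? match.Value : null;
    }

    static void Main()
    {
        Console.WriteLine(ExtractLabel("000000-00000 Date First text: something1"));
        Console.WriteLine(ExtractLabel("200000-00000 Time Second text: something2"));
    }
}
```

Because the keyword sits inside a lookbehind, the whole match (match.Value) is already the text you want, so no capture group is needed.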

Upvotes: 0

Tomalak

Reputation: 338316

This:

^\d{6}-\d{5} \S+ ([^:]+)

Would match "First text", "Second text" and "Third text", without explicitly referring to (Date|Time|struc). The match is in group 1.
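If the three lines arrive as one block of text, a sketch along these lines might work. It assumes RegexOptions.Multiline so that ^ anchors at the start of every line; the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MultilineDemo
{
    // With RegexOptions.Multiline, ^ matches at the start of each line
    // rather than only at the start of the whole input.
    static readonly Regex Pattern = new Regex(
        @"^\d{6}-\d{5} \S+ ([^:]+)", RegexOptions.Multiline);

    // Hypothetical helper: collects group 1 from every matching line.
    public static List<string> ExtractAll(string text)
    {
        var results = new List<string>();
        foreach (Match match in Pattern.Matches(text))
        {
            results.Add(match.Groups[1].Value);
        }
        return results;
    }

    static void Main()
    {
        string text =
            "000000-00000 Date First text: something1\n" +
            "200000-00000 Time Second text: something2\n" +
            "234222-34332 struc Third text: somthing3";

        foreach (string label in ExtractAll(text))
        {
            Console.WriteLine(label);
        }
    }
}
```

One caveat: [^:]+ also matches newlines, so a line with no colon would let the capture run on into the following line. If that can happen in your data, [^:\r\n]+ is a safer choice.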

Upvotes: 0

Jon Skeet

Reputation: 1502216

I suspect this is what you're after. The regex part is:

new Regex(@"^\d{6}-\d{5} \w* ([^:]*): ")

And here's a short but complete test program:

using System;
using System.Text.RegularExpressions;

class Test
{
    static void Main(string[] args)
    {
        Parse("000000-00000 Date First text: something1");
        Parse("200000-00000 Time Second text: something2");
        Parse("234222-34332 struc Third text: somthing3");
    }

    static readonly Regex Pattern = new Regex
        (@"^\d{6}-\d{5} \w* ([^:]*): ");

    static void Parse(string text)
    {
        Console.WriteLine("Input: {0}", text);
        Match match = Pattern.Match(text);
        if (!match.Success)
        {
            Console.WriteLine("No match!");
        }
        else
        {
            Console.WriteLine("Middle bit: {0}", match.Groups[1]);
        }
    }
}

Note that this doesn't assume "Date", "Time" and "struc" are the only possible values after the digits, just that they'll be constructed from word characters. It also assumes you want to match against the whole line, not just the middle part. It's easy to extract the other sections with other groups if that would be helpful to you.
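As a rough sketch of that last point, here is one way the other sections could be pulled out with named groups. The group names (id, kind, middle, rest) and the class name are just illustrative choices, and the pattern assumes nothing but the trailing text follows the ": " separator:

```csharp
using System;
using System.Text.RegularExpressions;

static class NamedGroupsDemo
{
    // Same shape as the pattern above, with each section wrapped in a
    // named group and the remainder of the line captured as well.
    static readonly Regex Pattern = new Regex(
        @"^(?<id>\d{6}-\d{5}) (?<kind>\w*) (?<middle>[^:]*): (?<rest>.*)$");

    // Hypothetical helper: returns the Match so callers can read any group.
    public static Match Parse(string line)
    {
        return Pattern.Match(line);
    }

    static void Main()
    {
        Match match = Parse("000000-00000 Date First text: something1");
        if (match.Success)
        {
            Console.WriteLine("Id: {0}", match.Groups["id"].Value);
            Console.WriteLine("Kind: {0}", match.Groups["kind"].Value);
            Console.WriteLine("Middle bit: {0}", match.Groups["middle"].Value);
            Console.WriteLine("Rest: {0}", match.Groups["rest"].Value);
        }
    }
}
```

Named groups cost nothing extra at match time and make the calling code easier to read than numeric indexes once there is more than one group.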

Upvotes: 3

Daniel Brückner

Reputation: 59675

The following expression will capture what you want into the named group value, excluding Date, Time or struc, the space that follows it, and the colon after the value.

(?:Date|Time|struc) (?<value>[^:]*)

This expression will include the colon.

(?:Date|Time|struc) (?<value>[^:]*:)
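A minimal sketch of reading the named group from C#, showing both variants side by side (the class, field and method names are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

static class ValueGroupDemo
{
    // Captures the text after the keyword, stopping before the colon.
    public static readonly Regex WithoutColon =
        new Regex(@"(?:Date|Time|struc) (?<value>[^:]*)");

    // Same pattern, but the colon is part of the captured value.
    public static readonly Regex WithColon =
        new Regex(@"(?:Date|Time|struc) (?<value>[^:]*:)");

    // Hypothetical helper: returns the "value" group, or null on no match.
    public static string Capture(Regex pattern, string line)
    {
        Match match = pattern.Match(line);
        return match.Success ? match.Groups["value"].Value : null;
    }

    static void Main()
    {
        string line = "234222-34332 struc Third text: somthing3";
        Console.WriteLine(Capture(WithoutColon, line));
        Console.WriteLine(Capture(WithColon, line));
    }
}
```

Both patterns are case-sensitive as written; pass RegexOptions.IgnoreCase if the keywords can appear in other casings.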

Upvotes: 0
