R2-D2

Reputation: 143

C# Regular Expressions on string

I'm trying to extract a substring from within a string. I currently have this:

string str4 = null;

var str5 = "   type: '";
foreach (string str6 in response.Split(new char[] { '\n' }))
{
    if (str6.StartsWith(str5))
    {
        str4 = str6.Replace(str5, "").Replace(" ", "").Replace("',", "");
        break;
    }
}

This works as expected and grabs the text that follows

type: '

An example input line:

type: ' EXAMPLE ',

Output after the loop:

EXAMPLE

The issue is that the number of spaces before type: ' varies: sometimes it matches the spaces I provided, and other times it does not.

I was trying to use Regex so I could do something like this:

string str5 = "Regex(*)type: '"

Of course that is not valid usage, but it illustrates the idea: the * would stand for any number of leading spaces, so no matter how many there are, I could still extract the inner text from type.

Upvotes: 2

Views: 397

Answers (4)

Mark Cilia Vincenti

Reputation: 1614

You can use .Trim(), .TrimStart() and .TrimEnd(). Using Regex looks like extra overhead which you don't really need.
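As a rough sketch of the Trim-based approach: trim each line before comparing it against the prefix, so the amount of leading whitespace no longer matters. The `ExtractType` helper and the sample input are hypothetical names made up for illustration. Unlike the loop in the question, this version keeps spaces inside the value rather than removing all of them.

```csharp
using System;

public static class TrimExample
{
    // Hypothetical helper: returns the text between "type: '" and the
    // closing "'," on the first matching line, or null if none is found.
    public static string ExtractType(string response)
    {
        const string prefix = "type: '";
        foreach (string line in response.Split('\n'))
        {
            // Trim() removes leading and trailing whitespace (including '\r').
            string trimmed = line.Trim();
            if (trimmed.StartsWith(prefix, StringComparison.Ordinal))
            {
                // Drop the prefix, then the trailing comma and quote,
                // then any padding around the value itself.
                return trimmed.Substring(prefix.Length)
                              .TrimEnd(',')
                              .TrimEnd('\'')
                              .Trim();
            }
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractType("  foo: 1\n      type: ' EXAMPLE ',\n"));
    }
}
```

This still assumes exactly one space between `type:` and the opening quote; if that spacing also varies, a regex is probably the simpler tool.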

Upvotes: 1

Display name

Reputation: 1542

First, if you are going to use regex, try out your regex strings here: https://regex101.com/

Second, if you can avoid regex, I would advise you to do so. As the saying goes, if a developer uses regex to solve a problem, now he has two problems. Regex can be tricky if you haven't worked with it a lot. Having said that, here's another regex-based solution. Note that there are usually several ways to construct a regex string.

using System;
using System.Text.RegularExpressions;

namespace ConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] SampleInputList = new string[]
            {
                "type:EXAMPLE",
                " type: EXAMPLE ",
                "   type:  EXAMPLE  "
            };

            // The following is a non-capture group indicator: (?:)
            // non-capture groups are a good way to organize parts
            // of your regex string and can help you visualize the
            // parts that are just markers like 'type:' vs. the parts
            // that you want to actually manipulate in the code.
            Regex expression = new Regex(@"(?:\s*type\:\s*)([A-Za-z]*)");

            foreach (string Sample in SampleInputList)
            {
                MatchCollection matches = expression.Matches(Sample);
                if (matches.Count > 0)
                {
                    GroupCollection groups = matches[0].Groups;
                    if (groups.Count > 1)
                    {
                        Console.WriteLine(groups[1].Value);
                    }
                }
            }
        }
    }
}

Upvotes: 0

StriplingWarrior

Reputation: 156469

If this is just a simple extraction task with very limited variation in inputs, you can do it with plain Regex:

using System.Linq;
using System.Text.RegularExpressions;

var response = @"[{
   type: 'foo',
something: 'bar',
},
{
  type: 'baz',
  something: 'bat'
}]";
var types = Regex.Matches(response, @"\s*type\:\s*\'(.+)\'")
    .Cast<Match>()
    .Select(m => m.Groups.Cast<Group>().Skip(1).Single().Value);

But it sounds like you might be trying to write a parser for a programming or markup language. If so, I would highly recommend that you don't try to do that with Regex. Regular Expressions get really hairy the moment you start trying to handle things like escaped strings ("type: 'I\'m a type: '").

If your input is in a standard format like JSON, use a parsing library for that format. If not, there are libraries like Sprache which make it easy to create powerful custom parsers.

Upvotes: 0

Emma

Reputation: 27723

Here we simply allow optional whitespace before and after the desired output (e.g., EXAMPLE). We can start with this expression, for instance:

type:(\s+)?'(\s+)?(.+?)(\s+)?',


or:

type:(\s+)?'(\s+)?(.+?)(\s+)?'

If the value might be wrapped in either ' or ", we would expand our expression to:

type:(\s+)?['"](\s+)?(.+?)(\s+)?['"]

Test

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"type:(\s+)?'(\s+)?(.+?)(\s+)?',";
        string input = @"type:' EXAMPLE ',
type: ' EXAMPLE ',
type:    '   EXAMPLE    ',
type:    '   Any other EXAMPLE we might have   ',";
        RegexOptions options = RegexOptions.Multiline;

        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
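The test above prints the whole match, quotes and padding included. To get only the inner text, one option is to read the third capture group, which holds the lazily matched value. This is a minimal sketch using the same pattern; the `ExtractTypes` helper and the sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureExample
{
    // Hypothetical helper: collects the inner text of every type: '...' entry.
    public static List<string> ExtractTypes(string input)
    {
        var results = new List<string>();
        // Groups 1, 2 and 4 are the optional whitespace runs;
        // group 3 is the value between them.
        foreach (Match m in Regex.Matches(input, @"type:(\s+)?'(\s+)?(.+?)(\s+)?',"))
        {
            results.Add(m.Groups[3].Value);
        }
        return results;
    }

    public static void Main()
    {
        string input = "   type:' EXAMPLE ',\ntype:    '   Any other EXAMPLE we might have   ',";
        foreach (string value in ExtractTypes(input))
        {
            Console.WriteLine(value);
        }
    }
}
```

Because the pattern is not anchored to the start of the line, any amount of whitespace before `type:` is simply ignored.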


Upvotes: 3
