BahaBulle

Reputation: 15

Extract part of a text according to one character or another

I'm trying to write a regex that extracts part of a string depending on which separator characters it contains.

Here are all the kinds of input I can have (the parts are not fixed-length):

1. aaaabbccc
2. aaaa-bbbb
3. bbbb/cccc
4. aaaa-bbbb/cccc

Rules are:

I tried a conditional regex like this: (?(?=.*-.*)-[^\/]*|^[^\/]*)

It almost works, but the '-' is included in the result and I don't want it.

Does anybody have an idea?
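
One possible way to keep the '-' out of the match is to move it into a lookbehind, so it is checked but not captured. The sketch below is not taken from the thread. It assumes the behaviour implied by the sample results in the answers (take the text after a '-' if there is one, and stop at a '/' if there is one), and the names SeparatorRegex and Extract are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorRegex
{
    // (?<=^|-)  the match must start at the beginning of the string or right after a '-'
    // [^-/]+    one or more characters that are neither '-' nor '/'
    // (?=/|$)   the match must be followed by a '/' or by the end of the string
    private static readonly Regex Part = new Regex(@"(?<=^|-)[^-/]+(?=/|$)");

    // Hypothetical helper: returns the first match, or the input unchanged if nothing matches.
    public static string Extract(string input)
    {
        Match match = Part.Match(input);
        return match.Success ? match.Value : input;
    }
}
```

Because the '-' and '/' only appear inside lookarounds, neither of them ends up in match.Value.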

Upvotes: 1

Views: 81

Answers (3)

Drag and Drop

Reputation: 2734

You have three kinds of separators: ' ', '-', and '/'.
You can split on them and take the part you want based on its index.

"1. aaaabbccc"      => "1.", "aaaabbccc"
                        0      1

"2. aaaa-bbbb"      => "2.", "aaaa", "bbbb"
                        0      1       2

"3. bbbb/cccc"      => "3.", "bbbb", "cccc"
                        0      1       2

"4. aaaa-bbbb/cccc" => "4.", "aaaa", "bbbb", "cccc"
                        0      1       2       3

For example, for input #4 the "bbbb" part is at index 2.

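// Requires: using System.Linq; (for Any and All)
static string Spliter(string input)
{
    // Each rule lists the separators that must all be present and the index
    // of the wanted part after splitting on them. Most specific rule first.
    var rules = new (char[] separators, int targetIndex)[] {
        (new[] { ' ', '-', '/' }, 2),
        (new[] { ' ', '/' }, 1),
        (new[] { ' ', '-' }, 2),
        (new[] { ' ' }, 1),
    };

    foreach (var rule in rules) {
        // Apply the first rule whose separators all occur in the input.
        if (rule.separators.Any() && rule.separators.All(separator => input.Contains(separator)))
            return input.Split(rule.separators)[rule.targetIndex];
    }

    return input; // no rule matched: return the input unchanged
}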

Result:

1. aaaabbccc        --> aaaabbccc
2. aaaa-bbbb        --> bbbb
3. bbbb/cccc        --> bbbb
4. aaaa-bbbb/cccc   --> bbbb

Demo: https://dotnetfiddle.net/dYxgN9

If the "2. " prefix is not actually part of your input, simply drop the ' ' separator and adjust the target indexes accordingly.
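
As a rough sketch of that adjustment, the rules might look like the following once the ' ' separator is removed. The class name NoPrefixSplitter and the exact indexes are assumptions derived from the four sample inputs without their "N. " prefix; they are not part of the linked demo.

```csharp
using System;
using System.Linq;

public static class NoPrefixSplitter
{
    public static string Extract(string input)
    {
        // Same idea as above, minus the ' ' separator: every target index
        // drops by one because the "N." part no longer exists.
        var rules = new (char[] separators, int targetIndex)[]
        {
            (new[] { '-', '/' }, 1), // "aaaa-bbbb/cccc" => "aaaa", "bbbb", "cccc"
            (new[] { '/' }, 0),      // "bbbb/cccc"      => "bbbb", "cccc"
            (new[] { '-' }, 1),      // "aaaa-bbbb"      => "aaaa", "bbbb"
        };

        foreach (var rule in rules)
        {
            // Apply the first rule whose separators all occur in the input.
            if (rule.separators.All(separator => input.Contains(separator)))
                return input.Split(rule.separators)[rule.targetIndex];
        }

        return input; // no separator present: return the input unchanged
    }
}
```

The rule for a plain string disappears entirely, since the fallback return already covers it.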

Upvotes: 0

Flydog57

Reputation: 7111

As others have mentioned, skipping the complexity of regex and simply translating your specification into C# may be a viable, simpler solution (and it should be faster). For example, taking your sample strings:

private static readonly string[] _testStrings =
{
    "aaaabbccc",
    "aaaa-bbbb",
    "bbbb/cccc",
    "aaaa-bbbb/cccc",
};

and running them through this code (which mimics your spec; updated after the comment by @DragAndDrop):

foreach (var s in _testStrings)
{
    string result;
    var slashIndex = s.IndexOf("/");
    var hyphenIndex = s.IndexOf("-");
    if (slashIndex >= 0)
    {
        if (hyphenIndex >= 0)
        {
            result = s.Substring(hyphenIndex + 1, slashIndex - hyphenIndex - 1);
        }
        else
        {
            result = s.Substring(0, slashIndex);
        }
    } 
    else if (hyphenIndex >= 0)
    {
        result = s.Substring(hyphenIndex + 1);
    }
    else
    {
        result = s;
    }
    Debug.WriteLine(result);
}

results in:

aaaabbccc
bbbb
bbbb
bbbb

There's more code, but it's easier to read. I'm also pretty sure that it will be faster to execute.
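
If the branching feels long, the same idea can probably be condensed into a start index and an end index. This is only a sketch: the names PartExtractor and Extract are hypothetical, and it assumes that a '/' appearing before the '-' should be ignored, an ordering the question does not cover.

```csharp
using System;

public static class PartExtractor
{
    public static string Extract(string s)
    {
        // Start right after the first '-', or at 0 when there is none
        // (IndexOf returns -1, so adding 1 gives 0).
        int start = s.IndexOf('-') + 1;

        // End at the first '/' found at or after start, or at the end of the string.
        int slash = s.IndexOf('/', start);
        int end = slash >= 0 ? slash : s.Length;

        return s.Substring(start, end - start);
    }
}
```

Searching for the '/' only from start onward keeps the Substring length from going negative, so this version does not throw on inputs where the separators appear in the opposite order.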

Upvotes: 1

Ashkan Mobayen Khiabani

Reputation: 34170

Here is my answer without regex, which should run faster than the regex approach:

var dash = str.IndexOf("-");
var slash = str.IndexOf("/");
var result = dash != -1
    ? (slash != -1
        ? str.Substring(Math.Min(dash, slash) + 1, Math.Max(dash, slash) - Math.Min(dash, slash) - 1)
        : str.Substring(dash + 1))
    : (slash != -1 ? str.Substring(0, slash) : str);

Here is a live DEMO

Here is the result from demo:

aaaabbccc
bbbb
bbbb
bbbb

Upvotes: 1
