Justin Farrugia

Reputation: 124

Separate words according to delimiter

I have a search box that supports two kinds of search: free-text search across the content of a table (terms separated by spaces) and search on a specific field in the table (field name and value separated by a colon).

The only problem is that both kinds can appear in the same query. Examples:

  1. Type:Non-Fiction Murder
  2. Non ISBN:000000001
  3. Fiction ISBN:02 Plane

In example 1, Type is the field name, Non-Fiction is its content, and Murder is content to look for in any field. I am looking for a Regex.Split that puts each field:value pair into a Dictionary and every other term into an array.

I have managed to make each kind work on its own, but not when they are mixed:

var columnSearch_FieldNames = inSearch.ToUpper().Trim().Split(':').Where((x,i) => i % 2 == 0).ToArray();
var columnSearch_FieldContent = inSearch.ToUpper().Trim().Split(':').Where((x, i) => i % 2 != 0).ToArray();
var adhocSearch_FieldContent = inSearch.ToUpper().Trim().Split(' ');

Example 4:- Type:Non-Fiction Murder Non ISBN:000000001 Kill

Example Output:- Dictionary ({Type, Non-Fiction}, {ISBN, 000000001}) Array {Murder, Non, Kill}

Upvotes: 2

Views: 250

Answers (2)

Peter Duniho

Reputation: 70661

I don't see why using Regex would be faster. And IMHO, using Regex doesn't improve the readability or maintainability of the code either. If anything, I think it makes it more complicated. But if you really want to use Regex.Split(), something like this would work:

static void Main(string[] args)
{
    string input = "Type:Non-Fiction Murder Non ISBN:000000001 Kill", key = null, value = null;
    Dictionary<string, string> namedFields = new Dictionary<string, string>();
    List<string> anyField = new List<string>();
    Regex regex = new Regex("( )|(:)", RegexOptions.Compiled);

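    foreach (string field in regex.Split(input))
    {
        switch (field)
        {
            case " ":
                // A space ends the current term: store it as a named field or a free-text value.
                _AddParameter(ref key, ref value, namedFields, anyField);
                break;
            case ":":
                // A colon means the token just read was a field name.
                key = value;
                break;
            default:
                value = field;
                break;
        }
    }
    // Store the final term, which has no trailing space.
    _AddParameter(ref key, ref value, namedFields, anyField);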
}

private static void _AddParameter(ref string key, ref string value, Dictionary<string, string> namedFields, List<string> anyField)
{
    if (key != null)
    {
        namedFields.Add(key, value);
        key = null;
    }
    else if (value != null)
    {
        anyField.Add(value);
        value = null;
    }
}
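
The switch above only works because Regex.Split() includes the text of capturing groups in its result, so the space and colon delimiters come back as tokens of their own. Here is a minimal sketch to make that visible; SplitDemo and Tokens are hypothetical names used only for illustration, not part of the code above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: returns the raw tokens that Regex.Split produces
    // for the same pattern used above.
    public static string[] Tokens(string input)
    {
        // Each delimiter sits in its own capturing group, so the matched
        // delimiter is kept in the returned array between the words.
        return Regex.Split(input, "( )|(:)");
    }
}
```

Calling SplitDemo.Tokens("Type:Non-Fiction Murder") and printing each element shows the delimiters interleaved with the words. Without the parentheses in the pattern, the delimiters would be dropped and the loop could no longer tell a field name from a free-text term.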

Now, if you're willing to just use a plain Regex match, instead of using the Regex.Split() method, one might argue this is marginally more readable/maintainable:

private static void UsingRegex(string input)
{
    Dictionary<string, string> namedFields = new Dictionary<string, string>();
    List<string> anyField = new List<string>();
    Regex regex = new Regex("(?:(?<key>[^ ]+):(?<value>[^ ]+))|(?<loneValue>[^ ]+)", RegexOptions.Compiled);

    foreach (Match match in regex.Matches(input))
    {
        string key = match.Groups["key"].Value,
            value = match.Groups["value"].Value,
            loneValue = match.Groups["loneValue"].Value;

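        // A named group that did not take part in the match has an empty Value,
        // so a non-empty key means the key:value alternative matched.
        if (!string.IsNullOrEmpty(key))
        {
            namedFields.Add(key, value);
        }
        else
        {
            anyField.Add(loneValue);
        }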
    }
}

Upvotes: 2

Keyur PATEL

Reputation: 2329

If you're willing to forgo Regex for a good old foreach loop combined with multiple splits, I think this achieves what you're looking for:

Dictionary<string, string> fields = new Dictionary<string, string>();
List<string> contents = new List<string>();

foreach (var word in main.Split(' '))     //main is a string, e.g. "Type:Non-Fiction Murder Non ISBN:000000001 Kill"
{
    var splitted = word.Split(':');
    if (splitted.Length == 2)
    {
        fields.Add(splitted[0], splitted[1]);
        continue;
    }
    contents.Add(word);
}

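Basically, this splits the input on spaces, then splits each word on a colon. A word that yields exactly two parts is stored as a field and its value; anything else goes into contents.

If you really want an array of contents rather than a List, simply call contents.ToArray().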
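
A few inputs are not covered by the loop above: repeated spaces produce empty entries in contents, a repeated field name makes Dictionary.Add throw, and a value that itself contains a colon is treated as free text. The following is only a sketch of one way to handle those cases, under the assumption that a later duplicate field should overwrite the earlier one. SearchParser and Parse are hypothetical names, not from the answer.

```csharp
using System;
using System.Collections.Generic;

public static class SearchParser
{
    // Hypothetical helper: splits a search string into field:value pairs
    // and free-text terms.
    public static (Dictionary<string, string> Fields, List<string> Terms) Parse(string input)
    {
        var fields = new Dictionary<string, string>();
        var terms = new List<string>();

        // RemoveEmptyEntries drops the empty strings produced by repeated spaces.
        foreach (var word in input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first colon only, so a value may itself contain colons.
            var parts = word.Split(new[] { ':' }, 2);

            if (parts.Length == 2 && parts[0].Length > 0 && parts[1].Length > 0)
            {
                // Assumption: a repeated field name overwrites the earlier value.
                // The indexer does that, whereas Add would throw.
                fields[parts[0]] = parts[1];
            }
            else
            {
                // No colon, or an empty name or value: treat the word as free text.
                terms.Add(word);
            }
        }

        return (fields, terms);
    }
}
```

For example, var (fields, terms) = SearchParser.Parse("Type:Non-Fiction  Murder Non ISBN:000000001 Kill"); gives two entries in fields (Type and ISBN) and three terms (Murder, Non, Kill), even with the doubled space. The tuple return type needs C# 7 or later; on older compilers you could pass the two collections in as parameters instead.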

Upvotes: 2
