Atrotygma

Reputation: 1153

LINQ approach to parse lines with keys/values

I have the following string:

 MyKey1=MyVal1
 MyKey2=MyVal2
 MyKey3=MyVal3
 MyKey3=MyVal3

So first I need to split the string into lines, then split each line on the '=' character to get the key and value from that line. What I want as a result is a List<KeyValuePair<string, string>>. (Why not a Dictionary? There may be duplicate keys in the input, so I can't use the .ToDictionary() extension.)

I'm pretty stuck with the following:

List<KeyValuePair<string, string>> fields =
    (from lines in Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
    where !String.IsNullOrWhiteSpace(lines)
    .Select(x => x.Split(new [] { '='}, 2, StringSplitOptions.RemoveEmptyEntries))
    .ToList()

    --> select new KeyValuePair? Or with 'let' for splitting by '='?
        what about exception handling (e.g. ignoring empty values)

Upvotes: 1

Views: 1023

Answers (4)

Simon Belanger

Reputation: 14870

I suggest you try matching the Key/Value instead of splitting. If you want a dictionary with multiple values for a key, you could use ToLookup (an ILookup):

var result = Regex.Matches(input, @"(?<key>[^=\r\n]+)=(?<value>[^=\r\n]+)")
                  .OfType<Match>()
                  .ToLookup(m => m.Groups["key"].Value, 
                            m => m.Groups["value"].Value);

If you need to add to that list later on or want to keep using a list:

var result = Regex.Matches(input, @"(?<key>[^=\r\n]+)=(?<value>[^=\r\n]+)")
                  .OfType<Match>()
                  .Select(m => new KeyValuePair<string, string>(m.Groups["key"].Value, m.Groups["value"].Value))
                    .ToList();

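Note: this regex might not suit your input, since we don't know what it can contain. For example, it skips pairs with an empty key or an empty value, and it cuts a value off at the next '=' character.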
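To illustrate the caveat in the note, here is a minimal sketch of a more permissive pattern. It assumes that values may be empty or contain further '=' characters and that keys may not; the class name RegexPairs and the method ParsePairs are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexPairs
{
    // Assumption: the key is everything up to the first '=' on a line,
    // and the value is the rest of that line (possibly empty).
    private const string Pattern = @"(?<key>[^=\r\n]+)=(?<value>[^\r\n]*)";

    public static List<KeyValuePair<string, string>> ParsePairs(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => new KeyValuePair<string, string>(
                        m.Groups["key"].Value,
                        m.Groups["value"].Value))
                    .ToList();
    }
}
```

With this pattern, a line such as conn=Server=x yields the key conn and the value Server=x, and a line such as empty= yields an empty value instead of being dropped. Whether that is the right behavior depends on the actual input.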

Upvotes: 0

Tim Schmelter

Reputation: 460158

You could use a Lookup<TKey, TElement> instead of a dictionary:

var keyValLookup = text.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
    .Select(l =>
    {
        var keyVal = l.Split('=');
        return new { Key = keyVal[0].Trim(), Value = keyVal.ElementAtOrDefault(1) };
    })
    .Where(x => x.Key.Length > 0)  // not required, just to show how to handle invalid data
    .ToLookup(x => x.Key, x => x.Value);

IEnumerable<string> values = keyValLookup["MyKey3"];
Console.Write(string.Join(", ",values)); // MyVal3, MyVal3

A lookup always returns a sequence, even if the key is not present; in that case the sequence is empty. Keys need not be unique, so you don't need to group or remove duplicates before you call ToLookup.
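The behavior described above can be checked with a small sketch. The class name LookupDemo and the method Parse are hypothetical, and the parsing rules (split on the first '=', trim both parts, skip lines without a key) are assumptions for illustration rather than part of the code above.

```csharp
using System;
using System.Linq;

public static class LookupDemo
{
    // Builds a lookup from "key=value" lines.
    // Lines without '=' or with an empty key are skipped.
    public static ILookup<string, string> Parse(string text)
    {
        return text
            .Split(new[] { "\r\n", "\n", "\r" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Split(new[] { '=' }, 2))
            .Where(parts => parts.Length == 2 && parts[0].Trim().Length > 0)
            .ToLookup(parts => parts[0].Trim(), parts => parts[1].Trim());
    }
}
```

For the input from the question, LookupDemo.Parse(text)["MyKey3"] contains both MyVal3 entries, while indexing with a key that does not occur returns an empty sequence rather than throwing. Contains(key) can be used when presence has to be tested explicitly.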

Upvotes: 2

D Stanley

Reputation: 152566

You're pretty close (I changed your example to all method syntax for consistency):

List<KeyValuePair<string, string>> fields =
    Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
    .Where(s => !String.IsNullOrWhiteSpace(s))
    .Select(x => x.Split(new [] {'='}, 2, StringSplitOptions.RemoveEmptyEntries))
    .Where(p => p.Length == 2)  // to avoid IndexOutOfRangeException
    .Select(p => new KeyValuePair<string, string>(p[0], p[1]))
    .ToList();

Although I agree with Jon's comment that a grouping would be cleaner if you have duplicate keys:

IEnumerable<IGrouping<string, string>> fields =
    Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
    .Where(s => !String.IsNullOrWhiteSpace(s))
    .Select(x => x.Split(new [] {'='}, 2, StringSplitOptions.RemoveEmptyEntries))
    .Where(p => p.Length == 2)  // skip lines that lack a key or a value
    .GroupBy(p => p[0], p => p[1]);

Upvotes: 1

p.s.w.g

Reputation: 149030

If you're concerned about duplicate keys, you could use an ILookup instead:

var fields =
    (from line in Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
     select line.Split(new [] { '=' }, 2))
    .Where(x => x.Length == 2)  // skip lines without '=' to avoid IndexOutOfRangeException
    .ToLookup(x => x[0], x => x[1]);

var items = fields["MyKey3"]; // [ "MyVal3", "MyVal3" ]
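If a List<KeyValuePair<string, string>> is still needed, as in the question, query syntax with a 'let' clause can produce it. This is a minimal sketch; the class name QuerySyntaxDemo and the method Parse are hypothetical, and skipping lines without '=' is an assumption about how invalid lines should be handled.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuerySyntaxDemo
{
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        return (from line in Regex.Split(input, @"\r?\n|\r")
                where !string.IsNullOrWhiteSpace(line)
                let parts = line.Split(new[] { '=' }, 2)  // split on the first '=' only
                where parts.Length == 2                   // skip lines without '='
                select new KeyValuePair<string, string>(parts[0], parts[1]))
               .ToList();
    }
}
```

Because the split is limited to two parts, a line ending in '=' gives an empty value rather than being dropped; add a further 'where' clause if empty values should be ignored.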

Upvotes: 2
