KyleMit

Reputation: 29947

Concatenate Split Array from Select Field in LINQ

I have a list of string elements, each of which can contain a comma-separated list of values. From that, I want to generate a list of all the distinct values in the entire set.

Take the following set of strings for example:

Dim strings = New List(Of String) From {"A", "B,C,D", "D,E"}

I'd like to turn this into:

{"A", "B", "C", "D", "E"}

Using LINQ, I could take each element and convert it into a string array. The following query splits each string into an array, but each result stays stuffed in its own array element, so Distinct() compares the arrays themselves rather than the values inside them.

Dim fieldsLinq = (From s In strings
                  Select s.Split(",")) _
                 .Distinct()

I want to concatenate all of the values into a single array of strings.

I could start by joining all the elements with a comma and THEN splitting the single string, but this feels like the wrong approach.

Dim fieldsJoin = String.Join(",", strings) _
                       .Split(",") _
                       .Distinct()

Are there any better solutions?

Upvotes: 1

Views: 1997

Answers (3)

Victor Zakharov

Reputation: 26424

I know this may not be directly relevant to the original question, but it may still be useful for someone who came to this page searching for a good way to approach this problem. In my opinion, LINQ is supposed to make a developer's life simpler, in both the short and the long term.

In this particular case, the alternative approach without LINQ is already very simple and much more maintainable. Typically, if your input data looks like "A,B,D", you are heading for a maintenance nightmare in the long run, and you risk your LINQ query soon growing into a big monster.

Consider this code, which does the same thing (old school kicks in):

Private Shared Function AggregateSplit(input As List(Of String)) _
                                                             As List(Of String)
  Dim hsItems As New HashSet(Of String)
  For Each s As String In input
    AddRangeToHashSet(hsItems, s.Split(","))
  Next
  Return hsItems.ToList
End Function

Private Shared Sub AddRangeToHashSet(hsItems As HashSet(Of String),
                                     items As IEnumerable(Of String))
  For Each s As String In items
    ' HashSet.Add already ignores values that are present, so no Contains check is needed.
    hsItems.Add(s)
  Next
End Sub

Usage:

Dim r = AggregateSplit(strings)

Ideally, you'd already have AddRangeToHashSet as an extension method, so you wouldn't need to define it every time you need AddRange for a HashSet. In that case, AggregateSplit is the only function you would need to declare. It is also much easier to explain to someone who is not familiar with LINQ, and you can step through it in the debugger if something did not parse well.
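As a rough sketch of what such an extension method might look like: the module name HashSetExtensions and the method name AddRange are assumptions for illustration, not something HashSet provides out of the box.

```vb
Imports System.Collections.Generic
Imports System.Runtime.CompilerServices

' Hypothetical helper module; the name is an assumption for illustration.
Public Module HashSetExtensions
    ' Adds every item to the set. HashSet(Of T).Add skips values
    ' that are already present, so duplicates are dropped silently.
    <Extension()>
    Public Sub AddRange(Of T)(target As HashSet(Of T), items As IEnumerable(Of T))
        For Each item As T In items
            target.Add(item)
        Next
    End Sub
End Module
```

With this in scope, the loop body in AggregateSplit could be written as hsItems.AddRange(s.Split(",")). The built-in HashSet(Of T).UnionWith method has the same effect, so hsItems.UnionWith(s.Split(",")) would also work without any helper.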

Upvotes: 0

KyleMit

Reputation: 29947

I suppose Aggregate could do the trick:

Dim fieldsAggregate = strings.Aggregate(New List(Of String), 
                                        Function(seed, s) 
                                            seed.AddRange(s.Split(","))
                                            Return seed
                                        End Function).Distinct()

Update: Here's the VB version of Murdock's answer using SelectMany, in both lambda and query syntax:

Dim lambda = strings.SelectMany(Function(s) s.Split(",")) _ 
                    .Distinct()

Dim query = (From outer In strings
             From inner In outer.Split(",")
             Select inner) _
            .Distinct()
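For anyone who wants to try this end to end, here is a minimal console sketch built around the lambda version. The module name SplitDemo and the second, messier input list are assumptions added for illustration. The sample data in the question has no stray spaces or empty entries, so the trimming step is only needed if your data does.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical demo module; the name is an assumption for illustration.
Module SplitDemo
    Sub Main()
        Dim strings = New List(Of String) From {"A", "B,C,D", "D,E"}

        ' Split each element on commas, flatten the arrays, then drop duplicates.
        Dim fields = strings.SelectMany(Function(s) s.Split(","c)) _
                            .Distinct() _
                            .ToList()
        Console.WriteLine(String.Join(", ", fields))

        ' Assumed messier input: padded values and an empty entry.
        Dim messy = New List(Of String) From {"A ", "B, C,,D"}

        ' Skip empty entries while splitting, then trim before deduplicating.
        Dim cleaned = messy.SelectMany(Function(s) s.Split(New Char() {","c},
                                           StringSplitOptions.RemoveEmptyEntries)) _
                           .Select(Function(v) v.Trim()) _
                           .Distinct() _
                           .ToList()
        Console.WriteLine(String.Join(", ", cleaned))
    End Sub
End Module
```

The first line printed should be A, B, C, D, E and the second A, B, C, D. In practice Distinct yields values in first-seen order for in-memory lists, though the documentation describes its result as unordered, so add an explicit OrderBy if the order matters.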

Upvotes: 1

Murdock

Reputation: 4662

I don't know VB that well, but you could just use SelectMany to flatten the arrays. In C#:

strings.SelectMany(o => o.Split(','));

Upvotes: 3
