Use .Contains() to match on a property of a property of <T> in LINQ query

I am looking for help with a LINQ query that uses the .Contains() method of a List(Of T) to return the elements that are not contained in a second List(Of T), matching on a property of a property of T.

Here is some sample code I wrote. The scenario is fictitious, but the concept is the same.

Module Module1

    Sub Main()
        ' Get all Files in a directory that contain `.mp` in the name
        Dim AllFiles As List(Of IO.FileInfo) = New IO.DirectoryInfo("C:\Test\Path").GetFiles("*.mp*").ToList
        Dim ValidFiles As New List(Of fileStruct)

        ' Get all Files that actually have an extension of `.mp3`
        AllFiles.ForEach(Sub(x) If x.Extension.Contains("mp3") Then ValidFiles.Add(New fileStruct(prop1:=x.Name, path:=x.FullName)))

        ' Attempting to get all files that are not listed in the ValidFiles list
        Dim InvalidFiles As IO.FileInfo() = From file As IO.FileInfo In AllFiles Where Not ValidFiles.Contains(Function(x As fileStruct) x.fleInfo.FullName = file.FullName) Select file
        ' Errors on the `.Contains()` method because I have no idea what I'm doing and I am basically guessing at this point

        'Here is the same but instead using the `.Any()` Method
        Dim InvalidFiles As IO.FileInfo() = From file As IO.FileInfo In AllFiles Where Not ValidFiles.Any(Function(x As fileStruct) x.fleInfo.FullName = file.FullName) Select file
        ' This doesn't error out, but all files are returned
    End Sub

    Public Structure fileStruct
        Private _filePath As String
        Private _property1 As String

        Public ReadOnly Property property1 As String
            Get
                Return _property1
            End Get
        End Property

        Public ReadOnly Property fleInfo As IO.FileInfo
            Get
                Return New IO.FileInfo(_filePath)
            End Get
        End Property

        Public Sub New(ByVal prop1 As String, ByVal path As String)
            _property1 = prop1
            _filePath = path
        End Sub
    End Structure
End Module

Upvotes: 1

Views: 180

Answers (3)

This is a more or less direct implementation of the MP3 files list in the question. I did use a FileItem class instead of a structure. The good part is afterwards:

' note: EnumerateFiles
Dim AllFiles As List(Of IO.FileInfo) = New IO.DirectoryInfo("M:\Music").
    EnumerateFiles("*.mp*", IO.SearchOption.AllDirectories).ToList()

Dim goofyFilter As String() = {"g", "h", "s", "a"}

' filter All files to those starting with the above (lots of
' Aerosmith, Steely Dan and Heart)
Dim ValidFiles As List(Of FileItem) = AllFiles.
                Where(Function(w) goofyFilter.Contains((w.Name.ToLower)(0))).
                Select(Function(s) New FileItem(s.FullName)).ToList()

Dim invalid As List(Of FileInfo)

invalid = AllFiles.Where(Function(w) Not ValidFiles.
                        Any(Function(a) w.FullName = a.FilePath)).ToList()

This is much the same as Sam's answer except with your file/mp3 usage. AllFiles has 809 items, ValidFiles has 274. The resulting invalid list is 535.


Now, let's speed it up 50-60x:

Same starting code for AllFiles and ValidFiles:

Dim FileItemValid = Function(s As String)
                        Dim valid As Boolean = False
                        For Each fi As FileItem In ValidFiles
                            If fi.FilePath = s Then
                                valid = True
                                Exit For
                            End If
                        Next
                        Return valid
                    End Function

invalid = AllFiles.Where(Function(w) FileItemValid(w.FullName) = False).ToList()

With a Stopwatch, the results are:

    Where/Any count: 535, time: 572ms  
FileItemValid count: 535, time: 9ms

You get similar results with a plain old For/Each loop that calls an IsValid function.


If you do not need the other FileInfo members, you could build AllFiles as a list of the same type as ValidFiles (FileItem here). The items can then be compared directly, which lets you use Except and Contains:

AllFiles2 = Directory.EnumerateFiles("M:\Music", "*.mp3", IO.SearchOption.AllDirectories).
            Select(Function(s) New FileItem(s)).ToList()

Now you can use Contains with middling results:

invalid2 = AllFiles2.Where(Function(w) Not ValidFiles.Contains(w)).ToList()

This also allows you to use Except which is simpler and faster:

invalid2 = AllFiles2.Except(ValidFiles).ToList()

 Where/Contains count: 535, time: 74ms
         Except count: 535, time: 3ms

Even if you need other FileInfo members later, you can easily fetch them from the file path, for example with New IO.FileInfo(path).
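Contains and Except compare whole FileItem objects, so they only find matches if FileItem defines value equality. The class is not shown in this answer, so the following is only a sketch of what it might look like: the FilePath property name is taken from the code above, while the case-insensitive comparison and the EqualityDemo module are assumptions for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical FileItem: a guess at the class's shape, not the original code.
Public Class FileItem
    Private ReadOnly _filePath As String

    Public ReadOnly Property FilePath As String
        Get
            Return _filePath
        End Get
    End Property

    Public Sub New(filePath As String)
        _filePath = filePath
    End Sub

    ' Two items are equal when their paths match, ignoring case (an assumption
    ' that suits Windows paths).
    Public Overrides Function Equals(obj As Object) As Boolean
        Dim other = TryCast(obj, FileItem)
        Return other IsNot Nothing AndAlso
            String.Equals(_filePath, other.FilePath, StringComparison.OrdinalIgnoreCase)
    End Function

    ' Must agree with Equals, because Except looks items up by hash code.
    Public Overrides Function GetHashCode() As Integer
        Return StringComparer.OrdinalIgnoreCase.GetHashCode(_filePath)
    End Function
End Class

Module EqualityDemo
    Sub Main()
        Dim allPaths As String() = {"a.mp3", "b.mp4", "c.mp3"}
        Dim validPaths As String() = {"a.mp3", "c.mp3"}

        Dim allItems = allPaths.Select(Function(s) New FileItem(s)).ToList()
        Dim validItems = validPaths.Select(Function(s) New FileItem(s)).ToList()

        ' Separate instances with the same path now count as the same item.
        Dim invalidItems = allItems.Except(validItems).ToList()

        For Each item In invalidItems
            Console.WriteLine(item.FilePath) ' prints b.mp4
        Next
    End Sub
End Module
```

Without these overrides a class falls back to reference equality, in which case Except would return every item in the first list.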

Upvotes: 1

Daniel Bişar

Reputation: 2763

Simply use Except as CraigW suggested. You have to do some projections (select) to get it done.

Dim InvalidFiles As IO.FileInfo() = AllFiles.
    Select(Function(p) p.FullName).
    Except(ValidFiles.Select(Function(x) x.fleInfo.FullName)).
    Select(Function(fullName) New IO.FileInfo(fullName)).
    ToArray()

Note: This code is neither efficient nor very readable, but it works.

But I would go for something like this:

Dim AllFiles As List(Of IO.FileInfo) = New IO.DirectoryInfo("C:\MyFiles").GetFiles("*.mp*").ToList
Dim ValidFiles As New List(Of fileStruct)
Dim InvalidFiles As New List(Of IO.FileInfo)

' One pass over the directory listing: each file goes into exactly one list
For Each fileInfo As IO.FileInfo In AllFiles
    If fileInfo.Extension.Contains("mp3") Then
        ValidFiles.Add(New fileStruct(prop1:=fileInfo.Name, path:=fileInfo.FullName))
    Else
        InvalidFiles.Add(fileInfo)
    End If
Next

Simple, fast and readable.
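If the two lists are built in different places and cannot be split in a single pass like this, a HashSet of the valid paths is a possible middle ground. The sketch below uses plain strings as stand-ins for the question's lists. The sample paths, the HashSetSketch module name and the case-insensitive comparer are assumptions for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module HashSetSketch
    Sub Main()
        ' Stand-ins for AllFiles and ValidFiles (full paths only).
        Dim allPaths As New List(Of String) From {"C:\Test\a.mp3", "C:\Test\b.mp4", "C:\Test\c.mp3"}
        Dim validPaths As New List(Of String) From {"C:\Test\a.mp3", "C:\Test\c.mp3"}

        ' Build the lookup once; each Contains call is then a hash lookup
        ' instead of a scan over the whole valid list.
        Dim validSet As New HashSet(Of String)(validPaths, StringComparer.OrdinalIgnoreCase)

        Dim invalidPaths = allPaths.Where(Function(path) Not validSet.Contains(path)).ToList()

        For Each invalidPath In invalidPaths
            Console.WriteLine(invalidPath) ' prints C:\Test\b.mp4
        Next
    End Sub
End Module
```

With the question's types, the set could be filled from ValidFiles.Select(Function(v) v.fleInfo.FullName) and the filter applied to AllFiles using file.FullName.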

Upvotes: 0

user3230660

As others have noted, .Except() is a better approach but here is an answer to your question:

List<int> list1 = new List<int> { 1, 2, 3 };

List<int> list2 = new List<int> { 3, 4, 5 };

List<int> list3 = list1.Where(list1value => !list2.Contains(list1value)).ToList();  // 1, 2

Based on the comments, here is an example using different types. This query uses .Any():

List<Product> list1 = new List<Product> { ... };

List<Vendor> list2 = new List<Vendor> { ... };

List<Product> list3 = list1.Where(product => !list2.Any(vendor => product.VendorID == vendor.ID)).ToList();  


// list3 will contain the products whose VendorID does not match the ID of any vendor in list2.

Upvotes: 0
