Xss
Xss

Reputation: 221

C# LINQ join with conditional where clause on two different data sets

I have two collections that I am comparing data against. I can join the two collections by ids. I need to have a where clause that returns a list of data where certain items in collection B were not found in collection A

    public class Contract
    {
        public string ContractId { get; set; }
        public IList InvoiceList { get; set; }

        public class Invoice
        {
            public int InvoiceNumber { get; set; }
        } 
    }

    public class PaymentRequest
    {
        public IList Contracts { get; set; }
        public class ContractList
        {       
            public string ContractId { get; set; }  
            public IList Invoices { get; set; }
        }

        public class InvoiceList
        {          
            public int InvoiceNumber { get; set; }
        }
    }

I have the following so far but can't quite get the where clause down.

    var query = (
    from request in (
        from contract in paymentRequest.Contracts                    
        from invoice in contract.Invoices
        select new { contract, invoice })
    join valid in (
        from contract in validContracts
        select new { contract })               
    on new { request.contract.ContractId } equals new { valid.contract.ContractId }
    where !(
        // get a list of invoice numbers in the request data set that are not in the valid data set
    )
    select "Contract Id: " + request.contract.ContractId +
           ", Invoice Number: " + request.invoice.InvoiceNumber
).ToList();

Any help is appreciated.

Upvotes: 2

Views: 1451

Answers (3)

Enigmativity
Enigmativity

Reputation: 117010

It seems to me that your query should look like this:

var query =
(
    from request in
    (
        from contract in paymentRequest.Contracts                    
        from invoice in contract.Invoices
        select new { contract, invoice }
    )
    join valid in
    (
        from contract in validContracts
        select new { contract }
    ) on new { request.contract.ContractId } equals new { valid.contract.ContractId } into gvs
    where
        gvs
            .SelectMany(gv => gv.contract.Invoices)
            .Select(i => i.InvoiceNumber)
            .All(n => n != request.invoice.InvoiceNumber)
    select "Contract Id: " + request.contract.ContractId +
           ", Invoice Number: " + request.invoice.InvoiceNumber
).ToList();

Upvotes: 0

T.S.
T.S.

Reputation: 19330

Using contains gets you less performance than something that reminds a regular Sql query. This will be better

var result = 
    from itm1 in Coll1
    from itm2 in Coll2.Where(x => x.Id == itm1.Id).DefaultIfEmpty()
    where itm2 == null
    select itm1;

This will give you all items in Coll1 that don't exist in Coll2. And this is gonna be faster than using Contains any time

Upvotes: 0

Raja Nadar
Raja Nadar

Reputation: 9489

var collectionAIds = new HashSet<int>(collectionA.Select(colAItem => colAItem.Id));
var itemsInCollectionBNotInCollectionA = collectionB.Where(colBItem => 
                                             !collectionAIds.Contains(colBItem.Id));

basically, we want to get the Ids in collection A, and then select all items from collection B, not in A's list of ids.

A HashSet is optional. it just avoids repetitive O(n) lookups, if you don't use that variable.

p.s. i am assuming int, as the type of id.. use the data type of id for the Hashset.

Upvotes: 1

Related Questions