nitinvertigo

Reputation: 1180

Filter two lists on one property in C# using LINQ

I have two objects namely Card and Transaction:

Card:
public string CardID {get; set;}
public string TransactionRef {get; set;}

Transaction:
public string TxnID {get; set;}
public string TxnDetails {get; set;}

Note: The TransactionRef is of the format Date|TxnID

I also have a list of the two objects List<Card> cardDetails and List<Transaction> transDetails

cardDetails:
{CardID = '1', TransactionRef = '20150824|Guid1'}
{CardID = '2', TransactionRef = '20150824|Guid2'}
{CardID = '3', TransactionRef = '20150824|Guid3'}

transDetails:
{TxnID = '23', TxnDetails = 'Guid1'}
{TxnID = '24', TxnDetails = 'Guid2'}

I want to filter cardDetails using transDetails based on TxnDetails, so that only the cards whose TransactionRef does not contain any TxnDetails from the second list remain.

This should be the output:

cardDetails:
 {CardID = '3', TransactionRef = '20150824|Guid3'}

I have tried like this using linq:

  cardDetails = cardDetails.Where(x => transDetails.Any(y => x.TransactionRef.Contains(y.TxnDetails) == false)).ToList();

but it always returns the list as blank. I have tried many variants of this query without success. I know this question has been asked before, but even after searching for those questions and trying out their solutions I am still unable to get it right.

Can anyone suggest what is wrong with my query?

Note: One thing I forgot to mention is that these lists can contain thousands of records, so performance is also important.

Upvotes: 7

Views: 15047

Answers (6)

WholeLifeLearner

Reputation: 455

Using method chaining syntax for LINQ:

List<Card> result = cardDetails.Where(
    card => !transDetails.Exists(
         tran => tran.TxnDetails == card.TransactionRef.Split('|')[1]
)).ToList();

What's wrong with your query?

 cardDetails = cardDetails.Where(x => transDetails.Any(y => x.TransactionRef.Contains(y.TxnDetails) == false)).ToList();

This is what you've written:

Find all Cards that satisfy this condition: is there any Transaction in my list of transactions whose TxnDetails cannot be found in the TransactionRef of this particular Card?

I can see a problem here:

If any transaction has TxnDetails that differ from the Card's (and chances are quite high), the Card is returned.

So you basically get all cards back from your query whenever your Transaction list contains at least two different TxnDetails values.
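
To make the difference concrete, here is a minimal sketch that runs both predicates over the sample data from the question. The QuantifierDemo class and its method names are made up for illustration, and Card and Transaction are reduced to the properties shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Card
{
    public string CardID { get; set; }
    public string TransactionRef { get; set; }
}

public class Transaction
{
    public string TxnID { get; set; }
    public string TxnDetails { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class QuantifierDemo
{
    // The query from the question: "SOME transaction is not contained in this card".
    public static List<Card> AnyNotContained(List<Card> cards, List<Transaction> txns) =>
        cards.Where(c => txns.Any(t => !c.TransactionRef.Contains(t.TxnDetails))).ToList();

    // The intended query: "NO transaction is contained in this card".
    public static List<Card> NoneContained(List<Card> cards, List<Transaction> txns) =>
        cards.Where(c => !txns.Any(t => c.TransactionRef.Contains(t.TxnDetails))).ToList();

    public static void Main()
    {
        var cards = new List<Card>
        {
            new Card { CardID = "1", TransactionRef = "20150824|Guid1" },
            new Card { CardID = "2", TransactionRef = "20150824|Guid2" },
            new Card { CardID = "3", TransactionRef = "20150824|Guid3" }
        };
        var txns = new List<Transaction>
        {
            new Transaction { TxnID = "23", TxnDetails = "Guid1" },
            new Transaction { TxnID = "24", TxnDetails = "Guid2" }
        };

        // Every card misses at least one of the two transactions, so all three pass.
        Console.WriteLine(AnyNotContained(cards, txns).Count); // 3
        // Only card 3 matches none of the transactions.
        Console.WriteLine(NoneContained(cards, txns).Count);   // 1
    }
}
```

The only difference between the two methods is where the negation sits: inside the Any lambda versus in front of the Any call.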

Upvotes: 2

Harald Coppoolse

Reputation: 30464

If performance is important, I suggest you first give class Card a property that returns the part after the '|' character. Depending on how often you want to run this query compared with how often you construct a Card, it might even be wise to let the constructor separate the transactionRef into a part before the '|' and a part after the '|'.

Whichever method you choose is not important for the query. Let's suppose class Card has a property:

string Guid { get { return ...; } }

I understand that you want a sequence of all Cards from the sequence cardDetails that don't have a Guid that equals any of the TxnDetails of the Transactions in the sequence of transDetails.

Or in other words: if you would make a sequence of all used guids in TxnDetails, you want all Cards in CardDetails that have a guid that is not in the sequence of all used guids.

You could use Any() for this, but that would mean that you have to search the transDetails sequence for every card you'd want to check.

Whenever you have to check whether any specific item is in a sequence or not, it is better to convert the sequence once to a Dictionary or a HashSet. Which one you create depends on whether you only need the key or also the element that has the key. Create the dictionary / hashset only once, and lookups for an item by key are then very fast.

In our case we only want a sequence with used guids, it doesn't matter in which Transaction it is used.

var usedGuids = transDetails.Select(transDetail => transDetail.TxnDetails).Distinct();
var hashedGuids = new HashSet<string>(usedGuids);

(I made two statements to make it easier to understand what is done)

Now, whenever I have a GUID I can check very fast if it is used or not:

bool guidIsUsed = hashedGuids.Contains(myGuid);

So your sequence of Cards in cardDetails with a GUID that is not in transDetails is:

var hashedGuids = new HashSet<string>(transDetails.Select(transDetail => transDetail.TxnDetails).Distinct());
var requestedCards = cardDetails.Where(card => !hashedGuids.Contains(card.Guid));
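
Putting the pieces together, a minimal sketch could look like the following. It assumes TransactionRef always has the form Date|TxnID, and the Substring-based body of the Guid property and the CardFilter class are only one possible way to fill in the parts left open above, not something taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Card
{
    public string CardID { get; set; }
    public string TransactionRef { get; set; }

    // Assumption: TransactionRef has the form "Date|TxnID",
    // so everything after the first '|' is the value to compare.
    public string Guid => TransactionRef.Substring(TransactionRef.IndexOf('|') + 1);
}

public class Transaction
{
    public string TxnID { get; set; }
    public string TxnDetails { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class CardFilter
{
    public static List<Card> WithoutTransaction(IEnumerable<Card> cards, IEnumerable<Transaction> txns)
    {
        // Build the set once; each Contains call is then a hash lookup
        // instead of a scan over all transactions.
        var hashedGuids = new HashSet<string>(txns.Select(t => t.TxnDetails));
        return cards.Where(card => !hashedGuids.Contains(card.Guid)).ToList();
    }

    public static void Main()
    {
        var cardDetails = new List<Card>
        {
            new Card { CardID = "1", TransactionRef = "20150824|Guid1" },
            new Card { CardID = "2", TransactionRef = "20150824|Guid2" },
            new Card { CardID = "3", TransactionRef = "20150824|Guid3" }
        };
        var transDetails = new List<Transaction>
        {
            new Transaction { TxnID = "23", TxnDetails = "Guid1" },
            new Transaction { TxnID = "24", TxnDetails = "Guid2" }
        };

        foreach (var card in WithoutTransaction(cardDetails, transDetails))
            Console.WriteLine(card.CardID + ": " + card.TransactionRef); // 3: 20150824|Guid3
    }
}
```

The HashSet constructor already discards duplicates, so the Distinct() call is optional in this version.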

Upvotes: 0

dcastro

Reputation: 68660

This should do it

var cards = 
    from card in cardDetails
    let txnDetails = GetTxnDetails(card)
    where ! transDetails.Any(t => t.TxnDetails == txnDetails)
    select card;


static string GetTxnDetails(Card card)
{
    return card.TransactionRef.Split('|')[1];
}

Fiddle: https://dotnetfiddle.net/b9ylFe


One way to optimize this a bit would be to store all the possible transaction details in a hash set upfront. The lookup should then be pretty close to O(1) (assuming a fair hash code distribution) instead of O(k), bringing the overall complexity of the algorithm from O(n * k) down to O(n + k), where n is the number of cards and k the number of transactions.

var allTxnDetails = new HashSet<string>(transDetails.Select(t => t.TxnDetails));

var cards = 
    from card in cardDetails
    let txnDetails = GetTxnDetails(card)
    where ! allTxnDetails.Contains(txnDetails)
    select card;

Fiddle: https://dotnetfiddle.net/hTYCbj

Upvotes: 6

Saeb Amini

Reputation: 24400

This query should do the trick:

// Get all card details whose transactionrefs don't contain txndetails from the second list
cardDetails.Where(cd => transDetails.All(ts => !cd.TransactionRef.EndsWith(ts.TxnDetails)))
    .ToList();

But is there any specific reason why you are combining two pieces of data in one field? I suggest breaking the TransactionRef field in your Card class into two fields: TransactionDate and TransactionID to avoid string manipulation in queries.
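
As a rough sketch of that suggestion: the TransactionDate and TransactionID property names come from the paragraph above, while the FromTransactionRef factory and the SplitFieldDemo class are hypothetical helpers added here for illustration. It assumes the combined value still arrives as Date|TxnID from somewhere and is split exactly once, when the Card is created.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Card
{
    public string CardID { get; set; }
    public string TransactionDate { get; set; }
    public string TransactionID { get; set; }

    // Hypothetical factory: splits "Date|TxnID" once, at construction time.
    public static Card FromTransactionRef(string cardId, string transactionRef)
    {
        var parts = transactionRef.Split(new[] { '|' }, 2);
        return new Card
        {
            CardID = cardId,
            TransactionDate = parts[0],
            TransactionID = parts.Length > 1 ? parts[1] : ""
        };
    }
}

public class Transaction
{
    public string TxnID { get; set; }
    public string TxnDetails { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class SplitFieldDemo
{
    // With a dedicated TransactionID field the query is a plain equality check.
    public static List<Card> Unmatched(List<Card> cards, List<Transaction> txns) =>
        cards.Where(cd => txns.All(ts => ts.TxnDetails != cd.TransactionID)).ToList();

    public static void Main()
    {
        var cardDetails = new List<Card>
        {
            Card.FromTransactionRef("1", "20150824|Guid1"),
            Card.FromTransactionRef("2", "20150824|Guid2"),
            Card.FromTransactionRef("3", "20150824|Guid3")
        };
        var transDetails = new List<Transaction>
        {
            new Transaction { TxnID = "23", TxnDetails = "Guid1" },
            new Transaction { TxnID = "24", TxnDetails = "Guid2" }
        };

        foreach (var card in Unmatched(cardDetails, transDetails))
            Console.WriteLine(card.CardID + ": " + card.TransactionID); // 3: Guid3
    }
}
```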

Upvotes: 3

Raphaël Althaus

Reputation: 60493

It's just a parenthesis problem: the == false should come after the second closing parenthesis, not the first one.

cardDetails = cardDetails.Where(x => transDetails.Any(y => x.TransactionRef.Contains(y.TxnDetails)) == false).ToList();

Because with your current code, you are doing the opposite of what you want!

You can also write

cardDetails = cardDetails.Where(x => !transDetails.Any(y => x.TransactionRef.Contains(y.TxnDetails))).ToList();

or use any of the improvements suggested, but your code is basically really close to correct ;)

Upvotes: 1

Matthew Watson

Reputation: 109567

How about this?

var results = cardDetails.Where(
    card => !transDetails.Any(
        trans => card.TransactionRef.EndsWith("|" + trans.TxnDetails)));

Full demo:

using System;
using System.Linq;

namespace Demo
{
    class Card
    {
        public string CardID;
        public string TransactionRef;
    }

    class Transaction
    {
        public string TxnID;
        public string TxnDetails;
    }

    internal class Program
    {
        private static void Main()
        {
            var cardDetails = new[]
            {
                new Card {CardID = "1", TransactionRef = "20150824|Guid1"},
                new Card {CardID = "2", TransactionRef = "20150824|Guid2"},
                new Card {CardID = "3", TransactionRef = "20150824|Guid3"}
            };

            var transDetails = new[]
            {
                new Transaction {TxnID = "23", TxnDetails = "Guid1"},
                new Transaction {TxnID = "24", TxnDetails = "Guid2"}
            };

            var results = cardDetails.Where(card => !transDetails.Any(trans => card.TransactionRef.EndsWith("|" + trans.TxnDetails)));

            foreach (var item in results)
                Console.WriteLine(item.CardID + ": " + item.TransactionRef);    
        }
    }
}

Upvotes: 2
