Mangrio
Mangrio

Reputation: 1030

C# - Copy rows from one DataTable by comparing with 2nd DataTable to 3rd DataTable

I have 4 DataTables in my C# code;

DataTable dt1;
DataTable dt2;
DataTable dt3;
DataTable dt4;

dt1 has following records

Id | Name | City
----------------
1 | Abc | Khi
2 | XYZ | Hyd
3 | TVW | Lhr
4 | tyz | Isb

dt2 has just ID records

Id
---
2
4

I need to copy only rows from dt1 which are also available in dt2 to dt3. also required to store not matching records in 4th dt4

I found this but does not know how to copy to datatable properly and also how not equals works here

var matchingRows = from s1 in dt1.AsEnumerable()
                               join s2 in d2.AsEnumerable() on s1.Field<int>("Id") equals s2.Field<int>("Id")
                               select s1;

Upvotes: 1

Views: 58

Answers (2)

NetMage
NetMage

Reputation: 26927

You can create two queries to find the matching and non-matching rows and then copy the rows into each answer DataTable:

var dt3 = dt1.Clone();
var dt4 = dt1.Clone();

var matchingRows = from s1 in dt1.AsEnumerable()
                   join s2 in dt2.AsEnumerable() on s1.Field<int>("Id") equals s2.Field<int>("Id")
                   select s1;

matchingRows.CopyToDataTable(dt3, LoadOption.OverwriteChanges);

var nonMatchingRows = from s1 in dt1.AsEnumerable()
                      where !dt2.AsEnumerable().Any(s2 => s2.Field<int>("Id") == s1.Field<int>("Id"))
                      select s1;

nonMatchingRows.CopyToDataTable(dt4, LoadOption.OverwriteChanges);

If dt2 has a large number of entries and performance is more important than memory, you could convert dt2 to a HashSet to optimize the match testing:

var dt2hs = new HashSet<int>(dt2.AsEnumerable().Select(r => r.Field<int>("Id")));

var matchingRows = from s1 in dt1.AsEnumerable()
                   where dt2hs.Contains(s1.Field<int>("Id"))
                   select s1;

var nonMatchingRows = from s1 in dt1.AsEnumerable()
                      where !dt2hs.Contains(s1.Field<int>("Id"))
                      select s1;

Another performance option would be to just scan dt1 once and then manually loop and select each destination:

var dt1jdt2 = from s1 in dt1.AsEnumerable()
              join s2 in dt2.AsEnumerable() on s1.Field<int>("Id") equals s2.Field<int>("Id") into s2j
              from s2 in s2j.DefaultIfEmpty()
              select new { s1, s2 };

foreach (var r in dt1jdt2) {
    if (r.s2 != null)
        dt3.Rows.Add(r.s1.ItemArray);
    else
        dt4.Rows.Add(r.s1.ItemArray);
}

Of course you can combine the HashSet and single scan options:

var dt2hs = new HashSet<int>(dt2.AsEnumerable().Select(r => r.Field<int>("Id")));

foreach (var s1 in dt1.AsEnumerable())
    if (dt2hs.Contains(s1.Field<int>("Id")))
        dt3.Rows.Add(s1.ItemArray);
    else
        dt4.Rows.Add(s1.ItemArray);

Upvotes: 2

jdweng
jdweng

Reputation: 34421

I do not like using var when it is not necessary. You are loosing track of what you are actually doing. mathcingRows is a List. So the task is simple :

           List<DataRow> matchingRows = (from s1 in dt1.AsEnumerable()
                                          join s2 in dt2.AsEnumerable() on s1.Field<int>("Id") equals s2.Field<int>("Id")
                                          select s1).ToList();
            foreach (DataRow row in matchingRows)
            {
                dt4.Rows.Add(row);
            }

Upvotes: 2

Related Questions