Hakuna N
Hakuna N

Reputation: 181

Compare two DataTables and select first table records NOT in the second table

I have two DataTables that are getting data from two different sources. I would want to have a new DataTable Table3 with ONLY Tables1 rows that do not have matching Table2 rows. My two DataTables have different structures as indicated below. However, they all share the same primary key - CarReg. So, I would like to compare rows using the CarReg column and return all the rows that are in Table1 but NOT in Table2. My returned Table3 DataTable will have same structure as Table1.

Table1.Columns.Add("CarReg", typeof(string));
Table1.Columns.Add("Site", typeof(string));
Table1.Columns.Add("Route", typeof(double));
Table1.Columns.Add("Driver", typeof(string));
Table1.Columns.Add("StartingDate", typeof(string));

Table2.Columns.Add("CarReg", typeof(string));
Table2.Columns.Add("SITE DESC", typeof(string));
Table2.Columns.Add("Route DESC", typeof(double));
Table2.Columns.Add("Driver", typeof(string));
Table2.Columns.Add("KILOS", typeof(string));

I have tried the below, however, I am getting all records from the first table Table1. I need to get only records that in Table1 but are not in Table2. So, for example if Table1 have 20 records and Tables2 have 15 records, I need to have only the 5 records. Please help.

var recordsNotInB = TableA.AsEnumerable().Select(r =>r.Field<string>("CarReg").Except(TableB.AsEnumerable().Select(r => r.Field<string>("CarReg")));

Upvotes: 1

Views: 2319

Answers (4)

Ognjen Babic
Ognjen Babic

Reputation: 767

You can try it with next code snippet:

        var result = new List<DataRow>();
        //convert to list to avoid multiple enumerations
        var table2List = Table2.AsEnumerable().ToList();

        foreach(var row in Table1.AsEnumerable())
        {
            var matchingRow = table2List.FirstOrDefault(x => x["CarReg"] == row["CarReg"]);

            if(matchingRow == null)
            {
                result.Add(row);
            }
        }  

You should get collection of DataRows from Table1 that are not in Table2 based on CarReg field.

Upvotes: 0

Yawar Murtaza
Yawar Murtaza

Reputation: 3865

Okay this is a long winded solution but i have tested it and it works.

I have done it by creating POCO objects for each Table then taken the differnece as its easier to play with public properties than using Table1["CarReg"] which is prone to errors.

May be someone can improve this solution.

Let your tables to be represented by POCO classes like:

public class Table1
{
    public string CarReg { get; set; }
    public string Site { get; set; }
    public double Route { get; set; }
    public string Driver { get; set; }
    public string DateString { get; set; }
}

public class Table2
{
    public string CarReg { get; set; }
    public string Site { get; set; }
    public double Route { get; set; }
    public string Driver { get; set; }
    public string Kilos { get; set; }
}

Lets populate the data:

IEnumerable<Table1> data1 = new List<Table1>()
            {
                new Table1() { CarReg = "123ABC", DateString = "20/02/2018", Driver = "Driver 1", Route = 45.45, Site = "England" },
                new Table1() { CarReg = "456ABC", DateString = "20/03/2018", Driver = "Driver 2", Route = 55.45, Site = "Scotland" },
                new Table1() { CarReg = "789ABC", DateString = "20/04/2018", Driver = "Driver 3", Route = 65.45, Site = "Wales" },
            };

            IEnumerable<Table2> data2 = new List<Table2>() {
                new Table2() { CarReg = "123XYZ", Kilos = "34KG", Driver = "Driver 5", Route = 45.45, Site = "Karachi" },
                new Table2() { CarReg = "456ABC", Kilos = "44KG", Driver = "Driver 2", Route = 55.45, Site = "Scotland" },
                new Table2() { CarReg = "789CCC", Kilos = "54KG", Driver = "Driver 7", Route = 65.45, Site = "Hyderabad" },
            };

Create a list of to hold the result data:

 List<Table1> oneList = new List<Table1>();

Loop through to identify the rows that are not in Table2

  bool matchFound = false; // an indicator if match is found in table 2

        foreach (var item in data1)
        {
            foreach (var item2 in data2)
            {
                if (item.CarReg != item2.CarReg)
                {
                    matchFound = false;
                }
                else
                {
                    matchFound = true;
                    break;
                }
            }
            if (!matchFound)
            {
                if (!oneList.Contains(item))
                {
                    oneList.Add(item);
                }
            }

        }

Hope this help!

Upvotes: 0

Georg
Georg

Reputation: 2096

You can improve performance and also specify the comparison method using a hash set.

var idsFromTableB = new HashSet<string>(TableB.AsEnumerable()
    .Select(tb => tb.Field<string>("CarReg")), StringComparer.OrdinalIgnoreCase);

var recordsNotInB = TableA.AsEnumerable()
    .Where(ta => !idsFromTableB.Contains(ta.Field<string>("CarReg")));

Upvotes: 1

yan.kun
yan.kun

Reputation: 6908

Wihtout knowing any more of your code, I would come up with this.

var idsFromTableB = TableB.AsEnumerable().Select(tb => tb.Field<string>("CarReg"));
var recordsNotInB = TableA.AsEnumerable().Where(ta => !idsFromTableB.Contains(ta.Field<string>("CarReg")));

Upvotes: 1

Related Questions