Daniel Waghorn
Daniel Waghorn

Reputation: 2985

Using LINQ to find updated rows in DataTable

I'm building an application in VB.NET where I am pushing data from one database to another. The source database is SQL Server and the target is MySQL.

What I am doing is first creating DataTables for each table in each database which I use to do a comparison. I've written the queries in such a way so that the source and target DataTables contain exactly the same columns and values to make the comparison easier.

This side of the application works fine. What I do next is find rows which do not exist in the target database by finding PKs which do not exist. I then insert these new rows into the target database with no problem.

The Problem

What I now need to do is find rows in each table that have been updated, i.e. are not identical to the corresponding rows in the target DataTable. I have tried using Except() as per the example below:

Public Function GetUpdates(ByVal DSDataSet As MSSQLQuery, ByVal AADataSet As MySQLQuery, Optional ByVal PK As String = Nothing) As List(Of DataRow)
    ' Determines records to be updated in the AADB and returns list of new Rows
    ' Param DSDataSet - MSSQLQuery Object for source table
    ' Param AADataSet - MySQLQuery Object for destination table
    ' Optional Param PK - String of name common columns to treat as PK
    ' Returns List(Of DataRow) containing rows to update in table

    Dim orig = DSDataSet.GetDataset()
    Dim origTable = orig.Tables(0).AsEnumerable()
    Dim destination = AADataSet.GetDataset()
    Dim destinationTable = destination.Tables(0).AsEnumerable()

    ' Get Records which are not in destination table
    Dim ChangedRows = Nothing

    If IsNothing(PK) Then
        ChangedRows = destinationTable.AsEnumerable().Except(origTable.AsEnumerable(), DataRowComparer.Default)
    End If

    Dim List As New List(Of DataRow)

    For Each addRow In ChangedRows
        List.Add(addRow)
    Next

    Return List
End Function

The trouble is that it ends up simply returning the entire set of source rows.

How can I check for these changed rows? I could always hardcode queries to return what I want but this introduces problems because I need to make comparisons for 15 tables so it would be a complete mess.

Ideally I need a solution where it will take into account the variable number columns from the source tables for comparison against what is essentially an identical target table and simply compare the DataRows for equality.

There should be a corresponding row in the target tables for every source row since the addition of new rows is performed prior to this check for updated rows.

I am also open to using methods other than LINQ to achieve this.


Solution

In the end I implemented a custom comparer to use in the query as shown below. It first checks if the first column value matches (PK in my case) where if it does then it we check column-wise that everything matches.

Any discrepancy will set the flag value to FALSE which we return. If there aren't any issues then TRUE will be returned. In this case I used = to compare equality between values rather than Equals() since I'm not concerned about a strict equality.

The resulting set of DataRows is used to UPDATE the database using the first column value (PK) in the WHERE clause.

Imports System.Data

Class MyDataRowComparer
  Inherits EqualityComparer(Of DataRow)

Public Overloads Overrides Function Equals(x As DataRow, y As DataRow) As Boolean
    If x.Item(0).ToString().Equals(y.Item(0).ToString()) Then
        ' If PK matches then check column-wise.
        Dim Flag As Boolean = True
        For Counter As Integer = 0 To x.ItemArray.Count - 1
            If Not x.Item(Counter) = y.Item(Counter) Then
                Flag = False
            End If
        Next
        Return Flag
    Else
        ' Otherwise don't bother and just skip.
        Return False
    End If
End Function

...
End Class

Upvotes: 2

Views: 797

Answers (1)

Mrinal Kamboj
Mrinal Kamboj

Reputation: 11478

class MyDataRowComparer : IEqualityComparer<DataRow>
{      

    public bool Equals(DataRow x, DataRow y)
    {
        return x["ColumnName"].Equals(y["ColumnName"]);
        // Can add more columns to the Comparison
    }

    public int GetHashCode(DataRow obj)
    {
        return obj["ColumnName"].GetHashCode();
       // Can add more columns to calculate HashCode
    }    
}

Now the Except statement will be like:

ChangedRows = destinationTable.AsEnumerable()
             .Except(origTable.AsEnumerable(), MyDataRowComparer)

Upvotes: 1

Related Questions