Kobrien

Reputation: 85

Find Duplicate entries C#

I am new to programming so this may seem somewhat straightforward, but I cannot seem to figure it out.

I am trying to find duplicate values in one column of a DataTable.

Here is what I have tried so far:

DataRow[] dupresults = dt.Select("PROV_NEW");
TableIssues = string.Empty;
DataTable dtTemp = dt.DefaultView.ToTable(true, "NEW_PROV");

if (dupresults.Length == 0)
{
    return true;
}
else
{
    foreach (DataRow item in dupresults)
    {
        Console.WriteLine(item[1]);
        TableIssues += "Provider Code is not unique for " + item[1].ToString() + ". Revise non-unique codes.\r\n\n\n\n";
    }
    return false;
}

Alright, but I also want it to check that there are no empty fields in PROV_NEW, and I do not know where to put that check. I am very new to C#; I just started last week and am doing side projects for my father's company.

private bool ValidateTable(DataSets.Setup.SETUP_MWPROVDataTable dt, out string TableIssues)
    {
        try
        {
            //NewCode not used for other row
            DataRow[] result = dt.Select("PROV_NEW = ''");
            DataRow[] dupresults = dt.Select("PROV_NEW");
            TableIssues = string.Empty;
            DataTable dtTemp = dt.DefaultView.ToTable(true, "NEW_PROV");



            if (dupresults.Length == 0)
            {

                return true;
            }
            else
            {
                var duplicates = dt.AsEnumerable()
               .Select(dr => dr.Field<string>("PROV_NEW"))
               .GroupBy(x => x)
               .Where(g => g.Count() > 1)
               .Select(g => g.Key)
               .ToList();

                foreach (DataRow item in dupresults)
                {
                    Console.WriteLine(item[1]);
                    TableIssues += "Provider Code is not unique for " + item[1].ToString() + ". Revise non-unique codes.\r\n\n\n\n";
                }
                return false;
            }


            if (result.Length == 0)
            {
                //TODO: Add Next Step for validation

                return true;

            }
            else
            {
                foreach (DataRow item in result)
                {
                    Console.WriteLine(item[1]);
                    TableIssues += "Provider code " + item[1].ToString() + " is blank. Add new Provider code for " + item[1].ToString() +".\r\n\n\n";
                }


                return false;
            }

           }
        catch (Exception)
        {

            throw;
        }
    }


}

Upvotes: 5

Views: 24567

Answers (3)

Jon Skeet

Reputation: 1500675

LINQ can help you here:

var duplicates = dt.AsEnumerable()
                   .Select(dr => dr.Field<string>("PROV_NEW"))
                   .GroupBy(x => x)
                   .Where(g => g.Count() > 1)
                   .Select(g => g.Key)
                   .ToList();

// Now work with the set of duplicates

Alternatively:

HashSet<string> providers = new HashSet<string>();
foreach (var provider in dt.AsEnumerable()
                           .Select(dr => dr.Field<string>("PROV_NEW")))
{
    if (!providers.Add(provider))
    {
        // This provider is a duplicate
    }
}

(This works because HashSet<T>.Add returns false if the value already exists in the set.)
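
To tie this back to the question's blank-value check, here is a minimal sketch that runs both checks in one pass and reports every issue instead of returning after the first one. The class and method names (ProviderValidation, ValidateColumn) are hypothetical, and it assumes a plain DataTable whose column holds strings rather than the typed SETUP_MWPROVDataTable from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Text;

public static class ProviderValidation
{
    // Hypothetical helper: checks one string column for blank and duplicate values.
    // Returns true when no issues were found; otherwise 'issues' describes them.
    public static bool ValidateColumn(DataTable dt, string column, out string issues)
    {
        var report = new StringBuilder();

        // Read the column once. Field<string> returns null for DBNull cells.
        List<string> values = dt.AsEnumerable()
                                .Select(dr => dr.Field<string>(column))
                                .ToList();

        // Blank check: null, empty, or whitespace-only values.
        int blanks = values.Count(v => string.IsNullOrWhiteSpace(v));
        if (blanks > 0)
        {
            report.AppendLine(blanks + " row(s) have a blank " + column + ".");
        }

        // Duplicate check: group the non-blank values and keep keys seen more than once.
        IEnumerable<string> duplicates = values
            .Where(v => !string.IsNullOrWhiteSpace(v))
            .GroupBy(v => v)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key);

        foreach (string code in duplicates)
        {
            report.AppendLine("Provider Code is not unique for " + code + ". Revise non-unique codes.");
        }

        issues = report.ToString();
        return issues.Length == 0;
    }
}
```

Calling ValidateColumn(dt, "PROV_NEW", out TableIssues) would then replace both dt.Select calls. Blank values are excluded from the duplicate check so that several empty cells are reported once as blanks rather than again as duplicates.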

Upvotes: 20

Sajishanoop

Reputation: 21

Here dtEmp is your working DataTable. This returns a copy of it with duplicate rows removed:

DataTable distinctTable = dtEmp.DefaultView.ToTable( /*distinct*/ true);
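
Note that the call above removes duplicates rather than listing them. As a rough sketch, the same method can be limited to one column and the row counts compared to find out whether any duplicates exist at all. The names DistinctCheck and HasDuplicates are hypothetical.

```csharp
using System.Data;

public static class DistinctCheck
{
    // Hypothetical helper: true when 'column' contains at least one repeated value.
    public static bool HasDuplicates(DataTable dt, string column)
    {
        // ToTable(true, column) copies only that column and keeps distinct values.
        DataTable distinct = dt.DefaultView.ToTable(true, column);

        // Compare against the view's own row count so any RowFilter is respected.
        return distinct.Rows.Count < dt.DefaultView.Count;
    }
}
```

This only answers yes or no. To learn which values are repeated, a grouping or HashSet approach is still needed.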

Upvotes: 1

Deepak Sahu

Reputation: 546

Building on Jon Skeet's LINQ approach, I grouped by an anonymous type to find rows that are duplicated across multiple columns. Hope it helps you too:

DataTable dt_ = _data.Tables["MyTable"];

// Group rows by the combination of columns that must be unique,
// keep only groups with more than one row, then flatten back to rows.
foreach (DataRow _dr in dt_.AsEnumerable()
    .GroupBy(r => new
    {
        c1 = r.Field<string>("ColNAME1 of table dt_"),
        c2 = r.Field<string>("ColNAME2 of table dt_"),
        c3 = r.Field<string>("ColNAME3 of table dt_"),
        // ...any number of columns can be added
    })
    .Where(grp => grp.Count() > 1)
    .SelectMany(itm => itm))
{
    // Handle your duplicate row entry
}
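
For a concrete version of the multi-column grouping, here is a small sketch limited to two string columns. The class and method names (MultiColumnDuplicates, Find) and the parameter names are made up for illustration. It relies on anonymous types comparing equal when all of their properties are equal, which is what lets GroupBy treat a pair of column values as one key.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class MultiColumnDuplicates
{
    // Hypothetical helper: returns every row whose (first, second) pair
    // also appears in at least one other row.
    public static List<DataRow> Find(DataTable dt, string first, string second)
    {
        return dt.AsEnumerable()
                 .GroupBy(r => new
                 {
                     A = r.Field<string>(first),
                     B = r.Field<string>(second)
                 })
                 .Where(g => g.Count() > 1)   // keep only repeated pairs
                 .SelectMany(g => g)          // flatten groups back into rows
                 .ToList();
    }
}
```

The result includes all rows in each duplicated group, not just the second and later occurrences, so the first row of a pair is reported too.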

Upvotes: 4
