xyz
xyz

Reputation: 549

Insert only the duplicate records from csv in to the Database

I am working on working on validating the csv before inserting them in to the database. I have done the piece of inserting them in to the database but not sure how can I validate for duplicate record and insert only the duplicate records in to the database.The Csv looks like below and we can receive the ProdID as duplicates in CSV

ProdID,Name,Color,Availability
P01,Product1,Red,Yes
P02,Product2,Blue,Yes
P03,Product3,Yellow,No
P01,Product4,Red,Yes
P04,Product5,Black,Yes

The table has five fields and one of the field is is the Duplicate field

class Products
{
    static void Main(string[] args)
    {
        string location = "\Products.csv";
        List<Products> ProductsList= ReadInCSV(location);
        var connStr = ConfigurationManager.ConnectionStrings["ProgConnectionString"].ConnectionString;

        // Need to validate them before putting them in to the table 

        string sqlString = "INSERT INTO PRODUCTS (ProdID,Name,Color,Availability,Duplicate) VALUES (@ProdID,@Name,@Color,@Availability,@Duplicate );";
        for (int i = 0; i < ProductsList.Count; i++)
        {
            using (MySqlConnection conn = new MySqlConnection(connStr))
            using (MySqlCommand comm = new MySqlCommand(sqlString, conn))
            {
                comm.Parameters.Add("@ProdID", MySqlDbType.VarChar).Value = ProductsList[i].ProdID;
                comm.Parameters.Add("@Name", MySqlDbType.VarChar).Value = ProductsList[i].Name;
                comm.Parameters.Add("@Color", MySqlDbType.VarChar).Value = ProductsList[i].Color;
                comm.Parameters.Add("@Availability", MySqlDbType.VarChar).Value = ProductsList[i].Availability;
                comm.Parameters.Add("@Duplicate", MySqlDbType.VarChar).Value = "Yes";
                conn.Open();
                comm.ExecuteNonQuery();
                conn.Close();
            }
        }
    }
}

I am not sure how can I check for the duplicate records in the csv and do an insert only to the records that are duplicate.

Upvotes: 0

Views: 66

Answers (2)

maccettura
maccettura

Reputation: 10818

Based on your comment you only want to insert the two records that are duplicated. In your example p01.

This can be solved with a simple LINQ query:

ProductsList.GroupBy(p => p.ProdID).Where(p => p.Count() > 1).SelectMany(x => x);

This will group the items by their respective ProdID's, then we select only the instances where the group contains more than one instance.

Use this in place of your for loop:

foreach(var prod in ProductsList.GroupBy(p => p.ProdID).Where(p => p.Count() > 1).SelectMany(x => x))
{
    using (MySqlConnection conn = new MySqlConnection(connStr))
    using (MySqlCommand comm = new MySqlCommand(sqlString, conn))
    {
        comm.Parameters.Add("@ProdID", MySqlDbType.VarChar).Value = prod.ProdID;
        comm.Parameters.Add("@Name", MySqlDbType.VarChar).Value = prod.Name;
        comm.Parameters.Add("@Color", MySqlDbType.VarChar).Value = prod.Color;
        comm.Parameters.Add("@Availability", MySqlDbType.VarChar).Value = prod.Availability;
        comm.Parameters.Add("@Duplicate", MySqlDbType.VarChar).Value = "Yes";
        conn.Open();
        comm.ExecuteNonQuery();
        conn.Close();
    }
}

I have created a fiddle to demo

Upvotes: 3

Techno
Techno

Reputation: 180

I think you can add "where @duplicate like 'yes'" in the Query.

Upvotes: 0

Related Questions