LewSim

Reputation: 327

Upload CSV file, parse, then insert data into SQL table

I have a webform with an upload button to upload a CSV file; my code then needs to parse the file and use the parsed data to insert into a SQL table. Is what I'm doing correct for parsing the data into a List? It's not picking up the filename for the StreamReader. Is this the most effective way to parse the data? Should I parse into a DataTable instead?

protected void UploadBtn_Click(object sender, EventArgs e)
{
    if (FileUpload.HasFile)
    {

        string filename = Path.GetFileName(FileUpload.FileName);
        List<string[]> ValuesToUpload = parseData(filename);

        //if (!Directory.Exists(ConfigurationManager.AppSettings["temp_dir"].ToString().Trim()))
        //{
        //    Directory.CreateDirectory(ConfigurationManager.AppSettings["temp_dir"].ToString().Trim());
        //}
        //FileUpload.SaveAs(ConfigurationManager.AppSettings["temp_dir"].ToString().Trim() + filename);
        //using (FileStream stream = new FileStream(ConfigurationManager.AppSettings["temp_dir"].ToString().Trim() + filename, FileMode.Open, FileAccess.Read, FileShare.Read))
    }
}
public List<string[]> parseData(string filename)
{
    int j = 0;
    List<string[]> members = new List<string[]>();
    try
    {
        using (StreamReader read = new StreamReader(filename))
        {
            while (!read.EndOfStream)
            {
                string line = read.ReadLine();
                string[] values = line.Split(',');
                if (j == 0)   // skip the header row
                {
                    j++;
                    continue;
                }

                long memnbr = Convert.ToInt64(values[0]);
                int loannbr = Convert.ToInt32(values[1]);
                int propval = Convert.ToInt32(values[2]);
                members.Add(values);
            }
        }
    }
    catch (Exception)
    {
        // TODO: handle or log the error
        throw;
    }
    return members;
}
Upvotes: 1

Views: 2927

Answers (4)

Dennis

Reputation: 20571

The code sample below will bulk-insert CSV data into a staging table that has matching columns, then execute a stored procedure to normalize the data on the server.

This is significantly more efficient than manually parsing the data and inserting it line by line. A few months ago I used similar code to submit 1,500,000+ records to our database and normalize the data in a matter of seconds.

var sqlConnection = new SqlConnection(DbConnectionStringInternal);

// Bulk-import our unnormalized data from the .csv file into a staging table
var inputFileConnectionString = String.Format("Driver={{Microsoft Text Driver (*.txt; *.csv)}};Extensions=csv;Readonly=True;Dbq={0}", Path.GetDirectoryName(csvFilePath));
using (var inputFileConnection = new OdbcConnection(inputFileConnectionString))
{
    inputFileConnection.Open();

    var selectCommandText = String.Format("SELECT * FROM {0}", Path.GetFileName(csvFilePath));
    var selectCommand = new OdbcCommand(selectCommandText, inputFileConnection);
    var inputDataReader = selectCommand.ExecuteReader(CommandBehavior.CloseConnection);

    var sqlBulkCopy = new SqlBulkCopy(sqlConnection) { DestinationTableName = "Data_Staging" };    
    if (sqlConnection.State != ConnectionState.Open)
        sqlConnection.Open();

    sqlBulkCopy.WriteToServer(inputDataReader);    
}

// Run a stored-procedure to normalize the data in the staging table, then efficiently move it across to the "real" tables.
var addDataFromStagingTable = "EXEC SP_AddDataFromStagingTable";
if (sqlConnection.State != ConnectionState.Open)
    sqlConnection.Open();

using (var addToStagingTableCommand = new SqlCommand(addDataFromStagingTable, sqlConnection) { CommandTimeout = 60 * 20 })
    addToStagingTableCommand.ExecuteNonQuery();    

sqlConnection.Close();

Upvotes: 0

Victor Zakharov

Reputation: 26424

Use KBCsv. We are getting 40K rows parsed per second, and 70K+ rows skipped per second. This is the fastest I have seen, and it's also pretty stable. Then generate the SQL manually, as suggested above. If you're doing a data reload and aiming for performance, run it multi-threaded with no transaction (MS SQL only). You can get up to 10K rows per second of import speed, depending on your network bandwidth to the database server.

Do not parse to DataTable - it is very slow.
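
For what it's worth, a minimal sketch of the chunked, multi-threaded, no-transaction load described above might look like the following. The Members table, its column names, the chunk size, and the use of File.ReadLines in place of KBCsv are all assumptions for illustration, not part of the original answer:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

static class CsvLoader
{
    // Illustrative sketch: chunked, multi-threaded INSERTs with no transaction.
    // "Members", its columns, and the chunk size are hypothetical placeholders.
    public static void LoadMultiThreaded(string csvPath, string connectionString)
    {
        // Parse the file (KBCsv would replace this step); skip the header row.
        List<string[]> rows = File.ReadLines(csvPath)
            .Skip(1)
            .Select(line => line.Split(','))
            .ToList();

        // SQL Server caps a single VALUES list at 1000 rows, so stay below that.
        const int chunkSize = 500;
        var chunks = new List<List<string[]>>();
        for (int i = 0; i < rows.Count; i += chunkSize)
            chunks.Add(rows.GetRange(i, Math.Min(chunkSize, rows.Count - i)));

        // One connection per worker, no explicit transaction.
        Parallel.ForEach(chunks, chunk =>
        {
            // Note: string concatenation is only safe for trusted numeric data;
            // use parameters or SqlBulkCopy for anything else.
            var sql = new StringBuilder("INSERT INTO Members (MemNbr, LoanNbr, PropVal) VALUES ");
            sql.Append(String.Join(",", chunk.Select(v =>
                String.Format("({0},{1},{2})", v[0], v[1], v[2]))));

            using (var cn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql.ToString(), cn))
            {
                cn.Open();
                cmd.ExecuteNonQuery();
            }
        });
    }
}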

Upvotes: 1

spender

Reputation: 120460

Not really an answer, but too long to post as a comment...

Since it looks like you're throwing away your parsed values (memnbr, etc.), you could significantly reduce your CSV parsing code to:

return 
 File
   .ReadLines(filename)
   .Skip(1)
   .Select(line => line.Split(','))
   .ToList();
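
If the parsed values are actually needed later, a variant of this (illustrative only; the Member class below is made up, not from the question) could project each line into a typed object instead:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical record type matching the three parsed columns.
public class Member
{
    public long MemNbr { get; set; }
    public int LoanNbr { get; set; }
    public int PropVal { get; set; }
}

public static List<Member> ParseMembers(string filename)
{
    return File
        .ReadLines(filename)
        .Skip(1)                           // skip the header row
        .Select(line => line.Split(','))
        .Select(values => new Member
        {
            MemNbr = Convert.ToInt64(values[0]),
            LoanNbr = Convert.ToInt32(values[1]),
            PropVal = Convert.ToInt32(values[2])
        })
        .ToList();
}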

Upvotes: 0

NPToita

Reputation: 238

Since you're going to insert the data into a SQL table, I'd first create a class that represents the table and create a new object for each record (this is for clarity).

Alternatively, you could use one of the following approaches (assuming you're using MS SQL Server):

1. The Dynamic Insert Query

StringBuilder strInsertValues = new StringBuilder("VALUES ");

// ... your parsing code here ...
string[] values = line.Split(',');
strInsertValues.AppendFormat("({0},{1},{2}),", values[0], values[1], values[2]);
// ... end of parsing loop ...

using (SqlConnection cn = new SqlConnection(YOUR_CONNECTION_STRING))
{
    SqlCommand cmd = cn.CreateCommand();
    cmd.CommandType = CommandType.Text;
    // Substring trims the trailing comma from the VALUES list.
    // Note: build values this way only for trusted numeric data; otherwise use parameters.
    cmd.CommandText = "INSERT INTO YOUR_TABLE (Column1, Column2, Column3) "
        + strInsertValues.ToString().Substring(0, strInsertValues.Length - 1);
    cn.Open();
    cmd.ExecuteNonQuery();
}

2. Use BulkCopy (recommended)

- Create a DataTable that represents your CSV values
- Add a new row for each parsed line
- Create column mappings between your DataTable and the SQL table
- Use a SqlBulkCopy object to insert your data (a sketch follows below)

Ref for SqlBulkCopy: http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx
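
A minimal sketch of approach 2 (the destination table "Members", its columns, and the connection string are placeholders for illustration, not from the original post):

using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;

// Illustrative sketch: load parsed CSV rows into a DataTable, then bulk copy.
// "Members" and its column names are hypothetical placeholders.
static void BulkCopyCsv(string csvPath, string connectionString)
{
    var table = new DataTable();
    table.Columns.Add("MemNbr", typeof(long));
    table.Columns.Add("LoanNbr", typeof(int));
    table.Columns.Add("PropVal", typeof(int));

    // Skip the header row, split each line, and add a row per record.
    foreach (var values in File.ReadLines(csvPath).Skip(1).Select(l => l.Split(',')))
        table.Rows.Add(Convert.ToInt64(values[0]),
                       Convert.ToInt32(values[1]),
                       Convert.ToInt32(values[2]));

    using (var cn = new SqlConnection(connectionString))
    using (var bulkCopy = new SqlBulkCopy(cn))
    {
        bulkCopy.DestinationTableName = "Members";
        // Map DataTable columns to the destination table's columns.
        bulkCopy.ColumnMappings.Add("MemNbr", "MemNbr");
        bulkCopy.ColumnMappings.Add("LoanNbr", "LoanNbr");
        bulkCopy.ColumnMappings.Add("PropVal", "PropVal");

        cn.Open();
        bulkCopy.WriteToServer(table);
    }
}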

Upvotes: 0
