Mohammad Hussain
Mohammad Hussain

Reputation: 77

Bulk insert .csv without the last row

I have a .csv file with data with a number of rows (please see image)

data in .csv file

This data is provided by a third party and I cannot change the format.

I am BULK inserting this data into an SQL server table using ms SQL server management.

when i BULK insert using the following:

    BULK INSERT #tempDERIVEDDATA
    FROM 'C:\MyDownloads\Data.csv'
    WITH
    (FIRSTROW = 1,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    ERRORFILE = 'C:\MyDownloads\logfile.log')
    GO

I seem to get the following error:

Msg 4832, Level 16, State 1, Line 236 Bulk load: An unexpected end of file was encountered in the data file. Msg 7399, Level 16, State 1, Line 236 The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error. Msg 7330, Level 16, State 2, Line 236 Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

When the last row which is the footer row and states the number of rows in the file is deleted the bulk insert works. So i cannot even delete that row once putting into a table as struggling to do such.

So then I thought what if use an OPENROWSET function and count the number of rows and insert (#rows - 1) into the table?

So i tried the below to calculated the number of rows:

   DECLARE @lastrow INT
   SET @lastrow = (SELECT COUNT(*) FROM OPENROWSET(BULK 'C:\MyDownloads\DATA1.csv', 
   FORMATFILE = 'C:\MyDownloads\format.fmt',
   MAXERRORS=10) AS  ) - 1
   SELECT @lastrow

The select last row should retrieve (#rows - 1) should retrieve number of rows but i get an error.

Msg 4832, Level 16, State 1, Line 230 Bulk load: An unexpected end of file was encountered in the data file. Msg 7399, Level 16, State 1, Line 230 The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error. Msg 7330, Level 16, State 2, Line 230 Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

Format file is per below: FORMAT.FMT

After which I decided to do above with a format file, instead SINGLE_CLOB. This time there is no errors but the file retrieves '0' lines when it should retrieve '341'.

I would appreciate if someone could assist.

Upvotes: 2

Views: 5136

Answers (3)

youcantryreachingme
youcantryreachingme

Reputation: 1155

Create a temporary table with a single wide column and insert the whole row into that column for the whole file, then use select+insert to select from your temporary table, separating fields via the separator character as you select, and exclude the last row. Finally, drop the temporary table in prep for the next import.

Upvotes: 1

ASH
ASH

Reputation: 20342

Hummm, I had the exact same issue back in 2015. Do you work at CapIQ, by chance? Anyway, I ended up creating a C# executable and running that to clean up the file before running the Bulk Insert script to load the data. I actually saved that solution because I thought I would need it again someday. Here is the solution.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Diagnostics;


namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {

            string sourceDirectory = @"C:\ManagedCode\Downloads\";
            try
            {
                var txtFiles = Directory.EnumerateFiles(sourceDirectory);

                foreach (string currentFile in txtFiles)
                {
                    //foreach (var currentFile in files)
                    {
                        var lines = File.ReadAllLines(currentFile);

                        //Just an example of changing the filename based upon the current name
                        var targetFile = Path.ChangeExtension(currentFile, "prod.txt");
                        File.WriteAllLines(targetFile, lines.Skip(1).Take(lines.Count() - 2));
                        //};

                    }

                }
            }

            catch (Exception ex)
            {

            }

        }
    }
}

You can run this executable (and any executable) on any schedule of your choosing using Windows Task Manager.

Upvotes: 0

vicosanz
vicosanz

Reputation: 57

Consider use openrowset.

    create procedure [dbo].LoadData
    (
            @file nvarchar(200)=null
    )
    as
    begin
            declare @sql varchar(max)
            set @sql='INSERT INTO DBO.Table
            (
                    column1, column2
            )
            select column1
                                    ,column2
            from openrowset(bulk N''' + @file +''',
             formatfile = N''d:\xml\Table.xml'') as t1
             where not column1 like (''Total'')'

             print @sql
             exec (@sql)
    end


    --- Table.xml

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <RECORD>
      <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
      <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
     </RECORD>
     <ROW>
      <COLUMN SOURCE="1" NAME="column1" xsi:type="SQLNVARCHAR"/>
      <COLUMN SOURCE="2" NAME="column2" xsi:type="SQLNVARCHAR"/>
     </ROW>
    </BCPFORMAT>

Upvotes: 0

Related Questions