Lynn
Lynn

Reputation: 4408

Combine csv files within SQL and then update to existing table

I have 2 csv files, which I will call table1 and table2

table1

Name   Location
Jim    NY

table2

Name  Location
Rick  CA

I would like to combine these vertically:

Desired output:

table3

Name   Location
Jim    NY
Rick   CA

Once combined vertically as table 3, I would like to now combine table3 to my existing table within SQL server (table4)

table4 (already existing within SQL Server)

Name    Location
Lisa    TX

Final output

Name    Location
Lisa    TX
Jim     NY
Rick    CA

This is what I have done:

I realize that I can add the values into existing tables (from a csv file) using the BULK INSERT statement:

 BULK INSERT test
 FROM 'C:\Test\test1.csv'
 WITH
   (
     FIRSTROW = 2,
     rowterminator='\n',
     fieldterminator=','
   )

I am not sure how to initially combine table1 and table2 within SQL, and then add this data to table4

Any insight is appreciated

Upvotes: 0

Views: 4445

Answers (2)

Parfait
Parfait

Reputation: 107737

Consider also OPENROWSET or OPENDATASOURCE (assuming user has permission) to query CSVs like a table and integrate into an insert-select query. This would only work for SQL Server on Windows machines.

INSERT INTO table4 ([Name], [Location], ...)
SELECT t.[Name], t.[Location], ...
FROM OPENROWSET
   (  'MSDASQL'
     ,'Driver={Microsoft Access Text Driver (*.txt, *.csv)}'
     ,'select * from C:\Path\To\table1.csv'
   ) AS t
WHERE NOT EXISTS
  (SELECT 1 FROM table4
   WHERE table4.[Name] = t.[Name]
     AND table4.[Location] = t.[Location])

You can even use the MS Access engine OLEDB provider which can query Excel workbooks, Access database tables, and text files. Do note below points database to folder of CSV and treats the CSV files as table. Also, provider can vary: 12.0, 15.0, even 16.0.

INSERT INTO table4 ([Name], [Location], ...)
SELECT t.[Name], t.[Location], ...
FROM OPENDATASOURCE('Microsoft.ACE.OLEDB.12.0',
                    'Text;Database=C:\Path\To\CSV;HDR=YES')..[table2#csv] AS t
WHERE NOT EXISTS
  (SELECT 1 FROM table4
   WHERE table4.[Name] = t.[Name]
     AND table4.[Location] = t.[Location])

You can check available OLEDB drivers with Powershell including MSDASQL and Microsoft.ACE.OLEDB.##.0. (remove .IndexOf('SQL') on last line for all listings).

Upvotes: 2

SteveC
SteveC

Reputation: 6015

You could do something like this. If the files are very large then its logging could be very large as well.

set xact_abort on;

begin transaction
begin try

    bulk insert table4
    from 'C:\Test\test1.csv'
     WITH  (
         FIRSTROW = 2,
         rowterminator='\n',
         fieldterminator=',',
         tablock);

    bulk insert table4
    from 'C:\Test\test2.csv'
     WITH  (
         FIRSTROW = 2,
         rowterminator='\n',
         fieldterminator=',',
         tablock)

    commit transaction
end try
begin catch
    rollback transaction
end catch

Upvotes: 1

Related Questions