SarangArd
SarangArd

Reputation: 1165

Import Multiple CSV Files to SQL Server from a Folder

I have a folder called "Dump." This folder consists of various .CSV Files. The folder Location is 'C:\Dump'

I want to Import the contents of these files into SQL Server. I want the rough code along with proper comments so that I understand it.

I have tried a few codes that I found on the Net. But they haven't quite worked out for me for some strange reason.


The steps I would like to have are

Step 1: Copy all the File Names in the folder to a Table

Step 2: Iterate through the table and copy the data from the files using Bulk Insert.


Someone do please help me out on this one. Thanks a lot in advance :)

Upvotes: 26

Views: 121409

Answers (8)

Andrey K47
Andrey K47

Reputation: 31

You might need to enable the xp_cmdshell first:

sp_configure 'show advanced options', '1'
RECONFIGURE
go
sp_configure 'xp_cmdshell', '1' 
RECONFIGURE
go

And, to enable ad_hoc,

sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;
GO

Upvotes: 2

Andrey K47
Andrey K47

Reputation: 31

This code will create a new table per CSV file that is imported. Best to populate empty database from CSV files.

CREATE TABLE ALLFILENAMES
(
             WHICHPATH VARCHAR(255)
            ,WHICHFILE VARCHAR(255)
)

DECLARE @filename VARCHAR(255),
        @path     VARCHAR(255),
        @sql      VARCHAR(8000),
        @cmd      VARCHAR(1000)

SET @path = 'L:\DATA\SOURCE\CSV\'  --PATH TO YOUR CSV FILES (CHANGE TO YOUR PATH)
SET @cmd = 'dir ' + @path + '*.csv /b'

INSERT INTO ALLFILENAMES(WHICHFILE)
EXEC Master..xp_cmdShell @cmd

UPDATE ALLFILENAMES
  SET WHICHPATH = @path
WHERE WHICHPATH IS NULL

DECLARE c1 CURSOR
FOR SELECT WHICHPATH
          ,WHICHFILE
    FROM ALLFILENAMES
    WHERE WHICHFILE LIKE '%.csv%'

OPEN c1

FETCH NEXT FROM c1 INTO @path,
                        @filename

WHILE @@fetch_status <> -1
BEGIN
    CREATE TABLE #Header
    (
                 HeadString NVARCHAR(MAX)
    )
    DECLARE @Columns NVARCHAR(MAX) = ''
    DECLARE @Query NVARCHAR(MAX) = ''
    DECLARE @QUERY2 NVARCHAR(MAX) = ''
    DECLARE @HeaderQuery NVARCHAR(MAX) = ''

    SELECT @HeaderQuery = @HeaderQuery + 'bulk insert #Header from ''' + @path + @filename + '''  
        with(firstrow=1,lastrow=1)'
    EXEC (@HeaderQuery)

    SELECT @Columns = (SELECT QUOTENAME(value) + ' nvarchar(max)' + ','
                       FROM #Header
                            CROSS APPLY STRING_SPLIT(HeadString,',') FOR xml PATH(''))
    IF ISNULL(@Columns,'') <> ''
    BEGIN
        SET @Columns = LEFT(@Columns,LEN(@Columns) - 1)

        SELECT @Query = @Query + 'CREATE TABLE ' + Replace(@filename,'.csv','') + ' (' + replace(@Columns,'"','') + ')'
        PRINT @Query
        EXEC (@QUERY)
    END

    SELECT @QUERY2 = @QUERY2 + 'bulk insert ' + replace(Replace(@filename,'.csv',''),'.TPS','') + ' from ''' + @path + @filename + '''  
        with(firstrow=2,FORMAT=''csv'',FIELDTERMINATOR='','',ROWTERMINATOR=''\n'')'
    EXEC (@QUERY2)

    DROP TABLE #Header
    FETCH NEXT FROM c1 INTO @path,
                            @filename
END

CLOSE c1

DEALLOCATE c1

Upvotes: 1

Joshua Sayers
Joshua Sayers

Reputation: 51

This will give you separate tables for each file.

--BULK INSERT MULTIPLE FILES From a Folder 
drop table allfilenames
--a table to loop thru filenames drop table ALLFILENAMES
CREATE TABLE ALLFILENAMES(WHICHPATH VARCHAR(255),WHICHFILE varchar(255))

--some variables
declare @filename varchar(255),
        @path     varchar(255),
        @sql      varchar(8000),
        @cmd      varchar(1000)


--get the list of files to process:
SET @path = 'D:\Benihana\backup_csv_benihana_20191128032207_part_1\'
SET @cmd = 'dir ' + @path + '*.csv /b'
INSERT INTO  ALLFILENAMES(WHICHFILE)
EXEC Master..xp_cmdShell @cmd
UPDATE ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null

delete from ALLFILENAMES where  WHICHFILE is null
--SELECT replace(whichfile,'.csv',''),* FROM dbo.ALLFILENAMES


--cursor loop
declare c1 cursor for SELECT WHICHPATH,WHICHFILE FROM ALLFILENAMES where WHICHFILE like '%.csv%' order by WHICHFILE desc
open c1
fetch next from c1 into @path,@filename
While @@fetch_status <> -1
  begin
  --bulk insert won't take a variable name, so make a sql and execute it instead:
   set @sql = 

   'select * into '+ Replace(@filename, '.csv','')+'
    from openrowset(''MSDASQL''
    ,''Driver={Microsoft Access Text Driver (*.txt, *.csv)}''
    ,''select * from '+@Path+@filename+''')' 


print @sql
exec (@sql)

  fetch next from c1 into @path,@filename
  end
close c1
deallocate c1

Upvotes: 5

TheOriginalCole
TheOriginalCole

Reputation: 97

to expand upon the answer by SarangArd you can replace temp with the following if your file name matches your table name.

' + Left(@filename, Len(@filename)-4) + '

Upvotes: 0

citynorman
citynorman

Reputation: 5294

In python you can use d6tstack which makes this simple

import d6tstack
import glob

c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'))
c.to_mssql_combine('mssql+pymssql://usr:pwd@localhost/db', 'tablename')

See SQL examples. It also deals with data schema changes, creates table and allows you to preprocess data. It leverages BULK INSERT so should be just as fast.

Upvotes: 0

user10191093
user10191093

Reputation: 91

To solve step 1, xp_dirtree can also be used to list all files and folders.

Keep in mind that it is an undocumented function. Security precautions must be considered. Intentionally crafted filenames could be an intrusion vector.

Upvotes: 0

SarangArd
SarangArd

Reputation: 1165

    --BULK INSERT MULTIPLE FILES From a Folder 

    --a table to loop thru filenames drop table ALLFILENAMES
    CREATE TABLE ALLFILENAMES(WHICHPATH VARCHAR(255),WHICHFILE varchar(255))

    --some variables
    declare @filename varchar(255),
            @path     varchar(255),
            @sql      varchar(8000),
            @cmd      varchar(1000)


    --get the list of files to process:
    SET @path = 'C:\Dump\'
    SET @cmd = 'dir ' + @path + '*.csv /b'
    INSERT INTO  ALLFILENAMES(WHICHFILE)
    EXEC Master..xp_cmdShell @cmd
    UPDATE ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null


    --cursor loop
    declare c1 cursor for SELECT WHICHPATH,WHICHFILE FROM ALLFILENAMES where WHICHFILE like '%.csv%'
    open c1
    fetch next from c1 into @path,@filename
    While @@fetch_status <> -1
      begin
      --bulk insert won't take a variable name, so make a sql and execute it instead:
       set @sql = 'BULK INSERT Temp FROM ''' + @path + @filename + ''' '
           + '     WITH ( 
                   FIELDTERMINATOR = '','', 
                   ROWTERMINATOR = ''\n'', 
                   FIRSTROW = 2 
                ) '
    print @sql
    exec (@sql)

      fetch next from c1 into @path,@filename
      end
    close c1
    deallocate c1


    --Extras

    --delete from ALLFILENAMES where WHICHFILE is NULL
    --select * from ALLFILENAMES
    --drop table ALLFILENAMES

Upvotes: 57

Related Questions