I am importing several large CSV files. In the process, I discovered two important pieces of information:
Since this question has come up many times with several different answers, and few of those answers stick to a pure SQL solution, I am writing the following:
CREATE TABLE table1
(
    data varchar(max)
)
GO
-- Load each raw line into the single [data] column: the field terminator
-- is a string that never occurs in the file, so no field splitting happens.
BULK INSERT table1
FROM 'C:\Import\file.csv'
WITH
(
    FIRSTROW = 1,
    FIELDTERMINATOR = 'somethingthatdoesntexist',
    ROWTERMINATOR = '0x0A'
)
GO
ALTER TABLE table1 ADD id int NOT NULL IDENTITY(1,1)
GO
ALTER TABLE table1 ADD CONSTRAINT
pk_id PRIMARY KEY CLUSTERED
(
    id
) WITH (
    STATISTICS_NORECOMPUTE = OFF,
    IGNORE_DUP_KEY = OFF,
    ALLOW_ROW_LOCKS = ON,
    ALLOW_PAGE_LOCKS = ON
)
ON [PRIMARY]
GO
From there, you can use an appropriate strategy to work out where your headers are: the first row (or first X rows) is a header, as is the first row (or rows) after each empty row. Then check for an existing table that matches, or create a new one, and import that section's data until the next empty row. Lather, rinse, repeat.
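As a starting point for that strategy, here is a minimal sketch (my own illustration, assuming the table1(id, data) staging table built above) that finds the blank rows delimiting each section:

```sql
-- Blank rows mark the boundary between sections of the file.
SELECT id
FROM table1
WHERE data IS NULL OR LTRIM(RTRIM(data)) = ''
ORDER BY id;

-- The first non-blank row after each boundary is a header row;
-- everything up to the next boundary is that section's data.
```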
-- Add processed (default 0). Set to 1 when a row is properly imported.
ALTER TABLE table1 ADD processed bit NOT NULL
CONSTRAINT c_is_processed_default_no DEFAULT 0
GO
-- Add flag (default 0). Set to 1 when a row requires manual review.
-- Final step: UPDATE table1 SET flag = 1 WHERE processed = 0;
-- then SELECT * FROM table1 WHERE flag = 1
ALTER TABLE table1 ADD flag bit NOT NULL
CONSTRAINT c_manual_review_default_no DEFAULT 0
GO
Fair warning: the CSV I am importing is quite large, so I am having to work through memory errors. Currently, I am working around this by executing each step in a separate batch.
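If separate batches alone are not enough, a common further workaround (my addition, not part of the original answer) is to walk the staging table in fixed-size chunks so no single statement touches every row at once:

```sql
-- Hypothetical sketch: process rows 50,000 at a time (an arbitrary chunk
-- size) to keep memory and transaction log usage bounded.
WHILE EXISTS (SELECT 1 FROM table1 WHERE processed = 0)
BEGIN
    -- per-chunk import logic would go here, before marking the chunk done
    UPDATE TOP (50000) table1
    SET processed = 1
    WHERE processed = 0;
END
```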