Reputation: 115
I have searched around but have not found anything that works for bulk inserting these records into a SQL table. I have tried different variations of the terminators using characters, ASCII and hex values, without success; each attempt generates an error. I usually perform alterations like this in Excel, but this file has over 5M records. This has to be possible. Does anyone have a working solution, or can anyone provide additional guidance? Thank you in advance.
ERROR:
Msg 4866, Level 16, State 1, Line 110 The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 110 The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 110 Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Sample File
SQL Command
BULK INSERT [dbo].[AllTags]
FROM 'C:\Data\Swap Drive\File to import\01. Document Export\REL000001-REL296747\SAMPLE.DAT'
WITH (ROWTERMINATOR='\n',
MAXERRORS=0 ,
FIELDTERMINATOR='þ' ,
TABLOCK ,
CodePage='RAW'
)
Upvotes: 0
Views: 1597
Reputation: 89051
It would have been helpful to post the correct table DDL, but here's a start:
drop table if exists #t
go
CREATE TABLE #t (
REFERENCEID varchar (100) ,
[BEGBATES] varchar (100) ,
[ENDBATES] varchar (100) ,
[BEGATTACH] varchar (100) ,
[ENDATTACH] varchar (100) ,
[PARENTBATES] varchar (100) ,
[ATTACHMENT] varchar (1000),
CUSTODIAN varchar(100),
DUPCUSTODIAN varchar(100),
[FROM] varchar(100),
[TO] varchar(100),
CC varchar(100),
BCC varchar(100),
SUBJECT varchar(100),
DATESENT varchar(100),
TIMESENT varchar(100),
DATERCVD varchar(100),
TIMERCVD varchar(100),
FILEEXT varchar(100),
AUTHOR varchar(100),
CREATEDATE varchar(100),
CREATETIME varchar(100),
DATELASTMOD varchar(100),
TIMELASTMOD varchar(100),
FILENAME varchar(100),
DUPFILENAME varchar(100),
FILELENGTH varchar(100),
PGCOUNT varchar(100),
DOCTYPE varchar(100),
FAMDATE varchar(100),
FAMTIME varchar(100),
TIMEZONE varchar(100),
PATH varchar(max),
DUPPATH varchar(max),
DEDUPHASH varchar(100),
NATIVEPATH varchar(100),
OCRPATH varchar(100),
TITLE varchar(100),
COMPANY varchar(100),
DATEACCESSED varchar(100),
TIMEACCESSED varchar(100),
DATEPRINTED varchar(100),
TIMEPRINTED varchar(100),
CONVDATE varchar(100),
CONVTIME varchar(100),
ATTACHLIST varchar(100),
FAMILYRANGE varchar(100),
ALLCUSTODIANS varchar(100),
ALLFILENAMES varchar(100),
ALLFILEPATHS varchar(max),
HASHMD5 varchar(100),
HASHSHA varchar(100),
TAGS varchar(100),
DOCNOTE varchar(100),
PRIVNOTE varchar(100),
REDACTRSNS varchar(100),
DISCOID varchar(100),
MESSAGEID varchar(100),
THREADID varchar(100),
ATTACHCOUNT varchar(100),
HIDDENTYPE varchar(100),
METAREDACTED varchar(100),
INREPLYTOID varchar(100),
OBJECTHASH varchar(100),
REVISION varchar(100),
HEADER varchar(100),
IMPORTANCE varchar(100),
DELIVERYRECEIPT varchar(100),
READRECEIPT varchar(100),
SENSITIVITY varchar(100),
LASTAUTHOR varchar(100),
ESUBJECT varchar(100),
DATEAPPTSTART varchar(100),
DATEAPPTEND varchar(100),
CALBEGDATE varchar(100),
CALENDDATE varchar(100),
CALBEGTIME varchar(100),
CALENDTIME varchar(100),
CALENDUR varchar(100),
RECORDTYPE varchar(100),
REVISIONNUMBER varchar(100),
Exception varchar(100),
ExceptionDetails varchar(100),
TextLimitExceeded varchar(100)
)
go
BULK INSERT #t
FROM 'C:\temp\test.DAT'
WITH (
FIRSTROW=2,
ROWTERMINATOR='0x0a',
MAXERRORS=0 ,
FIELDQUOTE =N'þ' ,
FIELDTERMINATOR = '0x14',
TABLOCK ,
CODEPAGE = '65001',
DATAFILETYPE = 'Char'
)
select * from #t
outputs
The UTF-8 þ (thorn), bytes 0xC3 0xBE, used as the quote mark is not being stripped, and BULK INSERT doesn't support a multi-byte FIELDQUOTE specified in binary.
So you'll have to strip that out, but it's a start.
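One possible way to do that stripping after the load is sketched below. It is only a sketch under several assumptions: SQL Server 2017 or later (for STRING_AGG), the #t table from above still existing in the current session, and the þ having arrived in the columns as the single character U+00FE. The binary collation is an extra precaution, since some collations may treat þ as equivalent to other characters (for example 'th'), which would make a plain REPLACE remove more than the quote marks.

```sql
-- Sketch: remove the þ quote character from every column of #t.
-- Assumes SQL Server 2017+ and that #t was loaded in this session.
DECLARE @sql nvarchar(max);

-- Build one UPDATE that applies REPLACE to each column.
-- NCHAR(254) is þ (U+00FE); using it avoids script-encoding problems.
-- COLLATE ..._BIN2 forces an exact, code-point comparison.
SELECT @sql = N'UPDATE #t SET ' +
    STRING_AGG(
        CAST(QUOTENAME(c.name)
             + N' = REPLACE(' + QUOTENAME(c.name)
             + N' COLLATE Latin1_General_100_BIN2, NCHAR(254), N'''')'
             AS nvarchar(max)),
        N', ')
FROM tempdb.sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'tempdb..#t');

EXEC sys.sp_executesql @sql;

-- Spot-check the result.
SELECT TOP (10) * FROM #t;
```

Generating the statement from tempdb.sys.columns avoids typing all the column names by hand. On a 5M-row load it may be worth running this on a small sample first.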
BTW a good hex editor, like the one now available in VS Code, is invaluable in cases like this.
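If a hex editor is not at hand, the first bytes of the file can also be inspected from T-SQL. The following is a minimal sketch, assuming the same C:\temp\test.DAT path used above and an arbitrary 200-byte window. Note that SINGLE_BLOB reads the whole file into one value, so point it at a small sample rather than the full export.

```sql
-- Sketch: show the first 200 bytes of the data file as a hex string.
-- Style 1 in CONVERT renders varbinary as '0x...' text.
SELECT CONVERT(varchar(max), SUBSTRING(f.BulkColumn, 1, 200), 1) AS first_bytes_hex
FROM OPENROWSET(BULK 'C:\temp\test.DAT', SINGLE_BLOB) AS f;
```

In the output, look for 14 between fields, C3BE around each value, and 0A (or 0D0A) at the end of each row to confirm which terminators the file really uses.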
Upvotes: 2