nbayly
nbayly

Reputation: 2167

CSV imports with no errors but missing rows

I am loading a csv file into an empty table with success and no errors. When comparing the number of original rows (from viewing .csv in external application and from the Output Response) with the number of rows in my resulting table (from Table Inspector) seems like not all rows are importing. Table Inspector reports that the table has 1,416,824 rows while the original csv has 1,419,910 rows. There should be no replicated primary keys in the data set though it should have error'd out on those lines in my mind if that were the case.

Table structure:

CREATE TABLE `table1` (
  `pkfield` varchar(10) NOT NULL,
  `field1` varchar(3) DEFAULT NULL,
  `field2` varchar(1) DEFAULT NULL,
  `field3` varchar(1) DEFAULT NULL,
  PRIMARY KEY (`pkfield`),
  UNIQUE KEY `pkfield_UNIQUE` (`pkfield`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Import command:

LOAD DATA INFILE 'c:/table1.csv'
INTO TABLE myschema.table1
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';

MySQL Workbench Response:

1419910 row(s) affected Records: 1419910  Deleted: 0  Skipped: 0  Warnings: 0

Sample from csv file (data mocked up):

6623950258,XYZ,A,B
5377103432,XYZ,A,B
9131144416,XYZ,A,A
1326703267,XYZ,A,B
7847786312,XYZ,A,B
1119927042,XYZ,B,B
4144055385,CDE,A,B
4563489252,XYZ,A,B
5733611912,XYZ,A,B
3309418377,XYZ,A,B
6928148128,XYZ,A,B
1152657670,XYZ,A,B
8143082292,CDE,A,B
9373340750,XYZ,A,A
3318949288,XYZ,A,B
1166427684,XYZ,A,B
5062296807,XYZ,B,A
4624323293,XYZ,A,B
3088992643,CDE,A,B
6477504847,XYZ,A,B

Any suggestions or explanations would be greatly appreciated. Kind regards.

Upvotes: 2

Views: 7270

Answers (2)

Adeel Raza Azeemi
Adeel Raza Azeemi

Reputation: 793

Honestly, for now, I didn't know why this happened. But I knew a solution to this. At first, I thought It might be because of special characters present in the csv records. but even after removing those still, I was receiving not all records. I also noticed that records don't need to be in thousand to be missed by load data. It even happens in a few hundred records.

So for now the only reliable solution is to import using phpMyAdmin. Please remove special characters etc before importing, and also remove new lines from fields header etc.

phpMyAdmin seems to have some sort of parser that tokenized the csv file and then create SQL from those tokens. So it does not use at all load data command and hence it did import corrects.

The downside is it is through GUI and all those slowness that comes with point and click etc.

Upvotes: 0

PerlDuck
PerlDuck

Reputation: 5720

Honestly, I'm myself not sure why the number of rows isn't accurate after a fresh import of a table. I think the Table Inspector fetches the data from some statistics table and to my understanding that gets updated only when the table changes by more than 10 %. Perhaps this is the reason. However, the accurate amount of rows can always be fetched with traditional

select count(*) from myschema.table1;

As @nbayly told, this gives the expected result of 1419910 rows which matches the number LOAD DATA had reported.

Upvotes: 3

Related Questions