Reputation: 268
I have a CSV file that follows this pattern:
cust_id,cust_name,cust_add_1,cust_add_2,cust_city,cust_state,cust_zip,cust_email
100024,BALE #DIANA & ROY,2944 SOME RD, ,AKRON,OH,44556,[email protected]
100139,SMITH #JOHN & LINDA,1569 ANOTHER WAY, ,SARASOTA,FL,65478,
100263,DOLE #BOB,5947 LONG RD, ,GRANITE FALLS,NC,12345,
.
.
. continued
This file represents thousands and thousands of rows of data, and I am trying to insert all of it into a MySQL table. I created the corresponding table with the following statement:
CREATE TABLE customer (
  cust_id INT(7) NOT NULL,
  cust_name VARCHAR(40) NOT NULL,
  cust_add_1 VARCHAR(50) NOT NULL,
  cust_add_2 VARCHAR(50),
  cust_city VARCHAR(20) NOT NULL,
  cust_state CHAR(2) NOT NULL,
  cust_zip INT(5) NOT NULL,
  cust_email VARCHAR(60),
  PRIMARY KEY (cust_id)
);
Then I tried the following on the command line:
mysqlimport -u root --local csv_import ~/path/to/customer.csv
When I ran SELECT * FROM customer after mysqlimport, I got the following result set:
+---------+-----------+------------+------------+-----------+------------+----------+------------+
| cust_id | cust_name | cust_add_1 | cust_add_2 | cust_city | cust_state | cust_zip | cust_email |
+---------+-----------+------------+------------+-----------+------------+----------+------------+
|       0 |           |            | NULL       |           |            |          | NULL       |
+---------+-----------+------------+------------+-----------+------------+----------+------------+
1 row in set (0.00 sec)
I'm not very experienced in the area of importing CSV data into MySQL, and any help would be greatly appreciated.
UPDATE
As per ThisSuitIsBlackNot's answer, I tried the following command (BTW, the --local option is needed):
mysqlimport -u root --local --fields-terminated-by=',' --fields-optionally-enclosed-by='"' --lines-terminated-by='\n' csv_import ~/path/to/customer.csv
This produced slightly better results but still wasn't what I needed, so I tried this instead:
mysql> LOAD DATA LOCAL INFILE '~/path/to/customer.csv'
-> INTO TABLE customer
-> FIELDS TERMINATED BY ','
-> OPTIONALLY ENCLOSED BY '"'
-> LINES TERMINATED BY '\n';
After both of these, SELECT * FROM customer produced this result set:
+---------+-----------+------------+------------+-----------+------------+----------+-------------------+
| cust_id | cust_name | cust_add_1 | cust_add_2 | cust_city | cust_state | cust_zip | cust_email |
+---------+-----------+------------+------------+-----------+------------+----------+-------------------+
|       0 | cust_name | cust_add_1 | cust_add_2 | cust_city | cu         | cust_    | cust_email
100024 |
+---------+-----------+------------+------------+-----------+------------+----------+-------------------+
1 row in set (0.00 sec)
This is obviously not what I need, but it's getting closer.
Upvotes: 1
Views: 2752
Reputation: 24063
The default field separator for mysqlimport is a tab character. You need to tell mysqlimport to split fields on commas instead:
mysqlimport --fields-terminated-by=',' ...
This isn't quite enough, though, because if your CSV follows the spec, fields containing commas will be surrounded by double quotes. You need to tell mysqlimport that it shouldn't treat commas inside quotes as field separators:
mysqlimport --fields-terminated-by=',' --fields-optionally-enclosed-by='"' ...
In some CSVs, all fields are double quoted, whether they contain commas or not. This is not the case with your data, but for future reference, mysqlimport also provides the option
--fields-enclosed-by=string
I'm not sure, but I assume performance would be better on a CSV where all fields are quoted if you use --fields-enclosed-by='"' instead of --fields-optionally-enclosed-by='"'.
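The difference quoting makes is easy to see with a quick sketch. Here Python's csv module stands in for mysqlimport's field splitting; the sample row is made up for illustration:

```python
import csv
from io import StringIO

# A hypothetical row where the address field contains a comma.
row = '100500,"DOE #JANE","123 MAIN ST, APT 4", ,AKRON,OH,44556,jane@example.com\n'

# A naive split on commas breaks the quoted address field apart...
naive = row.rstrip('\n').split(',')
print(len(naive))   # 9 fields -- one too many

# ...while a quote-aware parser keeps "123 MAIN ST, APT 4" together,
# which is what --fields-optionally-enclosed-by='"' tells mysqlimport to do.
fields = next(csv.reader(StringIO(row)))
print(len(fields))  # 8 fields
```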
The default line separator for mysqlimport is a newline (\n). This should work for files generated on UNIX/Linux/Mac OS X, but if the EOL sequence in your file is something different (for example, \r\n in most files created on Windows), you need to specify that as well:
mysqlimport --lines-terminated-by='\r\n'
Apparently, Wordpad uses '\r' as a line ending, as did Macs prior to Mac OS X.
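If you're not sure which EOL sequence a file uses, a small check like this can tell you (a sketch; pass it the path of whatever file you're about to import):

```python
def detect_eol(path):
    """Report the EOL sequence found in the first chunk of a file."""
    with open(path, 'rb') as f:
        chunk = f.read(65536)
    if b'\r\n' in chunk:
        return r'\r\n'   # Windows-style
    if b'\r' in chunk:
        return r'\r'     # bare carriage returns (pre-OS X Macs)
    return r'\n'         # UNIX/Linux/Mac OS X
```

The return value is exactly the string you'd pass to --lines-terminated-by.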
Generally you don't want to import the names of your columns into your table, since the table already has its own column names. You can skip the first X lines of your file like this:
mysqlimport --ignore-lines=X
To import a CSV file with comma-separated fields optionally enclosed in double quotes, \r\n line endings, and a header line, you would run:
mysqlimport --fields-terminated-by=',' \
--fields-optionally-enclosed-by='"' \
--lines-terminated-by='\r\n' \
--ignore-lines=1 \
db_name table_name.csv
Note that mysqlimport determines the table to import into by simply stripping off the file extension, so the above example will try to import the data from table_name.csv into the table_name table. Also, if you're running this on the server, you shouldn't need the --local option, although the wording in the documentation is a little obtuse.
You can also use the equivalent SQL statement
LOAD DATA INFILE 'table_name.csv' INTO TABLE table_name
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
(mysqlimport is just a command-line interface to LOAD DATA INFILE.)
See the docs for LOAD DATA INFILE for more details.
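With thousands of rows, it can also be worth a quick pre-flight pass to catch rows with the wrong number of fields before they silently become bad data. A sketch (the expected count of 8 matches the customer table above; the function name is mine):

```python
import csv

def check_csv(path, expected_cols=8):
    """Return (line_number, field_count) for every row with the wrong width."""
    bad = []
    with open(path, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header line, like IGNORE 1 LINES
        for lineno, row in enumerate(reader, start=2):
            if len(row) != expected_cols:
                bad.append((lineno, len(row)))
    return bad  # an empty list means every row has the right field count
```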
Upvotes: 2