Reputation: 3325
I'm trying to import a very large .csv file (~4GB) into MySQL. I was considering using phpMyAdmin, but then you have a max upload size of 2MB. Someone told me that I have to use the command line.
I was going to use these directions to import it: http://dev.mysql.com/doc/refman/5.0/en/mysqlimport.html#c5680
What would be the command to set the first row in the .csv file as the column names in the mysql table? This option is available through phpmyadmin, so there must be a mysql command line version too, right? Please help me. Thank you.
-Raj
Upvotes: 72
Views: 242500
Reputation: 641
Most of the answers above are correct and revolve around uploading the data from the terminal with local_infile, but the problem with this approach is that if you are on shared hosting with a phpMyAdmin instance, you might get stuck with the following, where your shared hosting provider won't let you change the local_infile setting.
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| local_infile | OFF |
+---------------+-------+
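For reference, the output above is what you get from a quick check like this (a minimal sketch; host and credentials are placeholders):
mysql -h host -u db_user -p -e "SHOW VARIABLES LIKE 'local_infile';"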
As a workaround, in a case where I had to insert about 200,000 rows into the database, I wrote the shell script below, which did the job. You can increase or decrease BATCH_SIZE as per your use case.
#!/bin/bash
# MySQL credentials
DB_HOST="host"
DB_USER="db_user"
DB_PASS="db_pass"
DB_NAME="db_name"
TABLE_NAME="table_name"
# Path to the CSV file
CSV_FILE="data.csv"
# Field Separator (comma in this case)
IFS=','
# Batch size
BATCH_SIZE=1000
counter=0
SQL_BATCH="INSERT INTO $TABLE_NAME (sub_category, product_name, product_composition, product_price, product_manufactured, product_desc, product_usp, product_interactions) VALUES "'),"
# Read CSV file line by line
while read -r sub_category product_name product_composition product_price product_manufactured product_desc product_usp product_interactions; do
# Escape single quotes to prevent SQL syntax errors
sub_category=$(echo "$sub_category" | sed "s/'/''/g")
product_name=$(echo "$product_name" | sed "s/'/''/g")
product_composition=$(echo "$product_composition" | sed "s/'/''/g")
product_price=$(echo "$product_price" | sed "s/'/''/g")
product_manufactured=$(echo "$product_manufactured" | sed "s/'/''/g")
product_desc=$(echo "$product_desc" | sed "s/'/''/g")
product_usp=$(echo "$product_usp" | sed "s/'/''/g")
product_interactions=$(echo "$product_interactions" | sed "s/'/''/g")
# Append the current row values to the SQL batch
SQL_BATCH="$SQL_BATCH ('$sub_category', '$product_name', '$product_composition', '$product_price', '$product_manufactured', '$product_desc', '$product_usp', '$product_interactions'),"
# Increment the counter
((counter++))
# If we have reached the batch size, execute the SQL
if [[ $counter -eq $BATCH_SIZE ]]; then
# Remove the last comma and add a semicolon to complete the SQL statement
SQL_BATCH="${SQL_BATCH%,};"
# Execute the batch insert
mysql -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASS" -D "$DB_NAME" -e "$SQL_BATCH"
# Reset the batch and counter
SQL_BATCH="INSERT INTO $TABLE_NAME (sub_category, product_name, product_composition, product_price, product_manufactured, product_desc, product_usp, product_interactions) VALUES "
counter=0
fi
done < "$CSV_FILE"
# Execute the remaining records if there are any
if [[ $counter -gt 0 ]]; then
# Remove the last comma and add a semicolon
SQL_BATCH="${SQL_BATCH%,};"
# Execute the remaining batch
mysql -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASS" -D "$DB_NAME" -e "$SQL_BATCH"
fi
echo "Data import complete."
This workaround might take some time for large datasets, but it does the job.
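To run it, a hypothetical invocation (assuming the script above is saved as import_csv.sh) would be:
chmod +x import_csv.sh
./import_csv.sh
Batching the rows keeps the number of mysql client invocations low compared to inserting one row at a time.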
Upvotes: 0
Reputation: 2790
Most answers miss an important point: if your CSV file was exported from Microsoft Excel on Windows and you import it in a Linux environment, you will get unexpected results.
The correct syntax would be
load data local infile 'file.csv' into table table fields terminated by ',' enclosed by '"' lines terminated by '\r\n'
Here the difference is '\r\n'
as opposed to simply '\n'.
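If you're not sure which line endings your file has, a quick check from the shell (a sketch assuming the standard file and dos2unix utilities are available; the filename is a placeholder):
file file.csv        # reports "with CRLF line terminators" for Windows-style files
dos2unix file.csv    # or convert to plain '\n' endings and keep lines terminated by '\n'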
Upvotes: 0
Reputation: 7487
I know this says command line, but just a tidbit of something quick to try that might work: if you've got MySQL Workbench and the csv isn't too large, you can simply use Workbench's Table Data Import Wizard on the target table and hope for the best.
I say hope for the best because this is MySQL Workbench. You never know when it's going to explode
If you want to do this on a remote server, you would do
mysql -h<server|ip> -u<username> -p --local-infile <database> -e "LOAD DATA LOCAL INFILE '<filename.csv>' INTO TABLE <table> FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'"
Note: I didn't put a password after -p, as putting one on the command line is considered bad practice.
Upvotes: 0
Reputation: 43
Another option is to use the csvsql command from the csvkit library.
Example usage directly on command line:
csvsql --db mysql:///test --tables yourtable --insert yourfile.csv
This can be executed directly on the command line, or built into a python or shell script for automation if you need to do this for a number of files.
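For several files, a shell loop along these lines would work (a sketch reusing the flags above; the connection string and deriving the table name from the file name are assumptions):
for f in *.csv; do
    csvsql --db mysql:///test --tables "${f%.csv}" --insert "$f"
done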
csvsql allows you to create database tables on the fly based on the structure of your csv, so it is a lite-code way of getting the first row of your csv to automagically be cast as the MySQL table header.
Full documentation and further examples here: https://csvkit.readthedocs.io/en/1.0.3/scripts/csvsql.html
Upvotes: 1
Reputation: 1861
You can simply import with
mysqlimport --ignore-lines=1 --lines-terminated-by='\n' --fields-terminated-by=',' --fields-enclosed-by='"' --verbose --local -uroot -proot db_name csv_import.csv
Note: the CSV file name (minus the .csv extension) and the table name should be the same, since mysqlimport derives the target table from the file name.
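If the file name doesn't match the table, one workaround is to symlink (or copy) it under the table's name first (a sketch; db_name and my_table are placeholders):
ln -s csv_import.csv my_table.csv
mysqlimport --ignore-lines=1 --fields-terminated-by=',' --fields-enclosed-by='"' --local -uroot -p db_name my_table.csv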
Upvotes: 10
Reputation: 171
You could do a
mysqlimport --columns="$(head -n 1 $yourfile)" --ignore-lines=1 dbname $yourfile
That is, if your file is comma-separated rather than semicolon-separated. Otherwise you might need to run it through sed first.
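If the file turns out to be semicolon-separated, you can either tell mysqlimport about the delimiter or normalize it first (a rough sketch; it assumes no semicolons appear inside field values):
mysqlimport --fields-terminated-by=';' --ignore-lines=1 dbname $yourfile
# or rewrite the delimiter up front
sed 's/;/,/g' "$yourfile" > comma_separated.csv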
Upvotes: 16
Reputation: 331
Try this:
mysql -uusername -ppassword --local-infile scrapping -e "LOAD DATA LOCAL INFILE 'CSVname.csv' INTO TABLE table_name FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'"
Upvotes: 23
Reputation: 6867
Try this command
load data local infile 'file.csv' into table table
fields terminated by ','
enclosed by '"'
lines terminated by '\n'
(column1, column2, column3,...)
The fields here are the actual table columns that the data needs to sit in. The enclosed by and lines terminated by clauses are optional and can help if you have columns enclosed with double-quotes, such as Excel exports, etc.
For further details check the manual.
If the first row of the .csv contains the column names, skip it from being read with IGNORE 1 LINES and list the columns explicitly in the command, as in the sketch below.
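A minimal sketch putting those clauses together, assuming a hypothetical table mytable whose columns match the CSV header and a client started with local_infile enabled (credentials and names are placeholders):
mysql --local-infile=1 -u username -p db_name -e "
LOAD DATA LOCAL INFILE 'file.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(column1, column2, column3)"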
Upvotes: 157
Reputation: 16314
For importing a csv with a header row using mysqlimport, just add
--ignore-lines=N
(ignores the first N lines of the data file)
This option is described in the page you've linked.
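For example, to skip a single header line (credentials, database, and file name are placeholders):
mysqlimport --local --ignore-lines=1 --fields-terminated-by=',' -u username -p db_name table_name.csv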
Upvotes: 6