Reputation: 99
df <- data.frame(category = c("A","B","A","D","E"),
date = c("5/10/2005","6/10/2005","7/10/2005","8/10/2005","9/10/2005"),
col1 = c(1,NA,2,NA,3),
col2 = c(1,2,NA,4,5),
col3 = c(2,3,NA,NA,4))
I have to insert a data frame that is created in R to mysql server.
I have tried these methods(Efficient way to insert data frame from R to SQL). However, my data also has NA which are fails the whole process of exporting.
Is there a way around to faster upload to data.
dbWriteTable(cn,name ="table_name",value = df,overwrite=TRUE, row.names = FALSE)
The above works but is very slow to upload
The method that I have to use is this :
before = Sys.time()
chunksize = 1000000 # arbitrary chunk size
for (i in 1:ceiling(nrow(df)/chunksize)) {
query = paste0('INSERT INTO dashboard_file_new_rohan_testing (',paste0(colnames(df),collapse = ','),') VALUES ')
vals = NULL
for (j in 1:chunksize) {
k = (i-1)*chunksize+j
if (k <= nrow(df)) {
vals[j] = paste0('(', paste0(df[k,],collapse = ','), ')')
}
}
query = paste0(query, paste0(vals,collapse=','))
dbExecute(cn, query)
}
time_chunked = Sys.time() - before
Error Encountered:
Error in .local(conn, statement, ...) :
could not run statement: Unknown column 'NA' in 'field list'
Upvotes: 1
Views: 273
Reputation: 521239
One of the fastest ways to load data into MySQL is to use its LOAD DATA
command line tool. You may try first writing your R data frame to a CSV file, then using MySQL's LOAD DATA
to load it:
write.csv(df, "output.csv", row.names=FALSE)
Then from your command line, use:
LOAD DATA INFILE 'output.csv' INTO TABLE table_name
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
Note that this assumes the CSV file is already on the same machine as MySQL. If not, and you have it still locally, then use LOAD DATA LOCAL INFILE
instead.
You may read MYSQL import data from csv using LOAD DATA INFILE for more help using LOAD DATA
.
Edit:
To deal with the issue of NA
values, which should represent NULL
in MySQL, you may take the approach of first casting the entire data frame to text, and then replacing the NA
values with empty string. LOAD DATA
will interpret a missing value in a CSV column as being NULL
. Consider this:
df <- data.frame(lapply(df, as.character), stringsAsFactors=FALSE)
df[is.na(df)] <- ""
Then, use write.csv
along with LOAD DATA
as described above.
Upvotes: 1