Reputation: 607
While using RMySQL::dbWriteTable function in R to write a table to MySQL on Windows I get an error message concerning the character [ñ].
The simplified example is:
table <- data.frame(a=seq(1:3), b=c("És", "España", "Compañía"))
table
a b
1 1 És
2 2 España
3 3 Compañía
db <- dbConnect(MySQL(), user = "####", password = "####", dbname ="test", host= "localhost")
RMySQL::dbWriteTable(db, name="test1", table, overwrite=T, append=F )
Error in .local(conn, statement, ...) :
could not run statement: Invalid utf8 character string: 'Espa'
As you can see, there is no problem with the accents ("És") but there is with the ñ character ("España").
On the other hand, there is no problem with MySQL since this query works fine:
INSERT INTO test.test1 (a,b)
values (1, "España");
Things I have already tried previous to write the table:
Encoding(x) <- "UTF-8" for all table.
iconv(x, "UTF-8", "UTF-8") for all table.
Sent pre-query: dbSendQuery(db, "SET NAMES UTF8;")
Change MySQL table Collation to: "utf-8-general, latin-1, latin-1-spanish...)
*Tried "Latin-1" encoding and didn't work either.
I have been looking for an answer to this question for a while with no luck.
Please help!
Versions:
MySQL 5.7.17
R version 3.3.0
Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=C"
PS: Works fine in Linux environment but I am stuck with Windows in my current project :(
Upvotes: 4
Views: 1265
Reputation: 81
The function dbConnect() has a parameter called encoding that can help you easily setup the connection encoding method.
dbConnect(MySQL(), user=user, password=password,dbname=dbname, host=host, port=port, encoding="latin1")
This has allowed me to insert "ñ" characters into my tables and also inserting data into columns that have "ñ" in their name. For example, I can insert data into a column named "año".
Upvotes: 0
Reputation: 1
I ran into this problem with a data table of about 60 columns and 1.5 million rows; there were many computed values and reconciled and corrected dates and times so I didn't want to reformat anything I didn't have to reformat. Since the utf-8 issue was only coming up in character fields, I used a kludgy-but-quick approach:
1) copy the field list from the dbWriteTable
statement into a word processor or text editor
2) on your copy, keep only the fields that have descriptions as VARCHAR and TEXT
3) strip those fields down to just field names
4) use paste0 to write a character vector of statements that will ensure all the fields are character fields:
dt$x <- as.character(dt$x)
5) then use paste0 again to write a character vector of statements that set the encoding to UTF-8
Encoding(dt$x) <- "UTF-8"
Run the as.character
group before the Encoding
group.
It's definitely a kludge and there are more elegant approaches, but if you only have to do this now and then (as I did), then it has three advantages:
1) it only changes what needs changing (important when, as with my project, there is a great deal of work already in the data table that you don't want to risk in a reformat),
2) it doesn't require a lot of space and read/writes in the intermediate stage, and
3)it's fast to write and runs at an acceptable speed for at least the size of data table I'm working with.
Not elegant, but it will get you over this particular hitch very quickly.
Upvotes: 0
Reputation: 126
At the end, it looks like it is a problem of the encoding setup of the connection. By default my connection was setup to utf-8 but my local encoding was setup to latin1. Therefore, my final solution was:
con <- dbConnect(MySQL(), user=user, password=password,dbname=dbname, host=host, port=port)
# With the next line I try to get the right encoding (it works for Spanish keyboards)
encoding <- if(grepl(pattern = 'utf8|utf-8',x = Sys.getlocale(),ignore.case = T)) 'utf8' else 'latin1'
dbGetQuery(con,paste("SET names",encoding))
dbGetQuery(con,paste0("SET SESSION character_set_server=",encoding))
dbGetQuery(con,paste0("SET SESSION character_set_database=",encoding))
dbWriteTable( con, value = dfr, name = table, append = TRUE, row.names = FALSE )
dbDisconnect(con)
Upvotes: 1
Reputation: 126
This works for me in Windows:
write.csv(table, file = "tmp.csv", fileEncoding = "utf8", quote = FALSE, row.names = FALSE)
db <- dbConnect(MySQL(), user = "####", password = "####", dbname ="test", host= "localhost")
dbWriteTable( db, value = "tmp.csv", name = "test1", append = TRUE, row.names = FALSE, sep = ",", quote='\"', eol="\r\n")
Upvotes: 0