Reputation: 13
my Rstudio Project and System Setup is on UTF8
sadly my database contains some old legacy tables in latin1
i want to pull the latin table into my UTF8 workflow without killing all my öüä :D
some scripts later on in my workflow dont like mixed encoding.
### connect db
my_db <- dbPool(RMySQL::MySQL(),
%%%% my private DB-credentials %%%% )
### read data
my_latin<-my_db %>% dbGetQuery("SELECT * FROM legacy_sqltable;")
my_utf8<-my_db %>% dbGetQuery("SELECT * FROM new_sqltable;")
### join data
fulldata_in_utf8 <- full_join(my_latin,my_utf8)
Upvotes: 1
Views: 2079
Reputation: 2689
Alternative answer for MS SQL Server:
If you're using odbc
or DBI::dbConnect
to connect to your database, you simply need to specify the database encoding. Everything else should be automatic.
e.g.
con <-DBI::dbConnect(odbc::odbc(), DSN_name,
UID=<username>,
PWD=<password>,
Database=<database>,
encoding="latin1") # <---- Specify the database encoding here!
Upvotes: 4
Reputation: 142278
You would have to work harder to get latin1 bytes from some SELECTs
and utf8 bytes from other SELECTs
.
I don't happen to know the R way of establishing the client as using utf8 during the connection, but here is a way to do it after connecting:
rs <- dbSendQuery(con, 'SET NAMES utf8')
If you are JOINing
a tables via columns with different CHARACTER SETs
(or COLLATIONs
), it may work. (There are a few cases where it will spit at you, but they can be worked around.)
To do a blanket change of all the columns in a particular table:
ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4;
Note: Outside MySQL, it is called UTF-8
. Inside MySQL it is either utf8
(for the 3-byte subset) or utf8mb4
(for the full UTF-8
set).
Upvotes: 0
Reputation: 93
You can adjust the encoding of a variable in R with the function iconv or Encoding.
Here is the link to the page. The following function works for changing the encoding of your dataframe to latin1 or just making sure that the r knows that is latin-1
Encoding(my_lating) <- "latin1"
If you want to successfully join them, I suggest to encode the data frames to the same encoding.
Upvotes: 0