Reputation: 4965
I am using the RODBC
package to read data from SQL server. R is reading the Chinese characters as "?????"
I have passed the parameter DBMSencoding = "UTF-8"
to the odbcConnect
function.
Following is the sample code I am using:
Connection <- odbcConnect("abc", uid = "123", pwd = "123",
DBMSencoding = "UTF-8", readOnlyOptimize=T)
Var1 <- sqlQuery(Connection, query, errors = TRUE, stringsAsFactors=F)
May be I didn't pass the arguments the way I am supposed to?
sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RODBC_1.3-12
loaded via a namespace (and not attached):
[1] tools_3.2.3
odbcGetInfo(mainConnection)
DBMS_Name DBMS_Ver Driver_ODBC_Ver Data_Source_Name Driver_Name
"Microsoft SQL Server" "10.50.4000" "03.52" "SQLSRV32.DLL"
Driver_Ver ODBC_Ver Server_Name
"06.01.7601" "03.80.0000"
Upvotes: 2
Views: 6940
Reputation: 335
I got the same problem and successfully solved it. It was quite simple. Go to Control Panel --> Region and Language --> Administrative --> Change system locale --> Chinese.
Upvotes: 0
Reputation: 21
Check the database's character encoding:
select userenv('language') from dual;
SIMPLIFIED CHINESE_CHINA.AL32UTF8
Change your Environment Variable NLS_LANG
before connecting to the database:
Sys.setenv(NLS_LANG="SIMPLIFIED CHINESE_CHINA.AL32UTF8")
Connection <- odbcConnect("abc", uid = "123", pwd = "123", DBMSencoding = "UTF-8", readOnlyOptimize=T)
Upvotes: 2
Reputation: 2543
R on Windows has a lot of problems displaying characters outside of ASCII, even though it is often faithfully representing them internally. There is a lot of information in this answer about why this is the case, and some simple diagnostics in this answer. First try plotting, like:
# first, make sure plotting Chinese works in general
# (i.e., you have an appropriate font)
hanzi <- "漢字"
plot(1, 1, type="n")
text(1, 1, hanzi)
If that works, replace the hanzi <- "漢字"
line with your sql query line to get some Chinese text from your database into a string variable, and try plotting that. If it shows up on the plot, then the characters are being read fine and represented internally fine, and the problem is just displaying them in the console. If plotting worked for the "漢字" string variable but doesn't work for your SQL-extracted string, then at least you know that the problem is actually with the SQL part and not just with display in the console.
Upvotes: 1