Reputation: 335
I have an Rmd file encoded in UTF-8, but when I knit it, the output of inline R expressions and code chunks is missing some Czech characters. Everything is fine when I type the text outside of chunks. When I read the same text from a file, the inline output is correct, but printing it (`print` or `cat`) from within a chunk is not. I am completely confused about the situation, especially the `cat` behaviour.
I am on Windows. Checking the encoding in the console returns UTF-8. The locale is set to English_United Kingdom.1252.
---
title: "test"
output: html_document
---
```{r}
txt <- "Čeština funguje"
print(Encoding(txt))
print(txt) # prints incorrectly
```
Čeština funguje # prints correctly
`r txt` # prints incorrectly
```{r}
cat(txt) # prints incorrectly
```
```{r, results='asis'}
line <- readLines("line", encoding = "UTF-8")
print(Encoding(line))
print(line) # prints incorrectly
cat(line) # prints incorrectly
```
`r line` # prints correctly!
P.S. I know a lot has been said about R and encoding on Windows, but despite extensive searching I can't find a solution and don't fully understand this behaviour. I am guessing I need to set some locale, but my efforts so far have been in vain.
Upvotes: 3
Views: 366
Reputation: 30114
Until R supports UTF-8 natively on Windows, you usually have to set the locale to the specific language if you want to use multi-byte characters from that language. For example, you need to use the Czech locale instead of English if you want `print()`/`cat()` to output Czech characters properly. The locale needs to be set before knitting happens, e.g., you may set it in your `~/.Rprofile`:
Sys.setlocale(, 'Czech')
I have never used Czech before and am not sure if `'Czech'` is a proper value, but that's the idea (I have had success with other languages before).
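A slightly more defensive version of the same idea, as a rough sketch rather than a tested recipe: it only touches the locale on Windows and tells you when the request was rejected. The locale name `"Czech"` is still an assumption, and the wording of the message is made up for illustration.

```r
# ~/.Rprofile (sketch only; "Czech" is an assumed locale name)
if (.Platform$OS.type == "windows") {
  # Sys.setlocale() returns "" and issues a warning when the locale cannot be set
  res <- suppressWarnings(Sys.setlocale("LC_ALL", "Czech"))
  if (!nzchar(res)) {
    message("Could not set locale 'Czech'; still using: ", Sys.getlocale("LC_CTYPE"))
  }
}
```

After restarting R, `Sys.getlocale("LC_CTYPE")` should report a Czech locale if the call succeeded, and that is worth checking before you knit again.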
Upvotes: 3