John

Reputation: 1828

Printing UTF-8 characters in data frames in R (RStudio on Windows)

When a data frame contains UTF-8 characters, they are not displayed properly.

For example, the following is correct:

> "\U6731"
[1] "朱"

But when I put it in a data frame and print it, I get this:

> data.frame(x="\U6731")
         x
1 <U+6731>

Since the string itself prints correctly, I believe the problem lies in how data frames are printed rather than in the encoding of the data.
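
One way to check that the stored value is intact, and that only the printing is affected, is to inspect the column directly. This is a minimal sketch: `stringsAsFactors = FALSE` is added here so the column stays a character vector, and the variable names `enc`, `code_point` and `same` are only for illustration.

```r
# Build the same data frame, keeping x as a character vector
# (stringsAsFactors = FALSE is an addition for this check).
df <- data.frame(x = "\U6731", stringsAsFactors = FALSE)

# The marked encoding of the stored string
enc <- Encoding(df$x)

# The Unicode code point of the stored character, formatted as hex
code_point <- sprintf("U+%04X", utf8ToInt(df$x))

# TRUE if the value in the data frame equals the original literal
same <- identical(df$x, "\U6731")
```

If `same` is `TRUE` and `code_point` is `"U+6731"`, the data frame holds the right character and only its display is off.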

Is there a direct way to print 朱 instead of <U+6731>?

I have to use Windows at my company, so switching to Linux is not feasible for me.

Upvotes: 4

Views: 1722

Answers (2)

Patrick Perry

Reputation: 1482

The corpus package has a workaround for this bug. Either do this:

library(corpus)
df <- data.frame(x = "\U6731")
print.corpus_frame(df)

Or else do this:

class(df) <- c("corpus_frame", "data.frame")
df

Upvotes: 3

Alex Knorre

Reputation: 630

You are right: printing the whole data frame shows the escape codes instead of the UTF-8 characters:

> data.frame(x="\U6731")
         x
1 <U+6731>

But if you select individual columns or rows, they print correctly:

# through the column name
> data.frame(x="\U6731")$x
[1] 朱
Levels: 朱

# through the column index
> data.frame(x="\U6731")[,1]
[1] 朱
Levels: 朱

# through the row index
> data.frame(x="\U6731")[1,]
[1] 朱
Levels: 朱
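
Building on that observation, a possible sketch is to write the data frame out column by column with `cat()` rather than going through `print.data.frame()`. The helper name `print_utf8_frame` is made up for this example and is not part of base R or any package. I have not verified that `cat()` avoids the `<U+6731>` escape in every Windows console, so treat this as an assumption to test.

```r
# Hypothetical helper (not from base R or any package): write a data
# frame as tab-separated lines with cat() instead of print.data.frame().
print_utf8_frame <- function(df) {
  # Convert every column to character so factors show their labels
  cols <- lapply(df, as.character)
  # Paste the columns together row by row, separated by tabs
  rows <- do.call(paste, c(unname(cols), sep = "\t"))
  # Header line built from the column names
  header <- paste(names(df), collapse = "\t")
  out <- c(header, rows)
  cat(out, sep = "\n")
  # Return the lines invisibly so they can be reused or checked
  invisible(out)
}

df <- data.frame(x = "\U6731")
res <- print_utf8_frame(df)
```

The output is not aligned the way `print.data.frame()` aligns it, which is the price of bypassing the formatting step that produces the escape codes.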

Not sure if this helps. Could you be more specific about why and how exactly you need to output these characters?

Upvotes: 1
