Łukasz Deryło

Reputation: 1860

Unicode in column names

As the title says, I'd like to use some Unicode characters in the column names of a data frame.

Toy example:

df<-data.frame(x=1:3, y=4:6)
 
(nm<-c('x\u00b2', 'y\u2082'))
"x²" "y₂"
 
colnames(df)<-nm
 
df
  x2 y2
1  1  4
2  2  5
3  3  6

As you can see, the subscripts and superscripts are converted to "ordinary" digits.

One more try:

 (nm<-c('x\u03B2', 'y\u2082'))
 "xβ" "y₂"
 
 colnames(df)<-nm
 
 df
  xß y2
1  1  4
2  2  5
3  3  6

Now the Greek β is converted to the German ß (even though my Windows locale is Polish...)

Finally, Greek gamma is not converted to another character, but it is printed as an escape code, with a warning:

 (nm<-c('x\u03B3', 'y\u2082'))
 "xγ" "y₂"
 
 colnames(df)<-nm
 
 df
  x<U+03B3> y2
1         1  4
2         2  5
3         3  6
Warning message:
In do.call(data.frame, c(x, alis)) :
  unable to translate 'x<U+03B3>' to native encoding

So, in general: is there a way to avoid converting Unicode characters to their "nearest neighbours"?

EDIT

I know that calling colnames(df) gives appropriate results:

    (nm<-c('x\u00b2', 'y\u2082'))
    "x²" "y₂"
     
    colnames(df)<-nm

    colnames(df)
    "x²" "y₂"

My goal is to get them from a plain df or print(df) call.

Upvotes: 0

Views: 719

Answers (1)

Ian Campbell

Reputation: 24790

I got the characters to stick by bypassing colnames<-:

attr(df,"names") <- nm
print(df)
  xβ y₂
1  1  4
2  2  5
3  3  6

colnames(df)
[1] "xβ" "y₂"

Use at your own risk.

sessionInfo()
#R version 4.0.2 (2020-06-22)
#Platform: x86_64-apple-darwin17.0 (64-bit)
#Running under: macOS Catalina 10.15.7
#
#Matrix products: default
#BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
#LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#
#locale:
#[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
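
A possible way to check whether this is an encoding issue rather than a problem with the stored names. This is only a sketch: it assumes the difference between the two machines comes from the session's native encoding, which the "unable to translate ... to native encoding" warning in the question suggests. It reuses df and nm from the question and only base R functions (l10n_info, enc2native).

```r
df <- data.frame(x = 1:3, y = 4:6)
nm <- c("x\u03B2", "y\u2082")

# Set the names attribute directly, as above
attr(df, "names") <- nm

# The stored names are the original strings, however print() renders them
same_names <- identical(names(df), nm)
print(same_names)

# Is the session's native encoding UTF-8? (It is for the en_US.UTF-8 locale above.)
is_utf8 <- l10n_info()[["UTF-8"]]
print(is_utf8)

# What the names look like after conversion to the native encoding;
# in a non-UTF-8 locale this is where substitutions or <U+....> escapes may appear
print(enc2native(nm))
```

If is_utf8 is FALSE, the substitutions seen when printing probably come from the conversion to the native encoding, and the attr trick may not help there.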

Upvotes: 2
