Reputation: 5563
I have a CSV file containing Chinese characters, saved as UTF-8.
项目,价格
电视,5000
The first row is the header and the second row is the data. In other words, the data is a one-by-two vector.
I read the file as follows:
amatrix<-read.table("test.csv",encoding="UTF-8",sep=",",header=T,row.names=NULL,stringsAsFactors=FALSE)
However, the header in the output contains unexpected characters, i.e., X.U.FEFF
Upvotes: 0
Views: 721
Reputation: 57696
That is the byte order mark sometimes found in Unicode text files. I'm guessing you're on Windows, since that's the only popular OS where files can end up with them.
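If you want to confirm that the file really starts with a byte order mark, you can look at its first three bytes; for UTF-8 the mark is EF BB BF. A minimal sketch, where has_utf8_bom is a hypothetical helper (not a base R function) and the temporary file with ASCII stand-in data exists only to make the example runnable:

```r
# Build a small file that starts with the UTF-8 BOM (illustration only).
path <- tempfile(fileext = ".csv")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)   # the BOM bytes
writeBin(charToRaw("item,price\nTV,5000\n"), con)
close(con)

# Hypothetical helper: TRUE if the file begins with EF BB BF.
has_utf8_bom <- function(path) {
  first_bytes <- readBin(path, what = "raw", n = 3L)
  identical(first_bytes, as.raw(c(0xEF, 0xBB, 0xBF)))
}

has_utf8_bom(path)
```

For your own file you would call has_utf8_bom("test.csv") instead of using the temporary path.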
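What you can do is read the file using readLines and remove the byte order mark from the start of the first line. With encoding="UTF-8" the mark is read as the single character U+FEFF, so it is safer to strip it by pattern than by position.
txt <- readLines("test.csv", encoding="UTF-8")
# drop a leading U+FEFF if present; leaves the line unchanged otherwise
txt[1] <- sub("^\ufeff", "", txt[1])
amatrix <- read.csv(text=txt, ...)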
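As an alternative to editing the first line by hand, R (3.0.0 and later) accepts "UTF-8-BOM" as a fileEncoding when reading, which drops the mark if it is present. A rough sketch, assuming a temporary file with ASCII stand-in data so the example does not depend on the session locale; with the real file you would pass "test.csv":

```r
# Build a small file that starts with the UTF-8 BOM (illustration only).
path <- tempfile(fileext = ".csv")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)   # the BOM bytes
writeBin(charToRaw("item,price\nTV,5000\n"), con)
close(con)

# "UTF-8-BOM" tells the connection to skip a leading BOM while decoding UTF-8.
amatrix <- read.csv(path, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)

names(amatrix)
```

Note that this uses fileEncoding (how the file is decoded), not encoding (which only marks the strings after they are read).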
Upvotes: 1