Reputation: 162
I'm trying to export a data frame with Arabic text in R.
When R imports Arabic text, it displays it as Unicode code point escapes, like this:
<U+0627><U+0644><U+0641><U+0631><U+0639> <U+0627><U+0644><U+062A><U+0634><U+0631><U+064A><U+0639><U+064A><U+060C> <U+0627><U+0644><U+0641><U+0631><U+0639> <U+0627><U+0644><U+062A><U+0646><U+0641><U+064A><U+0630><U+064A><U+060C><U+0627><U+0644><U+0641><U+0631><U+0639> <U+0627><U+0644><U+0642><U+0636><U+0627><U+0626><U+064A>. <U+0627><U+0644><U+062D><U+0643><U+0648><U+0645><U+0629> <U+0627><U+0644><U+0641><U+062F><U+0631><U+0627><U+0644><U+064A>
Unfortunately, I can't get it to turn back into readable Arabic when exporting. Below is the code I'm using:
write.csv(,"data.csv", fileEncoding='UTF-8')
Anybody have a solution?
Also, here is my session info.
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_0.9.3.1
loaded via a namespace (and not attached):
[1] colorspace_1.2-2 dichromat_2.0-0 digest_0.6.3 grid_3.0.1 gtable_0.1.2
[6] labeling_0.2 MASS_7.3-27 munsell_0.4.2 plyr_1.8 proto_0.3-10
[11] RColorBrewer_1.0-5 reshape2_1.2.2 scales_0.2.3 stringr_0.6.2 tools_3.0.1
Upvotes: 1
Views: 3480
Reputation: 1
This is 10 years too late but I faced a similar issue and decided to post my solution in case it helps someone else in the future.
My solution was to use the package 'openxlsx'.
First, you'll need to install and load the openxlsx package if you haven't already:
install.packages("openxlsx")
library(openxlsx)
Then, you can modify your code to write an xlsx file instead of csv with UTF-8 Encoding like this:
write.xlsx(,"data.xlsx", rowNames = FALSE, showNA = FALSE, encoding = "UTF-8")
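A minimal round-trip sketch of this approach, under some assumptions: the data frame `df`, its columns, and the temporary file path are hypothetical stand-ins, and only the `rowNames` and `overwrite` arguments are passed. As far as I know, `showNA` comes from the separate `xlsx` package, and xlsx files store text as UTF-8 internally, so no encoding argument should be needed. The code is skipped if openxlsx is not installed.

```r
# Hypothetical example data: two short Arabic strings written as \u escapes
df <- data.frame(
  id = 1:2,
  text = c("\u0627\u0644\u0641\u0631\u0639",
           "\u0627\u0644\u062D\u0643\u0648\u0645\u0629"),
  stringsAsFactors = FALSE
)

# Write to a temporary location rather than the working directory
xlsx_file <- file.path(tempdir(), "data.xlsx")

if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Write the data frame without row names, replacing any existing file
  openxlsx::write.xlsx(df, xlsx_file, rowNames = FALSE, overwrite = TRUE)

  # Read the sheet back and compare the text column with the original
  round_trip <- openxlsx::read.xlsx(xlsx_file)
  print(identical(round_trip$text, df$text))
} else {
  message("openxlsx is not installed; skipping the xlsx round trip")
}
```

If the comparison prints FALSE, the text was altered before it reached the workbook, which would point at the import step rather than the export.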
Hope this helps someone!
Upvotes: 0
Reputation: 66
This code worked for me, so I am sharing it:
Sys.setlocale("LC_CTYPE", "arabic" )
write.csv(group$message, file = 'posts.txt', fileEncoding = "UTF-8")
If you save the file as csv it will not work. You have to save it as txt.
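A different technique that does not depend on the locale, as a hedged sketch: convert the strings to UTF-8 and write the bytes untouched with `writeLines(..., useBytes = TRUE)`, so R does not re-encode them through the native code page. The vector `posts` and the temporary path are hypothetical stand-ins for `group$message` and `posts.txt`.

```r
# Hypothetical stand-in for group$message
posts <- c("\u0627\u0644\u0641\u0631\u0639",
           "\u0627\u0644\u062D\u0643\u0648\u0645\u0629")

out_file <- file.path(tempdir(), "posts.txt")

# Open in binary mode so no line-ending or encoding translation is applied
con <- file(out_file, open = "wb")
# enc2utf8() makes sure the strings are UTF-8; useBytes = TRUE writes them as-is
writeLines(enc2utf8(posts), con, useBytes = TRUE)
close(con)

# Read the file back, declaring the bytes as UTF-8
read_back <- readLines(out_file, encoding = "UTF-8")
print(identical(read_back, posts))
```

This writes one string per line rather than a full CSV, so it suits a single text column like the one above.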
Upvotes: 5
Reputation: 1929
You'll have to install and use locales. It's difficult and sometimes doesn't work.
There are some solutions and code offered in the question "Writing data isn't preserving encoding".
Keep in mind that you actually HAVE to install language packs for your operating system, and for some Windows versions none are available separately at all.
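To check whether a locale is actually installed before relying on it, something like the following sketch may help. The helper name `locale_available` is hypothetical. It relies on `Sys.setlocale()` returning an empty string when the request cannot be honored, and it restores the previous setting afterwards. The name "arabic" is the Windows-style locale name used in another answer here; other systems use different names.

```r
# Hypothetical helper: TRUE if the named LC_CTYPE locale can be set
locale_available <- function(name) {
  old <- Sys.getlocale("LC_CTYPE")
  # Restore the original locale when the function exits
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) if the locale is unavailable
  res <- suppressWarnings(Sys.setlocale("LC_CTYPE", name))
  !identical(res, "")
}

print(locale_available("arabic"))

# Also report whether the current session already runs in a UTF-8 locale
print(l10n_info()[["UTF-8"]])
```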
Upvotes: 2