Eliza Romanski

Reputation: 93

Using foreign characters in R data frame

I tried to import some data (a CSV file) into R, but it is in Hebrew and the text comes out garbled. For example: ׳¨׳׳™׳“׳” ׳₪׳¡׳™׳›׳™׳׳˜׳¨׳™׳” ׳׳ ׳¢׳¦׳׳׳™ 43.61 3 ׳™׳¢׳¨׳™ ׳׳‘׳™׳‘ ׳₪׳¡׳™׳›׳™׳׳˜׳¨׳™׳” ׳׳ ׳¢׳¦׳׳׳™ 45.00 4 ׳׳’׳¨׳‘ ׳׳ ׳˜׳•׳ ׳₪׳¡׳™׳›׳™׳׳˜׳¨׳™׳” ׳׳ ׳¢׳¦

What can I do to keep the Hebrew text? Thank you :)

Upvotes: 0

Views: 259

Answers (1)

red_quark

Reputation: 1001

To read CSV files containing Hebrew characters, you can use the readr package, which is part of the tidyverse. It has utilities for encoding and localization, such as guess_encoding and locale. Try the code below:

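install.packages("tidyverse")  # one-time install; readr ships with the tidyverse
library(readr)
locale("he")  # prints the Hebrew locale settings (date names, etc.)
# List candidate encodings for the file, with a confidence score for each
guess_encoding(file = "path_to_your_file", n_max = 10000, threshold = 0.2)  # replace with your path
# Read the file; swap "UTF-8" for whichever encoding guess_encoding ranked highest
df <- read_csv(file = "path_to_your_file", locale = locale(date_names = "he", encoding = "UTF-8"))  # replace with your path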

guess_encoding helps you determine which encoding your file most likely uses (for example, UTF-8, ISO-8859-8, or Windows-1255): it returns a list of candidate encodings, each with a confidence score. Pass the encoding with the highest confidence to locale(encoding = ...) in read_csv in place of "UTF-8".
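
If you want to wire the two steps together, here is a minimal sketch that feeds the top guess straight into read_csv. The temp file and its contents are made up for illustration (a tiny UTF-8 CSV standing in for your data), and the fallback to "UTF-8" is my own assumption, not something readr does for you. Detection can be unreliable on very short files, so check the result on your real data.

```r
library(readr)

# Hypothetical sample file: a tiny UTF-8 CSV with Hebrew text (stand-in for your data)
path <- tempfile(fileext = ".csv")
rows <- c("name,score",
          "\u05e9\u05dc\u05d5\u05dd,43.61",
          "\u05ea\u05d5\u05d3\u05d4,45.00")
writeLines(enc2utf8(rows), path, useBytes = TRUE)

# guess_encoding() returns a tibble with columns `encoding` and `confidence`
guesses <- guess_encoding(path)
print(guesses)

# Pick the candidate with the highest confidence;
# fall back to UTF-8 if nothing passed the threshold (assumption for this sketch)
best <- if (nrow(guesses) > 0) {
  guesses$encoding[which.max(guesses$confidence)]
} else {
  "UTF-8"
}

# Read the file using the detected encoding
df <- read_csv(path, locale = locale(encoding = best))
print(df)
```

For your own file, replace `path` with the path to your CSV and drop the lines that create the sample file.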

Upvotes: 2
