Reputation: 39613
I am working with a data frame in R that has string-encoding issues. My data frame df looks like this:
df
title
1 José Francisco Salgado - Executive Director and Founder
2 José Francisco Salgado - Executive Director and Founder
The issue is that the strings should have accented characters where the strange symbols appear. I tried the following solution:
#Code
df$title <- iconv(df$title,"UTF-8","latin1")
But it is not working: I get the same strings with the weird symbols. I do not know why, because when I run the same conversion on a single string it does the job:
#Code2
iconv("José Francisco Salgado - Executive Director and Founder","UTF-8","latin1")
[1] "José Francisco Salgado - Executive Director and Founder"
That call restores the accents. How can I fix the whole column so that I get this:
df
title
1 José Francisco Salgado - Executive Director and Founder
2 José Francisco Salgado - Executive Director and Founder
Many thanks.
This is the dput() version of df:
#Data
df <- structure(list(title = c("José Francisco Salgado - Executive Director and Founder",
"José Francisco Salgado - Executive Director and Founder")), row.names = 1:2, class = "data.frame")
Upvotes: 2
Views: 978
Reputation: 31
I had a similar issue. I was accessing an API which returned the data with correct diacritics. However, converting the response with rawToChar() messed them up. So I used content() instead on the response returned by httr::GET.
Ex (I wanted the value for name):
raw_data <- httr::GET(API_URL)
httr::content(raw_data)$name
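A possible reason rawToChar() garbles diacritics is that it returns the bytes without any encoding mark, so R interprets them in the session's native encoding. The sketch below illustrates this without a network call; raw_body is a hypothetical stand-in for a response body, not data from a real API.

```r
# Hypothetical stand-in for an API response body: the UTF-8 bytes of "José".
raw_body <- as.raw(c(0x4a, 0x6f, 0x73, 0xc3, 0xa9))

# rawToChar() returns the bytes as a string with no declared encoding, so R
# falls back to the session's native encoding when it prints the string.
txt <- rawToChar(raw_body)
enc_before <- Encoding(txt)

# Declaring the encoding tells R how to interpret the same bytes.
Encoding(txt) <- "UTF-8"
enc_after <- Encoding(txt)
```

With httr, passing as = "text" and encoding = "UTF-8" to content() should serve the same purpose when the server sends UTF-8.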
Upvotes: 0
Reputation: 19163
One solution is to apply the conversion to each column of the data frame with sapply():
data.frame( sapply( df, function(x) iconv( x, "UTF-8", "LATIN1" ) ) )
title
1 José Francisco Salgado - Executive Director and Founder
2 José Francisco Salgado - Executive Director and Founder
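If the converted strings still display as Ã©, a likely cause (an assumption, not confirmed in this thread) is that iconv() marks its output as latin1 while the bytes it produced are actually valid UTF-8. A minimal sketch that re-declares the encoding after the conversion, using \u escapes so the example does not depend on the encoding of the script file:

```r
# Mojibake input: the UTF-8 bytes of "é" (C3 A9) were misread as latin1 and
# re-encoded, which gives "Ã©" (U+00C3 followed by U+00A9).
df <- data.frame(
  title = rep("Jos\u00c3\u00a9 Francisco Salgado - Executive Director and Founder", 2),
  stringsAsFactors = FALSE
)

# Step 1: convert UTF-8 -> latin1. Each mis-decoded pair collapses back to
# the original bytes (C3 A9), but iconv() marks the result as latin1.
fixed <- iconv(df$title, from = "UTF-8", to = "latin1")

# Step 2: re-declare those bytes as UTF-8 so R reads C3 A9 as "é".
Encoding(fixed) <- "UTF-8"

df$title <- fixed
```

This only helps when the column holds exactly one layer of UTF-8-read-as-latin1 mojibake; strings that were already correct could be damaged by the same conversion.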
Upvotes: 2