character encoding, dplyr with database (postgresql)

Question

I've read the threads and package updates on encoding issues with Shiny, but I have a database-driven Shiny app (hard to turn into a reproducible example) that is mangling some special characters.

In my PostgreSQL database my Swedish river appears correctly as "Upper Umeälven River". When I pull it back into the Shiny interface with dplyr:

names.rivers <- filter(tbl.rivers, Country == "Sweden")

...it becomes "Upper UmeÃ¤lven River" in R.

I'm using UTF-8 encoding locally; I guess I'm losing something on the exchange with the database.

Sys.getlocale()
[1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"

Apologies again for the lack of an example; the issue ONLY appears when pulling from the database. I suspect I'm missing a flag on some sanitizing function somewhere, but I need some help getting pointed in the right direction.

Jeff · Accepted Answer

As suspected, the answer was simple: iconv(vector.to.convert, "UTF-8")
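To illustrate why this works, here is a minimal base-R sketch that needs no database. It assumes the driver hands back UTF-8 bytes without an encoding mark, which is my reading of the symptom rather than something confirmed for this setup. The variable names are made up for the example. Note that iconv(x, "UTF-8") passes "UTF-8" as the `from` argument and leaves `to` at its default (the native encoding); the sketch sets `to = "UTF-8"` explicitly so it behaves the same in any locale.

```r
# UTF-8 bytes for the river name, with the encoding mark stripped,
# standing in for what arrives from the database (assumption).
from_db <- rawToChar(charToRaw(enc2utf8("Upper Ume\u00e4lven River")))

# The symptom: the same bytes interpreted as Latin-1 / Windows-1252,
# so the two bytes of "ä" show up as two separate characters.
misread <- iconv(from_db, from = "latin1", to = "UTF-8")

# The fix: declare the incoming bytes as UTF-8.
# (The answer's call is iconv(x, "UTF-8"), i.e. from = "UTF-8", to = "".)
fixed <- iconv(from_db, from = "UTF-8", to = "UTF-8")

print(misread)
print(fixed)
print(Encoding(fixed))
```

If the bytes are already correct and only the mark is missing, setting Encoding(x) <- "UTF-8" is another way to declare them without converting anything.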

My "learnings":

Encodings of the source file, the database, and data streams are not the same thing;
I spent time making sure the data sources had been created in the correct encoding, ignoring the (implicit?) conversion of the datastream;
This page helped: http://shiny.rstudio.com/articles/unicode.html
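Since the conversion happens on the data stream, it can be handy to apply the same fix to every character column of a fetched result at once. The sketch below is an assumption-laden illustration: mark_utf8_columns is a hypothetical helper (not part of dplyr or any database package), and the data frame is a stand-in for a collected query result, not real data.

```r
# Hypothetical helper: re-declare every character column as UTF-8.
mark_utf8_columns <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], iconv, from = "UTF-8", to = "UTF-8")
  df
}

# Stand-in for a fetched result whose strings carry no encoding mark.
rivers <- data.frame(
  id = 1L,
  Name = rawToChar(charToRaw(enc2utf8("Upper Ume\u00e4lven River"))),
  Country = "Sweden",
  stringsAsFactors = FALSE
)

print(Encoding(rivers$Name))   # encoding mark before the fix
rivers_fixed <- mark_utf8_columns(rivers)
print(Encoding(rivers_fixed$Name))
print(rivers_fixed$Name)
```

Checking Encoding() and charToRaw() on a suspicious value is a quick way to see whether the bytes or only the declared encoding are wrong.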

My understanding is a bit shallow, but - frankly - I'm not digging deeper into the world of character encoding for the moment. I hope it helps someone else avoid the error!
