Luciano Selzer

Reputation: 10016

Use mutate_at to change multiple column types

I'm trying to use dplyr to tidy a dataset. The columns I want to change hold character strings that are really doubles, but with a comma instead of a decimal point. So far I've got this:

presupuesto_2016 <- read_csv( "http://datos.gob.ar/dataset/89f1a2dd-ad79-4211-87b4-44661d81ac0d/resource/84e23782-7d52-4724-a4ba-2f9621fa5f4e/download/presupuesto-2016.csv")

names(presupuesto_2016) <- str_replace(names(presupuesto_2016), "\uFEFF", "")

presupuesto_2016 %>%
  mutate_at(starts_with("monto_"),
            str_replace, pattern = ",", replacement = "\\.") %>% 
  mutate_at(starts_with("monto_"), funs(as.numeric))

But this ends up changing every column to numeric. What am I doing wrong here?

Upvotes: 8

Views: 16038

Answers (2)

akuiper

Reputation: 214987

If you want to use mutate_at with column selection helper functions, the helpers have to be wrapped in vars() to work properly. Take a look at ?mutate_at:

presupuesto_2016 %>%
  mutate_at(vars(starts_with("monto_")),
  #         ^^^ 
            str_replace, pattern = ",", replacement = "\\.") %>% 
  mutate_at(vars(starts_with("monto_")), funs(as.numeric))
  #         ^^^ 
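
For anyone who wants to try this without downloading the file, here is a minimal sketch on a made-up tibble. The column names and values are hypothetical stand-ins, not taken from the real dataset. It shows the vars() form from above (passing as.numeric directly rather than through funs(), which is deprecated since dplyr 0.8.0) and, assuming dplyr 1.0.0 or later, the same conversion written with across(), which supersedes mutate_at:

```r
library(dplyr)
library(stringr)

# Hypothetical toy data: one label column, two "monto_" columns stored as text
toy <- tibble(
  jurisdiccion  = c("A", "B"),
  monto_vigente = c("1234,5", "0,75"),
  monto_pagado  = c("10,25", "3,0")
)

# Scoped-verb form: the selection helper is wrapped in vars()
old_style <- toy %>%
  mutate_at(vars(starts_with("monto_")),
            str_replace, pattern = fixed(","), replacement = ".") %>%
  mutate_at(vars(starts_with("monto_")), as.numeric)

# across() form (assumes dplyr >= 1.0.0): one step, same columns
result <- toy %>%
  mutate(across(starts_with("monto_"),
                ~ as.numeric(str_replace(.x, fixed(","), "."))))
```

In both versions only the monto_ columns should be converted and jurisdiccion stays character. Note that str_replace only replaces the first match, which is enough for a single decimal comma but not for values that also contain grouping separators.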

Upvotes: 29

hrbrmstr

Reputation: 78822

Why not just do:

URL <- "http://datos.gob.ar/dataset/89f1a2dd-ad79-4211-87b4-44661d81ac0d/resource/84e23782-7d52-4724-a4ba-2f9621fa5f4e/download/presupuesto-2016.csv"
presupuesto_2016 <- read_csv(URL, locale=locale(decimal_mark=","))

Also, I'd suggest doing:

fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)
presupuesto_2016 <- read_csv(fil, locale=locale(decimal_mark=","))

to save on your and that site's bandwidth, speed up future processing, and ensure reproducibility in the event that the site goes offline or you do.
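
To see what the locale argument does without hitting the network, here is a small sketch that writes a hypothetical two-row CSV to a temp file and reads it back. The file contents and column names are invented for illustration, and the values are quoted so the decimal comma does not clash with the field delimiter. The column type is spelled out with col_types only to keep the sketch deterministic; the calls above rely on readr's type guessing instead:

```r
library(readr)

# Hypothetical stand-in for the real file: quoted values with comma decimals
tmp <- tempfile(fileext = ".csv")
writeLines(c('jurisdiccion,monto_vigente',
             'A,"1234,5"',
             'B,"0,75"'), tmp)

# decimal_mark = "," tells readr how to parse numbers in this file
toy <- read_csv(
  tmp,
  locale    = locale(decimal_mark = ","),
  col_types = cols(jurisdiccion  = col_character(),
                   monto_vigente = col_double())
)
```

With this approach the monto column should come back as a double straight from read_csv, so no str_replace or as.numeric step is needed afterwards.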

Upvotes: 6
