Derek Corcoran

Reputation: 4082

Failed to use map2 with mutate with purrr and dplyr

I am reading a list of files from my computer and applying several transformations to them with purrr and dplyr. Everything works well so far, but I also have a vector with the ID of each data frame, and I want to add a column to each data frame containing its ID.

Loading libraries
library(readr)
library(lubridate)
library(dplyr)
library(purrr)

Listing the files to be read and modified

ArchivosTemp <- list.files(pattern = "Tem.csv")

For reproducibility

Let's say the list of data frames, called Temperaturas, produced by the first step of the code is

Temperaturas <- list(structure(list(`Date/Time` = c("01-07-2016 14:55", "01-07-2016 15:55", 
"01-07-2016 16:55", "01-07-2016 17:55", "01-07-2016 18:55", "01-07-2016 19:55"
), Unit = c("C", "C", "C", "C", "C", "C"), Value = c(28L, 24L, 
25L, 25L, 25L, 25L), a = c(68L, 682L, 182L, 182L, 182L, 182L)), .Names = c("Date/Time", 
"Unit", "Value", "a"), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(`Date/Time` = c("12-06-2016 19:44", 
"12-06-2016 20:44", "12-06-2016 21:44", "12-06-2016 22:44", "12-06-2016 23:44", 
"13-06-2016 0:44"), Unit = c("C", "C", "C", "C", "C", "C"), Value = c(31L, 
29L, 27L, 26L, 26L, 24L), a = c(129L, 131L, 632L, 633L, 133L, 
633L)), .Names = c("Date/Time", "Unit", "Value", "a"), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
`Date/Time` = c("07-06-16 7:54:01", "07-06-16 8:54:01", "07-06-16 9:54:01", 
"07-06-16 10:54:01", "07-06-16 11:54:01", "07-06-16 12:54:01"
), Unit = c("C", "C", "C", "C", "C", "C"), Value = c(23L, 
19L, 25L, 27L, 30L, 34L), a = c("119", "116", "119", "119", 
"118", "113")), .Names = c("Date/Time", "Unit", "Value", 
"a"), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
)))

and a vector with the ID of each element of the list

IDs <- c("H1F102", "H1F105", "H1F106")

The purrr code that is working so far

a <- ArchivosTemp %>% map(read_csv) %>% map(~rename(.x, Temperatura = Value, Date.Time = `Date/Time`)) %>% map(~mutate(.x, Date.Time = dmy_hms(Date.Time))) %>% map(~select(.x, Date.Time, Temperatura))

Since you can't read the CSVs from my computer, let's replace ArchivosTemp %>% map(read_csv) with the list that I made above

a <- Temperaturas %>% map(~rename(.x, Temperatura = Value, Date.Time = `Date/Time`)) %>% map(~mutate(.x, Date.Time = dmy_hms(Date.Time))) %>% map(~select(.x, Date.Time, Temperatura))

Then I want each of the 3 data frames to have a column called ID holding its corresponding element of the IDs vector. I tried this:

a <- Temperaturas %>% map(~rename(.x, Temperatura = Value, Date.Time = `Date/Time`)) %>% map(~mutate(.x, Date.Time = dmy_hms(Date.Time))) %>% map(~select(.x, Date.Time, Temperatura))  %>% map2(y = IDs,~mutate(.x, ID = y.))

but it does not work. Any ideas what I am doing wrong?

Expected outcome

As an example, this is the result I expect, using only the first data frame

a <- Temperaturas %>% map(~rename(.x, Temperatura = Value, Date.Time = `Date/Time`)) %>% map(~mutate(.x, Date.Time = dmy_hms(Date.Time))) %>% map(~select(.x, Date.Time, Temperatura))

mutate(a[[1]], ID = IDs[1])

which turns into

# A tibble: 6 x 3
            Date.Time Temperatura     ID
               <dttm>       <int>  <chr>
1 2020-07-01 16:14:55          28 H1F102
2 2020-07-01 16:15:55          24 H1F102
3 2020-07-01 16:16:55          25 H1F102
4 2020-07-01 16:17:55          25 H1F102
5 2020-07-01 16:18:55          25 H1F102
6 2020-07-01 16:19:55          25 H1F102

Upvotes: 0

Views: 1202

Answers (1)

akuiper

Reputation: 214927

You have a minor argument-naming problem with map2: its arguments are named .x and .y, so both the named argument and the reference inside the formula must be .y. Changing y to .y works for me:

map2(.y = IDs, ~ mutate(.x, ID = .y))
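To see the .x/.y pairing in isolation, here is a minimal sketch on toy data. The objects dfs, ids and with_ids are made up for illustration and are not part of the question's data; the arguments are passed by position, which is equivalent to naming them .x and .y.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the question's list of data frames and ID vector
dfs <- list(
  tibble(Temperatura = c(28L, 24L)),
  tibble(Temperatura = c(31L, 29L))
)
ids <- c("H1F102", "H1F105")

# Inside the formula, .x is the current data frame and .y is the matching ID;
# the length-1 ID is recycled down every row of that data frame
with_ids <- map2(dfs, ids, ~ mutate(.x, ID = .y))

with_ids[[2]]
```

Each element of with_ids should keep its original rows and gain an ID column filled with the ID at the same position in ids.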

In addition, if you eventually need to bind all elements of the list into a single data frame, you can name the list with the IDs vector using set_names and then pass the .id argument to map_df. map_df maps over the list, combines the resulting data frames with bind_rows, and stores the list names in a new column whose name is given by .id:

Temperaturas %>% 
    set_names(IDs) %>% 
    map_df(~ transmute(.x, Date.Time=dmy_hms(`Date/Time`), Temperatura=Value), .id="ID")

# A tibble: 18 x 3
#       ID           Date.Time Temperatura
#    <chr>              <dttm>       <int>
# 1 H1F102 2020-07-01 16:14:55          28
# 2 H1F102 2020-07-01 16:15:55          24
# 3 H1F102 2020-07-01 16:16:55          25
# 4 H1F102 2020-07-01 16:17:55          25
# 5 H1F102 2020-07-01 16:18:55          25
# 6 H1F102 2020-07-01 16:19:55          25
# 7 H1F105 2020-06-12 16:19:44          31
# 8 H1F105 2020-06-12 16:20:44          29
# 9 H1F105 2020-06-12 16:21:44          27
#10 H1F105 2020-06-12 16:22:44          26
#11 H1F105 2020-06-12 16:23:44          26
#12 H1F105 2020-06-13 16:00:44          24
#13 H1F106 2016-06-07 07:54:01          23
#14 H1F106 2016-06-07 08:54:01          19
#15 H1F106 2016-06-07 09:54:01          25
#16 H1F106 2016-06-07 10:54:01          27
#17 H1F106 2016-06-07 11:54:01          30
#18 H1F106 2016-06-07 12:54:01          34

Note also that transmute works as shorthand for rename %>% mutate %>% select.
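As a rough check of that equivalence, the sketch below runs both pipelines on a small made-up tibble shaped like the question's third data frame. The names df, long_form and short_form are hypothetical.

```r
library(dplyr)
library(lubridate)

# Hypothetical two-row sample using the third data frame's date format
df <- tibble(
  `Date/Time` = c("07-06-16 7:54:01", "07-06-16 8:54:01"),
  Unit = c("C", "C"),
  Value = c(23L, 19L)
)

# Three-step version from the question
long_form <- df %>%
  rename(Temperatura = Value, Date.Time = `Date/Time`) %>%
  mutate(Date.Time = dmy_hms(Date.Time)) %>%
  select(Date.Time, Temperatura)

# One-step version: transmute keeps only the columns it creates
short_form <- df %>%
  transmute(Date.Time = dmy_hms(`Date/Time`), Temperatura = Value)

# Compare the two results
all.equal(long_form, short_form)
```

The column order in transmute follows the order in which the new columns are written, so Date.Time is listed first to match the select call.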

Upvotes: 4
