Lola1993

Reputation: 161

Mutate within a for loop

I have a dataframe like this

dataframe <- structure(list(a = c(1, 3, 4, 6, 3, 2, 5, 1), b = c(1, 3, 4, 
2, 6, 7, 2, 6), c = c(6, 3, 6, 5, 3, 6, 5, 3), d = c(6, 2, 4, 
5, 3, 7, 2, 6), e = c(1, 2, 4, 5, 6, 7, 6, 3), f = c(2, 3, 4, 
2, 2, 7, 5, 2)), .Names = c("Love_ABC", "Love_CNN", "Hate_ABC", 
"Hate_CNN", "Love_CNBC", "Hate_CNBC"), row.names = c(NA, 
8L), class = "data.frame")

I have made the following for loop

channels = c("ABC", "CNN", "CNBC")

for (channel in channels) { 
dataframe <- dataframe %>%
  mutate(ALL_channel = Love_channel + Hate_channel)
  }

But when I run the for loop, R tells me "object 'Love_channel' not found". Have I done something wrong in the for loop?

Upvotes: 1

Views: 253

Answers (2)

Edo

Reputation: 7818

This is a solution with dplyr and tidyr:

library(tidyr)
library(dplyr)

dataframe <- dataframe %>%
  tibble::rowid_to_column()

dataframe %>% 
  pivot_longer(-rowid, names_to = c(NA, "channel"), names_sep = "_") %>% 
  pivot_wider(names_from = channel, names_prefix = "ALL_", values_from = value, values_fn = sum) %>% 
  right_join(dataframe, by = "rowid") %>% 
  select(-rowid)
#> # A tibble: 8 x 9
#>   ALL_ABC ALL_CNN ALL_CNBC Love_ABC Love_CNN Hate_ABC Hate_CNN Love_CNBC Hate_CNBC
#>     <dbl>   <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>     <dbl>     <dbl>
#> 1       7       7        3        1        1        6        6         1         2
#> 2       6       5        5        3        3        3        2         2         3
#> 3      10       8        8        4        4        6        4         4         4
#> 4      11       7        7        6        2        5        5         5         2
#> 5       6       9        8        3        6        3        3         6         2
#> 6       8      14       14        2        7        6        7         7         7
#> 7      10       4       11        5        2        5        2         6         5
#> 8       4      12        5        1        6        3        6         3         2

The idea is to reshape the data so that the sums become easier to compute. Then you can join the result back to the initial dataframe.

  • start by uniquely identifying each row with a rowid.
  • reshape with pivot_longer so that all values sit neatly in one column. In this step you also split the names Love/Hate_channel in two and drop the Love/Hate part, since you are interested only in the channel (that is what the NA in names_to does).
  • reshape again with pivot_wider: this time you want one column for each channel. In this step you also sum the Love and Hate values for each rowid and channel (that's what values_fn = sum does), and you add a prefix (names_prefix = "ALL_") so that the new column names match your expected final result.
  • with right_join you add the new columns back to the original dataframe. You no longer need rowid, so you can remove it.
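
The steps above can also be run one at a time to inspect the intermediate shapes. This is only a sketch of the same pipeline: the names with_id, long, wide and result are made up for illustration, and the data frame is rebuilt here with its final column names so that the snippet runs on its own. It assumes tidyr 1.1.0 or later, where values_fn accepts a bare function.

```r
library(dplyr)
library(tidyr)

# Same values as in the question, written with the final column names.
dataframe <- data.frame(
  Love_ABC  = c(1, 3, 4, 6, 3, 2, 5, 1),
  Love_CNN  = c(1, 3, 4, 2, 6, 7, 2, 6),
  Hate_ABC  = c(6, 3, 6, 5, 3, 6, 5, 3),
  Hate_CNN  = c(6, 2, 4, 5, 3, 7, 2, 6),
  Love_CNBC = c(1, 2, 4, 5, 6, 7, 6, 3),
  Hate_CNBC = c(2, 3, 4, 2, 2, 7, 5, 2)
)

# Step 1: give every row a unique id (hypothetical name: with_id).
with_id <- tibble::rowid_to_column(dataframe)

# Step 2: long format with columns rowid, channel, value.
# The NA in names_to discards the Love/Hate part of each column name.
long <- with_id %>%
  pivot_longer(-rowid, names_to = c(NA, "channel"), names_sep = "_")

# Step 3: one ALL_<channel> column per channel; values_fn = sum adds
# the Love and Hate values that share the same rowid and channel.
wide <- long %>%
  pivot_wider(names_from = channel, names_prefix = "ALL_",
              values_from = value, values_fn = sum)

# Step 4: attach the original columns again and drop the helper id.
result <- wide %>%
  right_join(with_id, by = "rowid") %>%
  select(-rowid)
```

Printing long and wide after each step is a quick way to check that the reshape does what you expect before joining.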

Upvotes: 2

Cole

Reputation: 11255

Here's a way with rlang. Note, reshaping the data is likely more straightforward. Non-standard evaluation (NSE) is a complicated topic.

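library(dplyr)
library(rlang)  # provides sym()

# `dataframe` and `channels` are as defined in the question
for (channel in channels) { 
  dataframe <- dataframe %>%
    mutate(!!sym(paste0("ALL_", channel)) := !!sym(paste0("Love_", channel)) + !!sym(paste0("Hate_", channel)))
}
dataframe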

##   Love_ABC Love_CNN Hate_ABC Hate_CNN Love_CNBC Hate_CNBC ALL_ABC ALL_CNN ALL_CNBC
## 1        1        1        6        6         1         2       7       7        3
## 2        3        3        3        2         2         3       6       5        5
## 3        4        4        6        4         4         4      10       8        8
## 4        6        2        5        5         5         2      11       7        7
## 5        3        6        3        3         6         2       6       9        8
## 6        2        7        6        7         7         7       8      14       14
## 7        5        2        5        2         6         5      10       4       11
## 8        1        6        3        6         3         2       4      12        5
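
The original loop fails because mutate() reads Love_channel as a literal column name and never substitutes the loop variable into it. As a possible alternative to !!sym(), here is a sketch using the .data pronoun for the inputs and a glue-style name on the left of :=. It assumes dplyr 1.0.0 or later (with rlang 0.4.3 or later for the "{channel}" syntax); the data frame is rebuilt with its final column names so that the snippet runs on its own.

```r
library(dplyr)

# Same values as in the question, written with the final column names.
dataframe <- data.frame(
  Love_ABC  = c(1, 3, 4, 6, 3, 2, 5, 1),
  Love_CNN  = c(1, 3, 4, 2, 6, 7, 2, 6),
  Hate_ABC  = c(6, 3, 6, 5, 3, 6, 5, 3),
  Hate_CNN  = c(6, 2, 4, 5, 3, 7, 2, 6),
  Love_CNBC = c(1, 2, 4, 5, 6, 7, 6, 3),
  Hate_CNBC = c(2, 3, 4, 2, 2, 7, 5, 2)
)
channels <- c("ABC", "CNN", "CNBC")

for (channel in channels) {
  dataframe <- dataframe %>%
    mutate(
      # "ALL_{channel}" builds the new column name from the loop variable;
      # .data[[...]] looks up an existing column by its name as a string.
      "ALL_{channel}" := .data[[paste0("Love_", channel)]] +
        .data[[paste0("Hate_", channel)]]
    )
}
```

This should give the same three ALL_ columns as the loop above, without unquoting symbols by hand.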

Upvotes: 2
