How to loop data in R?

Question

Here is a piece of my data:

data_x <- tribble(
  ~price, ~bokey,    ~id,     ~cost,    ~revenue,
     1,     "a",      10,      0.20,       30,      
     2,     "b",      20,      0.30,       60,  
     3,     "c",      20,      0.30,       40,
     4,     "d",      10,      0.20,      100, 
     5,     "e",      30,      0.10,       40,
     6,     "f",      10,      0.20,       10,
     1,     "g",      20,      0.30,       80,
     2 ,    "h",      10,      0.20,       20,
     3,     "h",      30,      0.10,       20,
     3,     "i",      20,      0.30,       40,
)

As you see, there are three different type of IDs: 10, 20, 30. But in the real data, there are almost 100 ids. I want to aggregate the data based on these ids. Because I don't know how to do it in loop, I basically created some subsets:

data_10 <- data_x %>% filter(id == 10)
data_20 <- data_x %>% filter(id == 20)
data_30 <- data_x %>% filter(id == 30)

Here is the aggregated data:

data_agg <- data_10 %>% 
  group_by(priceseg = cut(as.numeric(price), c(0, 1, 3, 5, 6))) %>% 
  summarise(price_n = n_distinct(bokey),
            Cost = sum(cost, na.rm =  T),
            Revenue  =  sum(revenue, na.rm = T),
            clicks = n_distinct(bokey)) %>% 
  mutate(price_n2 = round(100 * prop.table(price_n), 2),
         (zet = Cost/Revenue))

But I want to have one more column that shows the id. Here is the desired data:

data_desired <- tribble(
  ~id, ~priceseg, ~price_n, ~Cost,  ~Revenue,  ~clicks,  ~price_n2, ~`(zet = Cost/Revenue)`
   10,   (0,1]        1      0.2        30         1        25             0.00667
   10,   (1,3]        1      0.2        20         1        25                0.01       
   10,   (3,5]        1      0.2       100         1        25               0.002     
   10,   (5,6]        1      0.2        10         1        25                0.02       
   20,
   20,
   .
   .
)  30,

How can I get it?

David Jorquera · Accepted Answer

Since you are already using dplyr, just add id as one of the grouping variables (no need to previously separate your data):

data_agg <- data_x %>% 
  group_by(id, priceseg = cut(as.numeric(price), c(0, 1, 3, 5, 6))) %>% 
  summarise(price_n = n_distinct(bokey),
        Cost = sum(cost, na.rm =  T),
        Revenue  =  sum(revenue, na.rm = T),
        clicks = n_distinct(bokey)) %>% 
  mutate(price_n2 = round(100 * prop.table(price_n), 2),
     (zet = Cost/Revenue))



# A tibble: 8 x 8
# Groups:   id [3]
#      id priceseg price_n  Cost Revenue clicks price_n2 `(zet = Cost/Revenue)`
#                                      
# 1    10 (0,1]          1 0.2        30      1       25                0.00667
# 2    10 (1,3]          1 0.2        20      1       25                0.01   
# 3    10 (3,5]          1 0.2       100      1       25                0.002  
# 4    10 (5,6]          1 0.2        10      1       25                0.02   
# 5    20 (0,1]          1 0.3        80      1       25                0.00375
# 6    20 (1,3]          3 0.900     140      3       75                0.00643
# 7    30 (1,3]          1 0.1        20      1       50                0.005  
# 8    30 (3,5]          1 0.1        40      1       50                0.0025

How to loop data in R?

Answers (2)

Related Questions