Petr

Reputation: 1827

Aggregate data frames in R based on string filtering

I would like to aggregate a data frame based on partial string matching.

My example is as follows:

     RMSE  MAPE term                                      count
   <dbl> <dbl> <chr>                                     <dbl>
 1  20.3 0.146 (Intercept)                                   1
 2  20.3 0.146 as.factor(Gear)420599                         1
 3  20.3 0.146 as.factor(Gear)433453                         1

and the result should look like this:

     RMSE  MAPE term                                      count
   <dbl> <dbl> <chr>                                     <dbl>
 1  20.3 0.146 (Intercept)                                   1
 2  20.3 0.146 as.factor(Gear)                               2

Upvotes: 0

Views: 22

Answers (1)

dpmcsuss

Reputation: 375

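This works for your example. \D is the regex for "not a digit", so the pattern keeps everything before the first digit in each term. I'm not sure it will work more generally for you, for instance if a variable name itself contains a digit.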

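library(dplyr)    # mutate(), group_by(), summarize(), %>%
library(stringr)  # str_extract()
library(tibble)   # tribble()

df <- tribble(
  ~rmse, ~mape, ~term, ~count,
  20.3, 0.146, "(Intercept)", 1,
  20.3, 0.146, "as.factor(Gear)420599", 1,
  20.3, 0.146, "as.factor(Gear)433453", 1)

df %>%
  # keep the leading run of non-digit characters, dropping the numeric level suffix
  mutate(term = str_extract(term, "\\D*")) %>%
  group_by(rmse, mape, term) %>%
  # add up the counts within each collapsed term
  summarize(count = sum(count))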

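Using str_c(str_extract(term, "[^\\)]*"), ")") in the mutate() call might work better: it keeps everything up to the first closing parenthesis and then adds the parenthesis back.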
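
To see where the two patterns differ, here is a small sketch that runs both on a few terms. The last two terms ("as.factor(Gear2)433453" and "Speed") are hypothetical and not from the question; I added them only to show the edge cases. The third variant, stripping trailing digits with str_remove(), is my own suggestion and assumes the factor levels are always a run of digits at the end of the term.

```r
library(stringr)

# The first two terms come from the question; the last two are hypothetical edge cases.
terms <- c("(Intercept)", "as.factor(Gear)420599",
           "as.factor(Gear2)433453", "Speed")

# Pattern 1: keep the leading run of non-digit characters.
# This truncates any term whose variable name contains a digit.
by_non_digit <- str_extract(terms, "\\D*")

# Pattern 2: keep everything up to the first ")" and put the ")" back.
# This appends a stray ")" to terms that have no parenthesis at all.
by_paren <- str_c(str_extract(terms, "[^\\)]*"), ")")

# Assumed alternative: drop only a trailing run of digits.
# This would also shorten a plain variable that is literally named e.g. "Gear2".
by_trailing <- str_remove(terms, "\\d+$")

print(by_non_digit)
# "(Intercept)" "as.factor(Gear)" "as.factor(Gear" "Speed"
print(by_paren)
# "(Intercept)" "as.factor(Gear)" "as.factor(Gear2)" "Speed)"
print(by_trailing)
# "(Intercept)" "as.factor(Gear)" "as.factor(Gear2)" "Speed"
```

All three give the same result on the data in the question, so which one to use depends on what the other terms in your model look like.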

Upvotes: 1
