Reputation: 1827
I would like to aggregate a data frame based on partial string matching.
My example is as follows:
   RMSE  MAPE term                  count
  <dbl> <dbl> <chr>                 <dbl>
1  20.3 0.146 (Intercept)               1
2  20.3 0.146 as.factor(Gear)420599     1
3  20.3 0.146 as.factor(Gear)433453     1
and the result should look like this:
   RMSE  MAPE term            count
  <dbl> <dbl> <chr>           <dbl>
1  20.3 0.146 (Intercept)         1
2  20.3 0.146 as.factor(Gear)     2
Upvotes: 0
Views: 22
Reputation: 375
This works for your example. \D is the regex for "not a digit", so the match stops at the first digit in term. I'm not sure it will work more generally for you, for instance if a variable name itself contains a digit.
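library(tidyverse)  # provides tribble(), the dplyr verbs, stringr and %>%

# Rebuild the example data from the question
df <- tribble(
  ~rmse, ~mape, ~term,                   ~count,
  20.3,  0.146, "(Intercept)",           1,
  20.3,  0.146, "as.factor(Gear)420599", 1,
  20.3,  0.146, "as.factor(Gear)433453", 1)

# Keep the leading run of non-digit characters in term, then sum count per group
df %>%
  mutate(term = str_extract(term, "(\\D)*")) %>%
  group_by(rmse, mape, term) %>%
  summarize(count = sum(count), .groups = "drop")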
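Replacing the str_extract() call inside mutate() with
str_c(str_extract(term, "[^\\)]*"), ")")
might work better: it keeps everything up to the first closing parenthesis and then adds the parenthesis back, so digits in the variable name do not matter. It does assume that every term contains a closing parenthesis.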
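To make the difference between the two patterns concrete, here is a minimal sketch that applies both to the terms from the question plus one extra term. The extra term, "as.factor(Gear2)420599", is hypothetical and not from the original data; it is there only to show what happens when the variable name contains a digit. The names terms, by_non_digit and by_paren are likewise made up for illustration.

```r
library(stringr)

terms <- c("(Intercept)",
           "as.factor(Gear)420599",
           "as.factor(Gear)433453",
           "as.factor(Gear2)420599")  # hypothetical term with a digit in the name

# Pattern 1: leading run of non-digit characters; stops at the first digit
by_non_digit <- str_extract(terms, "\\D*")

# Pattern 2: everything before the first ")", with the ")" pasted back on
by_paren <- str_c(str_extract(terms, "[^\\)]*"), ")")

print(by_non_digit)
# "(Intercept)" "as.factor(Gear)" "as.factor(Gear)" "as.factor(Gear"

print(by_paren)
# "(Intercept)" "as.factor(Gear)" "as.factor(Gear)" "as.factor(Gear2)"
```

Both patterns agree on the three original terms. Only the second keeps the hypothetical Gear2 term intact, which is why it may be the safer choice if variable names can contain digits.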
Upvotes: 1