Reputation: 271
I'm trying to combine rows in a dataframe called all_pdat whose head looks like this:
diff_score abserror
190 58 16.17851166
1140 58 12.55945835
2152 58 93.52071253
370 57 11.08828322
1142 57 0.07710115
230 56 90.87347961
What I would like to do is combine rows with the same diff_score value such that the abserror column reflects mean values of the combined rows. So the new df (pdat) head would look like this:
diff_score avg_error
190 58 40.7528941
370 57 5.58269218
230 56 90.87347961
I have tried the following, but it just gives me a df with a single row:
pdat <- all_pdat %>%
group_by(diff_score) %>%
summarise(avg_error = mean(abserror))
Thanks in advance.
Upvotes: 1
Views: 501
Reputation: 887048
We can also use data.table
library(data.table)
setDT(df)[, .(absmeanerror = mean(abserror)), by = diff_score]
Upvotes: 0
Reputation: 321
One way to do it is with the by function, like this:
temp <- by(all_pdat[, 'abserror'], all_pdat[, 'diff_score'], mean)
pdat <- data.frame('diff_score' = names(temp), 'abserror' = c(temp))
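Group labels recovered with names() come back as character strings, so diff_score would no longer be numeric. Below is a minimal self-contained sketch of the same idea that swaps in tapply (a base R function, not the by call above) and converts the labels back to integers. The sample data is the head shown in the question, and the column name avg_error follows the question's desired output.

```r
# Sample data: the six rows shown in the question
all_pdat <- data.frame(
  diff_score = c(58L, 58L, 58L, 57L, 57L, 56L),
  abserror = c(16.17851166, 12.55945835, 93.52071253,
               11.08828322, 0.07710115, 90.87347961)
)

# Mean of abserror within each diff_score group (a named 1-d array)
temp <- tapply(all_pdat$abserror, all_pdat$diff_score, mean)

# Rebuild a data frame, turning the character labels back into integers
pdat <- data.frame(
  diff_score = as.integer(names(temp)),
  avg_error = as.vector(temp)
)

print(pdat)
```

The groups come out sorted by diff_score in ascending order, which differs from the descending order in the question's example.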
Upvotes: 0
Reputation: 39595
I would suggest a base R solution:
#Data
df <- structure(list(diff_score = c(58L, 58L, 58L, 57L, 57L, 56L),
abserror = c(16.17851166, 12.55945835, 93.52071253, 11.08828322,
0.07710115, 90.87347961)), class = "data.frame", row.names = c(NA,
-6L))
The code:
dfn <- aggregate(abserror ~ diff_score, data = df, FUN = mean, na.rm = TRUE)
Output:
diff_score abserror
1 56 90.873480
2 57 5.582692
3 58 40.752894
And the dplyr approach:
library(dplyr)
df %>% group_by(diff_score) %>% summarise(mean_abserror=mean(abserror))
Output:
# A tibble: 3 x 2
diff_score mean_abserror
<int> <dbl>
1 56 90.9
2 57 5.58
3 58 40.8
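Your code is essentially the same as this, so the single-row result is probably due to a conflict with another package. A common culprit is plyr loaded after dplyr: its summarise() masks dplyr's version and ignores the grouping. Calling dplyr::summarise() explicitly avoids that.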
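As a rough sketch of how to rule out a masking problem, the snippet below checks which package the bare name summarise resolves to and then calls the dplyr functions with an explicit namespace prefix. It assumes dplyr is installed; the sample data is the six rows from the question, and avg_error is the column name the question asked for.

```r
library(dplyr)

# Sample data: the six rows shown in the question
all_pdat <- data.frame(
  diff_score = c(58L, 58L, 58L, 57L, 57L, 56L),
  abserror = c(16.17851166, 12.55945835, 93.52071253,
               11.08828322, 0.07710115, 90.87347961)
)

# Which package does the bare name `summarise` currently come from?
# If this prints something other than "dplyr", another package masks it.
print(environmentName(environment(summarise)))

# The namespace prefix forces the dplyr versions regardless of load order
pdat <- all_pdat %>%
  dplyr::group_by(diff_score) %>%
  dplyr::summarise(avg_error = mean(abserror))

print(pdat)
```

If the prefixed version returns three rows while the unprefixed one returns a single row, the load order of the packages is the likely cause.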
Upvotes: 0