Reputation: 9018
I have a dataframe:
dat<- data.frame(date = c("2015-01-01","2015-01-01","2015-01-01", "2015-01-01","2015-02-02","2015-02-02","2015-02-02","2015-02-02","2015-02-02"), val= c(10,20,30,50,300,100,200,200,400), type= c("A","A","B","C","A","A","B","C","C") )
dat
date val type
1 2015-01-01 10 A
2 2015-01-01 20 A
3 2015-01-01 30 B
4 2015-01-01 50 C
5 2015-02-02 300 A
6 2015-02-02 100 A
7 2015-02-02 200 B
8 2015-02-02 200 C
9 2015-02-02 400 C
and I would like to have one row for each day with averages by type so the output would be:
Date A B C
2015-01-01 15 30 50
2015-02-02 200 200 300
additionally how would I get the counts so the results are:
Date A B C
2015-01-01 2 1 1
2015-02-02 2 1 2
Upvotes: 5
Views: 10611
Reputation: 2881
I'll add the pivot_wider
solution, which is meant to replace earlier tidyverse
options, and which is
Using pivot_wider
with the values_fn
option, we can do the following:
library(tidyr) # At least 1.0.0
dat %>% pivot_wider(names_from = type, values_from = val, values_fn = list(val = mean))
#> # A tibble: 2 x 4
#> date A B C
#> <fct> <dbl> <dbl> <dbl>
#> 1 2015-01-01 15 30 50
#> 2 2015-02-02 200 200 300
and
dat %>% pivot_wider(names_from = type, values_from = val, values_fn = list(val = length))
#> # A tibble: 2 x 4
#> date A B C
#> <fct> <int> <int> <int>
#> 1 2015-01-01 2 1 1
#> 2 2015-02-02 2 1 2
Of course, if we want to get fancy, we can do both at once:
library(purrr)
library(rlang)
map(quos(mean, length),
~pivot_wider(dat, names_from = type, values_from = val, values_fn = list(val = eval_tidy(.))))
#> [[1]]
#> # A tibble: 2 x 4
#> date A B C
#> <fct> <dbl> <dbl> <dbl>
#> 1 2015-01-01 15 30 50
#> 2 2015-02-02 200 200 300
#>
#> [[2]]
#> # A tibble: 2 x 4
#> date A B C
#> <fct> <int> <int> <int>
#> 1 2015-01-01 2 1 1
#> 2 2015-02-02 2 1 2
Created on 2019-12-04 by the reprex package (v0.3.0)
Note that if you're concerned about speed, it may be worth updating to the dev version of tidyr.
Upvotes: 1
Reputation: 118779
Using data.table v1.9.5
(current devel), we can do:
require(data.table) ## v1.9.5+
dcast(setDT(dat), date ~ type, fun = list(mean, length), value.var="val")
# date A_mean_val B_mean_val C_mean_val A_length_val B_length_val C_length_val
# 1: 2015-01-01 15 30 50 2 1 1
# 2: 2015-02-02 200 200 300 2 1 2
Installation instructions here.
Upvotes: 2
Reputation: 145755
library(reshape2)
dcast(data = dat, formula = date ~ type, fun.aggregate = mean, value.var = "val")
# date A B C
# 1 2015-01-01 15 30 50
# 2 2015-02-02 200 200 300
With dcast
, the LHS of the formula defines rows, the RHS defines columns, the value.var
is the name of the column that becomes values, and the fun.aggregate
is how those values are computed. The default fun.aggregate
is length
, i.e., the number of values. You asked for the average, so we use mean
. You could also do min
, max
, sd
, IQR
, or any function that takes a vector and returns a single value.
Upvotes: 9
Reputation: 886938
You may also use table
for the updated question
table(dat[c(1,3)])
# type
#date A B C
#2015-01-01 2 1 1
#2015-02-02 2 1 2
For the first question, I think @Gregor's solution is the best (so far), a possible option with dplyr/tidyr
would be
library(dplyr)
library(tidyr)
dat %>%
group_by(date,type) %>%
summarise(val=mean(val)) %>%
spread(type, val)
Or a base R
option would be (nchar=50
and the dcast(..
nchar=44
. So not so bad :-))
with(dat, tapply(val, list(date, type), FUN=mean))
# A B C
#2015-01-01 15 30 50
#2015-02-02 200 200 300
Upvotes: 8
Reputation: 5586
Personally I would go with Gregor's solution using reshape2
. But for the sake of completeness I'll include a base R solution.
agg <- with(dat, aggregate(val, by = list(date = date, type = type), FUN = mean))
out <- reshape(agg, timevar = "type", idvar = "date", direction = "wide")
out
# date x.A x.B x.C
# 1 2015-01-01 15 30 50
# 2 2015-02-02 200 200 300
If you want to get rid of the x.
on the column names, you can remove it with gsub
.
colnames(out) <- gsub("^x\\.", "", colnames(out))
To get the counts of rows, replace FUN = mean
with FUN = length
in the call to aggregate
.
Upvotes: 4