Reputation: 69
My dataframe is as follows. (similar, there are actually many more rows and columns)
Gender Energetic Weekly_Apple Weekly_Banana
1 Female 3 No Yes
2 Female 3 No Yes
3 Male 5 No Yes
4 Male 2 No No
5 Female 1 No No
I want brief code that outputs the following, based on aggregating the "Yes" responses :
Male Female
Apples 0 0
Bananas 1 2
The number of apples that each gender eats =0. 1 male & 2 females eats apple.
I have tried the following:
count(original_data, c("Gender","Weekly_Apple"))
count(original_data, c("Gender","Weekly_Banana"))
count(original_data, c("Gender","Weekly_Grape"))
count(original_data, c("Gender","Weekly_PineApple"))
aggregate(x = original_data[c("Weekly_Apple",
"Weekly_Banana",
"Weekly_Grape")],
by = original_data[c("Gender")],
FUN = n())
Upvotes: 0
Views: 405
Reputation: 887711
Another base R
version with tapply
t(sapply(names(df)[3:4], function(nm) with(df, tapply(df[[nm]]=="Yes", Gender,sum))))
# Female Male
#Weekly_Apple 0 0
#Weekly_Banana 2 1
Or with split
sapply(split(df[3:4], df$Gender), function(x) colSums(x == "Yes"))
or a variation of it
sapply(split(as.data.frame(df[3:4] == "Yes"), df$Gender), colSums)
Upvotes: 0
Reputation: 6769
As suggested by NelsonGon, I have replaced df1 <- t(df1)
with a tidyr::crossing(df1)
.
library(dplyr)
df<-data.frame(
Gender=c("Female", "Female", "Male", "Male", "Female"),
Energetic =c(3,3,5,2,1),
Weekly_Apple = c("No", "No", "No", "No", "No"),
Weekly_Banana = c("Yes", "Yes", "Yes", "No", "No"))
df1 <- df %>%
group_by(Gender) %>%
summarise(
Apples = sum(Weekly_Apple=="Yes"),
Bananas = sum(Weekly_Banana =="Yes")
)
df1 <- tidyr::crossing(df1)
Upvotes: 2
Reputation: 13319
A dplyr-tidyr
alternative:
df %>%
group_by(Gender) %>%
summarise_at(vars(contains("Weekly")), function(x) sum(x=="Yes")) %>%
tidyr::gather(key, val , -Gender) %>%
tidyr::spread(Gender, val)
# A tibble: 2 x 3
key Female Male
<chr> <int> <int>
1 Weekly_Apple 0 0
2 Weekly_Banana 2 1
Data:
df <- structure(list(Gender = structure(c(1L, 1L, 2L, 2L, 1L), .Label = c("Female",
"Male"), class = "factor"), Energetic = c(3, 3, 5, 2, 1), Weekly_Apple = structure(c(1L,
1L, 1L, 1L, 1L), .Label = "No", class = "factor"), Weekly_Banana = structure(c(2L,
2L, 2L, 1L, 1L), .Label = c("No", "Yes"), class = "factor")), class = "data.frame", row.names = c(NA,
-5L))
Upvotes: 1
Reputation: 79328
you could use base R:
table(reshape(cbind(df,id=1:nrow(df)),3:4,idvar = "id",dir="long",sep="_")[-(2:3)])[,,'Yes']
time
Gender Apple Banana
Female 0 2
Male 0 1
or even
xtabs(Weekly~time+Gender,transform(reshape(cbind(df,id=1:nrow(df)),3:4,idvar = "id",dir="long",sep="_"),Weekly=Weekly=="Yes"))
Gender
time Female Male
Apple 0 0
Banana 2 1
Upvotes: 1
Reputation: 40171
One data.table
possibility could be:
dcast(variable ~ Gender,
value.var = "value",
fun = function(x) sum(x == "Yes"),
data = melt(df[-2], id.vars = "Gender"))
variable Female Male
1 Weekly_Apple 0 0
2 Weekly_Banana 2 1
Upvotes: 1