Reputation: 99
I know there are already some threads on this topic, but I haven't found one that covers this specific problem. The dependent variable in my dataset is Y, and I have 144 independent variables (X). Y and every X can only take the values 1 or 0. The data looks like this:
Y A469 T593 K022K A835 Z935 U83F W5326 ...
Person1 1 1 1 1 0 0 0 0
Person2 1 0 1 0 1 1 0 0
Person3 0 0 0 1 0 0 1 1
...
summary(dataset)
just provides descriptive statistics over all observations. What I want is (in pseudo-code):
summary(all variables if Y == 1 and Y == 0)
It would be great if I could see how often each X occurs for each value of Y. For example, mean(X4) = 0.04 and count = 6 if Y = 1.
Upvotes: 0
Views: 419
Reputation: 11981
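EDIT 2: following Akrun's and Gregor's comments, here is the solution:
library(dplyr)
data_summary <- dataset %>%
  group_by(Y) %>%        # the grouping column is Y (R is case-sensitive)
  mutate(n = n()) %>%    # group size; averaging it below returns the size itself
  summarise_all(mean)    # mean of a 0/1 column = share of 1s within each Y group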
If you want to see more columns than fit on your screen, you can try, e.g.,
print(data_summary, width = Inf)  # print every column instead of truncating
View(data_summary)                # open the result in the data viewer
select(data_summary, <<particular columns you want to see>>)
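The summary above gives the mean of each column per group, while the question also asks for a count. Here is a minimal sketch of one way to get both, assuming dplyr 1.0 or later (for `across()`) and assuming the person labels are row names rather than a column. The toy `dataset` and its values are made up for illustration; only the column names are borrowed from the question.

```r
library(dplyr)

# Toy data in the same layout as the question (values are invented)
dataset <- data.frame(
  Y    = c(1, 1, 0, 0),
  A469 = c(1, 0, 0, 1),
  T593 = c(1, 1, 0, 0)
)

data_summary <- dataset %>%
  group_by(Y) %>%
  summarise(
    # For 0/1 columns, mean() is the share of 1s and sum() is the number of 1s
    across(everything(), list(mean = mean, count = sum)),
    n = n(),           # number of rows in each Y group
    .groups = "drop"
  )

print(data_summary, width = Inf)
```

Each X then appears as two columns, `<name>_mean` and `<name>_count`, with one row per value of Y. `n = n()` is placed after `across()` so that it is not picked up by `everything()`.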
Upvotes: 2