Reputation: 75
I am relatively new to R. I am trying to count how many times each value occurs in each variable across my whole data frame, and to summarise those counts in a new data frame. For example, my data looks like this:
cluster <- data.frame(sex = c(1,1,1,1,0),
mut = c(0,0,0,0,0),
ht = c(1,1,0,1,0),
wt = c(0,1,1,0,1),
group = c(1,0,0,0,0))
cluster
sex mut ht wt group
1 0 1 0 1
1 0 1 1 0
1 0 0 1 0
1 0 1 0 0
0 0 0 1 0
I want to count how many 1's and 0's each variable has, across the whole data frame. My desired output is:
Zeroes Ones
sex 1 4
mut 5 0
ht 2 3
wt 2 3
group 4 1
I know how to do this for each variable individually through a variety of means, for example:
>table(cluster$sex)
0 1
1 4
but I have 32 variables in each of 6 data frames, so a quicker way to summarise this would be very helpful. I am thinking of some sort of looping function, although I am not very knowledgeable about those. Any help would be greatly appreciated!
Upvotes: 1
Views: 87
Reputation: 12155
You can apply a function by column using apply:
df <- apply(cluster, 2, function(x) c('one' = sum(x == 1), 'zero' = sum(x == 0)))
df <- data.frame(t(df)) # Rotate it so categories are rows
df
one zero
sex 4 1
mut 0 5
ht 3 2
wt 3 2
group 1 4
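The question mentions 32 variables in each of 6 data frames. One possible way to extend the column-wise idea to that case is sketched below, assuming every column holds only 0s and 1s with no missing values. The names count_binary, cluster2 and all_frames are made up for illustration and do not come from the original posts.

```r
# Example data from the question
cluster <- data.frame(sex = c(1,1,1,1,0),
                      mut = c(0,0,0,0,0),
                      ht = c(1,1,0,1,0),
                      wt = c(0,1,1,0,1),
                      group = c(1,0,0,0,0))

# Hypothetical helper: count 0s and 1s per column, one row per variable.
# Assumes the columns contain only 0/1 and no NA values.
count_binary <- function(d) {
  data.frame(Zeroes = colSums(d == 0),
             Ones   = colSums(d == 1))
}

count_binary(cluster)

# Hypothetical second data frame, only here to show the list workflow
cluster2 <- cluster[1:3, ]
all_frames <- list(first = cluster, second = cluster2)

# lapply returns one summary data frame per input data frame
summaries <- lapply(all_frames, count_binary)
summaries$second
```

For the example data, count_binary(cluster) should match the layout of the desired output in the question, with the Zeroes column first. Keeping the data frames in a named list means the same call covers all of them without writing an explicit loop.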
Upvotes: 2
Reputation: 323226
You can use stack with table (PS: to convert the result to a data.frame, use as.data.frame.matrix):
with(stack(cluster), table(ind, values)) # cluster is the data frame from the question
0 1
group 4 1
ht 2 3
mut 5 0
sex 1 4
wt 2 3
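The PS above mentions as.data.frame.matrix without showing it. A minimal sketch of that conversion follows, assuming the input is the cluster data frame from the question. The names counts and counts_df are illustrative only, and renaming the columns assumes both 0 and 1 occur somewhere in the data.

```r
# Example data from the question
cluster <- data.frame(sex = c(1,1,1,1,0),
                      mut = c(0,0,0,0,0),
                      ht = c(1,1,0,1,0),
                      wt = c(0,1,1,0,1),
                      group = c(1,0,0,0,0))

# Two-way table: one row per variable, one column per value
counts <- with(stack(cluster), table(ind, values))

# Keep the wide layout; plain as.data.frame() would give long format instead
counts_df <- as.data.frame.matrix(counts)

# Columns come out named "0" and "1"; rename to match the desired output
names(counts_df) <- c("Zeroes", "Ones")
counts_df
```

The row order of the result follows the factor levels that stack assigns to ind, which may differ between R versions, so it is safer to look rows up by name than by position.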
Upvotes: 0