Pushpraj Verma
Pushpraj Verma

Reputation: 33

R count columns values

Help with R: I need to group by a column and count occurrences of values in a set of columns.

Here is my data frame

ID  Ob1 Ob2 Ob3 Ob4
3792    0   0   0   1
3792    0   0   -1  0
3792    1   -2  -1  0
3792    2   -2  -1  0
8060    -1  0   -2  2
8060    -1  0   -3  0
8060    0   0   0   0
13098   0   0   0   0
13098   -1  0   -1  0
13098   0   0   0   0

I want to groupby ID and count the values of numbers in Ob1, Ob2, Ob3, Ob4. so my result should look like below;

Group   -3  -2  -1  0   1   2
3792        2   3   8   2   1
8060    1   1   2   7       1
13098           2   10      

I tried

table(unlist(df))

but I loose the groupby info. I get below

-3    -2    -1     0     1     2    46  3792  8060 13098 
 1     3     7    25     2     2    10     4     3     3

Upvotes: 2

Views: 1764

Answers (5)

d.b
d.b

Reputation: 32538

with(reshape2::melt(df1, id.vars = "ID"), table(ID, value))
#       value
#ID      -3 -2 -1  0  1  2
#  3792   0  2  3  8  2  1
#  8060   1  1  2  7  0  1
#  13098  0  0  2 10  0  0

Upvotes: 1

akrun
akrun

Reputation: 886938

Using table from base R

table(data.frame(col1 = df1$ID, col2 = unlist(df1[-1])))
#   col2
#col1    -3 -2 -1  0  1  2
#  3792   0  2  3  8  2  1
#  8060   1  1  2  7  0  1
#  13098  0  0  2 10  0  0

There is also no need to create a data.frame. The above is done so that the lengths are same (as there is recycling in data.frame) and table needs arguments to be of same length. So, a variant of the above should also work well

table(rep(df1$ID, ncol(df1)-1), unlist(df1[-1]))

If we are unlisting the whole dataset, it is just a single vector, instead we need two vectors, one for the first column 'ID' and the second for all other columns except the 'ID'

data

df1 <- structure(list(ID = c(3792L, 3792L, 3792L, 3792L, 8060L, 8060L, 
8060L, 13098L, 13098L, 13098L), Ob1 = c(0L, 0L, 1L, 2L, -1L, 
-1L, 0L, 0L, -1L, 0L), Ob2 = c(0L, 0L, -2L, -2L, 0L, 0L, 0L, 
0L, 0L, 0L), Ob3 = c(0L, -1L, -1L, -1L, -2L, -3L, 0L, 0L, -1L, 
0L), Ob4 = c(1L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L)),
   class = "data.frame", row.names = c(NA, 
-10L))

Upvotes: 1

CrunchyTopping
CrunchyTopping

Reputation: 814

This just requires dplyr

df<-read.table(text="
  ID  Ob1 Ob2 Ob3 Ob4
  3792    0   0   0   1
  3792    0   0   -1  0
  3792    1   -2  -1  0
  3792    2   -2  -1  0
  8060    -1  0   -2  2
  8060    -1  0   -3  0
  8060    0   0   0   0
  13098   0   0   0   0
  13098   -1  0   -1  0
  13098   0   0   0   0
",header=T)
  df
  library(dplyr)
melt(df,id.vars = "ID")%>%
  cast(.,ID~value)

Result:

Aggregation requires fun.aggregate: length used as default
     ID -3 -2 -1  0 1 2
1  3792  0  2  3  8 2 1
2  8060  1  1  2  7 0 1
3 13098  0  0  2 10 0 0

Upvotes: 0

stevec
stevec

Reputation: 52208

Try using dplyr. Assume the datafame is called df

library(dplyr)

df %>% 
  group_by(ID) %>%
  summarise(Obs1 = sum(Obs1), Obs2 = sum(Obs2), Obs3 = sum(Obs3), Obs4 = sum(Obs4))

Upvotes: 0

tmfmnk
tmfmnk

Reputation: 39858

One dplyr and tidyr possibility could be:

df %>%
 gather(var, val, -ID) %>%
 select(-var) %>%
 count(ID, val) %>%
 spread(val, n, fill = 0)

     ID  `-3`  `-2`  `-1`   `0`   `1`   `2`
  <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  3792     0     2     3     8     2     1
2  8060     1     1     2     7     0     1
3 13098     0     0     2    10     0     0

Upvotes: 1

Related Questions