ZAWD

Reputation: 661

How to calculate the number of groups using R?

This may be a very easy question. I have a keyed data.table with more than 1000 rows, and two of its columns can be set as the key. I want to count the number of groups defined by those key columns.

For example, here is some simple data (ID and Act are the key columns):

ID  ValueDate Act Volume
1 2015-01-01 EUR     21
1 2015-02-01 EUR     22
1 2015-01-01 MAD     12
1 2015-02-01 MAD     11
2 2015-01-01 EUR      5
2 2015-02-01 EUR      7
3 2015-01-01 EUR      4
3 2015-02-01 EUR      2
3 2015-03-01 EUR      6

Here is code to generate the test data:

library(data.table)
dd <- data.table(ID = c(1,1,1,1,2,2,3,3,3), 
                 ValueDate = c("2015-01-01", "2015-02-01", "2015-01-01","2015-02-01", "2015-01-01","2015-02-01","2015-01-01","2015-02-01","2015-03-01"),
                 Act = c("EUR","EUR","MAD","MAD","EUR","EUR","EUR","EUR","EUR"),
                 Volume=c(21,22,12,11,5,7,4,2,6))

In this case, we can see that there are a total of 4 groups (distinct ID/Act combinations).

I first tried setting the key for this table:

setkey(dd, ID, Act)

Then I thought the count function might work for counting the groups. Is count the right function to use, or is there a simpler method?

Thanks a lot !

Upvotes: 3

Views: 3361

Answers (2)

jangorecki

Reputation: 16727

The fastest way should be uniqueN.

library(data.table)
dd <- data.table(ID = c(1,1,1,1,2,2,3,3,3), 
                 ValueDate = c("2015-01-01", "2015-02-01", "2015-01-01","2015-02-01", "2015-01-01","2015-02-01","2015-01-01","2015-02-01","2015-03-01"),
                 Act = c("EUR","EUR","MAD","MAD","EUR","EUR","EUR","EUR","EUR"),
                 Volume=c(21,22,12,11,5,7,4,2,6))
uniqueN(dd, by = c("ID", "Act"))
#[1] 4
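
Since the question already sets a key with setkey(dd, ID, Act), the grouping columns could also be read from the key rather than typed out again. The following is a minimal sketch under that assumption, using a trimmed copy of the test data; the names n_groups and n_groups_check are only illustrative.

```r
library(data.table)

# Trimmed copy of the question's test data (ValueDate omitted for brevity)
dd <- data.table(ID = c(1,1,1,1,2,2,3,3,3),
                 Act = c("EUR","EUR","MAD","MAD","EUR","EUR","EUR","EUR","EUR"),
                 Volume = c(21,22,12,11,5,7,4,2,6))
setkey(dd, ID, Act)

# key(dd) returns the key columns, here c("ID", "Act")
n_groups <- uniqueN(dd, by = key(dd))

# Cross-check: keep one row per key combination and count the rows
n_groups_check <- nrow(unique(dd, by = key(dd)))
```

Both values should equal 4 for this data, matching the count given in the question.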

Upvotes: 3

alexwhitworth

Reputation: 4907

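# number of groups (.N is already the row count per group, so sum() is not needed)
nrow(dd[, .(cnt = .N), by = c("ID", "Act")])

# or using base R: count the non-empty ID/Act combinations
{tab <- table(interaction(dd$ID, dd$Act)); sum(tab > 0)}

# or for the counts per group:
dd[, .(cnt = .N), by = c("ID", "Act")]
#    ID Act cnt
# 1:  1 EUR   2
# 2:  1 MAD   2
# 3:  2 EUR   2
# 4:  3 EUR   3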
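
If a group identifier per row is also useful, data.table's .GRP symbol offers another route. This is only a sketch and not part of the answer above: the column name grp and the variable n_groups are hypothetical, and the data is a trimmed copy of the question's example.

```r
library(data.table)

# Trimmed copy of the question's test data (ValueDate omitted for brevity)
dd <- data.table(ID = c(1,1,1,1,2,2,3,3,3),
                 Act = c("EUR","EUR","MAD","MAD","EUR","EUR","EUR","EUR","EUR"),
                 Volume = c(21,22,12,11,5,7,4,2,6))

# .GRP is a running group counter (1, 2, 3, ...) within a by= operation;
# := adds the hypothetical column grp to dd by reference
dd[, grp := .GRP, by = .(ID, Act)]

# The number of groups is the number of distinct group ids
n_groups <- uniqueN(dd$grp)
```

Here n_groups should be 4, and dd keeps a grp column that labels each row with its group.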

Upvotes: 3
