skan

Reputation: 7720

Different ways to get summaries with data.table R

library(data.table)
temp <- data.table(fir=c("A", "B", "B", "C", "A", "D"), sec=c(1,1,1,1,2,2))

 fir sec
  A   1
  B   1
  B   1
  C   1
  A   2
  D   2

Suppose I want a summary by the "sec" column, for example just counting the number of occurrences. I can try...

method a)

 temp[,.N, by=sec]


    sec N
 1:   1 4
 2:   2 2

We get as many rows as there are distinct values in "sec".

method b)

 temp[,Num:=.N, by=sec]

Same summary but keeping all the columns and the same number of rows.

 fir sec Num
  A   1   4
  B   1   4
  B   1   4
  C   1   4
  A   2   2
  D   2   2

But...
How can I get a result like method a) but specify the name of the new column, without having to rename it explicitly afterwards?
I've tried Num=.N without the := but it doesn't work.

How can I get a result like method b) without explicitly writing the name of the new column and without modifying the original data.table (like ave())? I mean running something like this

 temp[,.N, by=sec]

but getting

 fir sec  N
  A   1   4
  B   1   4
  B   1   4
  C   1   4
  A   2   2
  D   2   2

Upvotes: 1

Views: 144

Answers (1)

akrun

Reputation: 886948

We can use rep to repeat the group count once per row:

temp[,.(Num = rep(.N, .N)), by=sec]
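As a rough sketch of what this does, using the `temp` table from the question: wrapping the expression in `.()` is what lets the column be named without `:=`, and `rep(.N, .N)` stretches each group's count to one value per original row. The names `agg` and `rep_n` are only illustrative and are not part of the original answer.

```r
library(data.table)

# Same example data as in the question
temp <- data.table(fir = c("A", "B", "B", "C", "A", "D"),
                   sec = c(1, 1, 1, 1, 2, 2))

# Named aggregate: one row per group, column called Num instead of N
agg <- temp[, .(Num = .N), by = sec]

# Repeated count: one row per original row, but only sec and Num come back
rep_n <- temp[, .(Num = rep(.N, .N)), by = sec]

print(agg)
print(rep_n)
```

Neither call uses `:=`, so `temp` itself is left untouched. Note that `rep_n` does not contain the `fir` column.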

If we also need the other columns, one option is a join using on

temp[temp[, .(Num = .N), by=sec], on = .(sec)]
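A minimal sketch of the join, again on the question's `temp`, together with one possible alternative that is not in the answer above: applying `:=` to a `copy()` of the table, so the original stays unmodified. The names `counts`, `joined` and `with_copy` are made up for illustration.

```r
library(data.table)

# Same example data as in the question
temp <- data.table(fir = c("A", "B", "B", "C", "A", "D"),
                   sec = c(1, 1, 1, 1, 2, 2))

# Join approach: aggregate first, then join the counts back onto temp
counts <- temp[, .(Num = .N), by = sec]
joined <- temp[counts, on = .(sec)]

# Alternative (an assumption, not from the answer): add the column
# by reference on a copy, which leaves temp itself unchanged
with_copy <- copy(temp)[, Num := .N, by = sec]

print(joined)
print(with_copy)
```

Rows of the joined result follow the order of the groups in `counts`, which happens to match the original row order here because `temp` is already sorted by `sec`; the `copy()` variant keeps the original row order regardless.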

Upvotes: 2
