Using functions in j with by - basic question

Question

I need to understand why sometimes you need to use a "by", and sometimes you don't. I'm really new to both R and data.table, so it is probably something basic.

library(data.table)

a <- c("A", "B", "C")
b <- c("AA", "BBB", "CCC")
x1 <- c(2, 4, 8)
x2 <- c(2, 4, 1)
n1 <- c(9, 9, 9)
n2 <- c(10, 10, 10)

DT <- data.table(a, b, x1, x2, n1, n2)

test1 <- DT[, .(y = nchar(b))]
test2 <- DT[, .(pv1 = prop.test(c(x1, x2), c(n1, n2))$p.value)]
test3 <- DT[, .(pv1 = prop.test(c(x1, x2), c(n1, n2))$p.value), by = 'a']

test1 behaves as I expected: it returns a data.table with 3 observations and 1 variable.

test2 confused me. I get only 1 observation back.

test3 is how I got the answer I expected.

I don't understand why test2 did not operate row-wise like test1 did. When do you need to use a by= if you want to process every row in the table?

Thanks for your help,

David
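
To make the difference between the three calls concrete, here is a minimal sketch using the same data as above. It only inspects what j receives in each case; the column names len and k, and the result names len_all, pooled and per_row, are made up for illustration and are not part of the original post.

```r
library(data.table)

# Same data as in the question
DT <- data.table(
  a  = c("A", "B", "C"),
  b  = c("AA", "BBB", "CCC"),
  x1 = c(2, 4, 8), x2 = c(2, 4, 1),
  n1 = c(9, 9, 9), n2 = c(10, 10, 10)
)

# Without by, j is evaluated once and each column is the full vector.
# nchar() is vectorized, so it still returns one value per row.
len_all <- DT[, .(len = nchar(b))]

# c(x1, x2) pools all rows, so a single prop.test() call would
# receive 6 counts at once and return one p-value.
pooled <- DT[, .(k = length(c(x1, x2)))]

# With by = a (unique per row here), j is evaluated once per group,
# so c(x1, x2) has length 2 inside each group.
per_row <- DT[, .(k = length(c(x1, x2))), by = a]
```

This suggests that by= is needed whenever the function in j is not vectorized over rows. It works in test3 only because a happens to be unique for each row; if it were not, grouping on the row number, for example by = seq_len(nrow(DT)), would probably be the safer choice.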
