Reputation: 63
I have a data.table
DT
with two columns
V1 V2
1: 1 3
2: 2 4
3: 3 5
4: 2 2
5: 3 8
6: 1 4
7: 2 5
For each row I want to take all the entries with the same V1
and add the V2
entries then divide the V2
entry by that sum and add in a third column. For example, in row 1, column 3 the answer should be 3/(3+4). In row 2, column 3 the answer should be 4/(4+2+5), etc.
Ultimately I should have
V1 V2 V3
1: 1 3 0.4285714
2: 2 4 0.3636364
3: 3 5 0.3846154
4: 2 2 0.1818182
5: 3 8 0.6153846
6: 1 4 0.5714286
7: 2 5 0.4545455
I can get V3
via q <- DT[,V2/sum(V2),by='V1']
but then the rows are in the wrong order
V1 V1
1: 1 0.4285714
2: 1 0.5714286
3: 2 0.3636364
4: 2 0.1818182
5: 2 0.4545455
6: 3 0.3846154
7: 3 0.6153846
so simply pasting the second column of q
to DT
will not work. Also, somewhat awkwardly the data.table
q
now has two columns with the same name, V1
.
I have been banging my head on this problem for a few days now, searched high and low and still cannot come up with a simple answer. Any help would be much appreciated.
Upvotes: 4
Views: 1876
Reputation: 60014
Create a new table using
DT[,list(V2=V2, V3=V2/sum(V2)), by='V1']
(with modified row order) or modify the data.table
in place using the assignment operator:
DT[, V3 := V2/sum(V2), by='V1']
Note that now the row order is the same.
Please RTFM (half the questions I ask about data.table
I could have answered myself if I spent an extra 30 minutes perusing it!)
As for the row order in the new table, I don't think you can easily preserve it. It is also unclear why you would want to preserve the order unless there is another column which dictates it already, in which case you can sort the new table accordingly.
Upvotes: 4