Reputation: 443
Hi I have a data like this:
date type data
198101 1 0.1
198101 1 0.3
198101 2 0.5
198102 1 1.2
198102 1 0.9
198102 2 0.7
198102 2 0.3
I would like to create a new column to show the median each month according to criteria when type == 1.
The result I would like to be is like this
date type data P50
198101 1 0.1 0.2
198101 1 0.3 0.2
198101 2 0.5 0.2
198102 1 1.2 1.05
198102 1 0.9 1.05
198102 2 0.7 1.05
198102 2 0.3 1.05
currently I do it this way, lets call the above data.table as dt
dt.median = dt[type == 1]
dt.median = dt.median[, .(P50 = median(data)), by=.(date)]
Then merge it back into the original dt
dt = dt[dt.median, nomatch = 0, by=.(date)]
Is there a quicker way to do this using .SD or .SDcol? I want to practice using .SD but just cannot figure it out with maybe one line of code?
What I could think of is currently
dt[, P50 := * .SD[type == 1] ... * , by =.(date)]
but then I dont know what the syntax to put in to calculate median * .SD[type == 1] ... *,
Help will be much appreciated!
Upvotes: 0
Views: 653
Reputation: 323236
From base R
v=dt$data
v[dt$type!=1]=NA
ave(v,dt$date,FUN=function(x) median(x,na.rm=T))
[1] 0.20 0.20 0.20 1.05 1.05 1.05 1.05
Upvotes: 0
Reputation: 263352
Just index the data values within groups using a logical vector and assign with the data.table special assignment operator, :=
> dt[ , P50 := median(data[type==1]), by=.(date)]
> dt
date type data P50
1: 198101 1 0.1 0.20
2: 198101 1 0.3 0.20
3: 198101 2 0.5 0.20
4: 198102 1 1.2 1.05
5: 198102 1 0.9 1.05
6: 198102 2 0.7 1.05
7: 198102 2 0.3 1.05
Upvotes: 2