Reputation: 1292
As the title suggests, this is hard to describe, so I'll just show the code, what I got, and what I want instead.
set.seed(1)
df<-data.frame('X1'=rnorm(10),
'X2'=rnorm(10),
'X3'=c(c(rep('A',5)),c(rep('B',5))))
## create a new column 'SPX2' which is the smallest positive value of X2
## within each group (A and B)
library(data.table)
setDT(df)[X2>0,SPX2:=min(X2),by=X3]
df
The result I got is:
X1 X2 X3 SPX2
1: -0.6264538 1.51178117 A 0.3898432
2: 0.1836433 0.38984324 A 0.3898432
3: -0.8356286 -0.62124058 A NA
4: 1.5952808 -2.21469989 A NA
5: 0.3295078 1.12493092 A 0.3898432
6: -0.8204684 -0.04493361 B NA
7: 0.4874291 -0.01619026 B NA
8: 0.7383247 0.94383621 B 0.5939013
9: 0.5757814 0.82122120 B 0.5939013
10: -0.3053884 0.59390132 B 0.5939013
What I want is:
X1 X2 X3 SPX2
1: -0.6264538 1.51178117 A 0.3898432
2: 0.1836433 0.38984324 A 0.3898432
3: -0.8356286 -0.62124058 A 0.3898432
4: 1.5952808 -2.21469989 A 0.3898432
5: 0.3295078 1.12493092 A 0.3898432
6: -0.8204684 -0.04493361 B 0.5939013
7: 0.4874291 -0.01619026 B 0.5939013
8: 0.7383247 0.94383621 B 0.5939013
9: 0.5757814 0.82122120 B 0.5939013
10: -0.3053884 0.59390132 B 0.5939013
because I want to create a new column df$X4 <- df$SPX2 - df$X2, or do any other operation that requires SPX2 to be filled in for every row as above.
I searched and found several posts like the one here, but they don't cover what I'm trying to do.
Does anyone know how to achieve this?
Upvotes: 4
Views: 1886
Reputation: 1495
Using the data.table package:
setDT(df)
df[,SPX2:=min(X2[X2 > 0]),by=X3]
For each value of X3, this subsets X2 to its positive values (i.e. X2[X2 > 0]) and then takes the minimum of those values. Because the filter is applied inside j rather than in i, every row of the group is assigned the result, not just the rows where X2 is positive. Note that if a group has no positive values (i.e. X2[X2 > 0] is empty), min() returns Inf with a warning. Keep this in mind, especially if you want to do further calculations with SPX2.
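If the Inf case matters, one option is to wrap the logic in a small helper that returns NA when a group has no positive values. This is only a sketch: min_pos is a hypothetical name, not part of data.table, and the toy data below is made up so that group B has no positive values.

```r
library(data.table)

# Hypothetical helper (not part of data.table): smallest strictly positive
# value of x, or NA if x contains no positive values.
min_pos <- function(x) {
  pos <- x[!is.na(x) & x > 0]
  if (length(pos) == 0L) NA_real_ else min(pos)
}

# Toy data (assumed for illustration): group B has no positive X2.
dt <- data.table(
  X2 = c(1.5, 0.4, -0.6, -0.1, -0.2),
  X3 = c("A", "A", "A", "B", "B")
)

# Same pattern as above, but group B gets NA instead of Inf.
dt[, SPX2 := min_pos(X2), by = X3]
dt
```

With NA instead of Inf, later arithmetic such as SPX2 - X2 propagates the missing value rather than silently producing infinite results.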
As for why X2[X2 > 0] works, think about it as follows: for each value of X3, j sees a vector of the corresponding values of X2. You can perform regular vector operations on this vector, one of which is subsetting via X2 > 0. It works much like the following:
x2 = c(-1, 1, 2, 3, -2, 4)
x2[x2 > 0]
# [1] 1 2 3 4
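To make the per-group view concrete, here is a rough base R sketch of the same idea, without data.table. The toy vectors are made up for illustration; split() and ave() are standard base R functions.

```r
# Toy data (assumed for illustration).
X2 <- c(1.5, 0.4, -0.6, -0.1, 0.9, 0.6)
X3 <- c("A", "A", "A", "B", "B", "B")

# split() yields one X2 vector per group, similar to what j sees under `by`.
groups <- split(X2, X3)

# Apply the same subset-then-min step to each group's vector.
per_group <- sapply(groups, function(v) min(v[v > 0]))

# ave() writes each group's result back to every row of that group.
SPX2 <- ave(X2, X3, FUN = function(v) min(v[v > 0]))
SPX2
```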
Hope this helps!
Upvotes: 1
Reputation: 2496
A tidyverse alternative:
library(dplyr)

df %>%
  group_by(X3) %>%
  mutate(SPX2 = min(X2[X2 > 0]))
which gives:
X1 X2 X3 SPX2
<dbl> <dbl> <fctr> <dbl>
1 -0.6264538 1.51178117 A 0.3898432
2 0.1836433 0.38984324 A 0.3898432
3 -0.8356286 -0.62124058 A 0.3898432
4 1.5952808 -2.21469989 A 0.3898432
5 0.3295078 1.12493092 A 0.3898432
6 -0.8204684 -0.04493361 B 0.5939013
7 0.4874291 -0.01619026 B 0.5939013
8 0.7383247 0.94383621 B 0.5939013
9 0.5757814 0.82122120 B 0.5939013
10 -0.3053884 0.59390132 B 0.5939013
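Since the question's goal was a follow-up column X4 = SPX2 - X2, a possible extension is to compute it in the same mutate() call. This is a sketch using the question's data; the name res and the ungroup() step are my additions, not part of the original answer.

```r
library(dplyr)

# Rebuild the question's data.
set.seed(1)
df <- data.frame(X1 = rnorm(10),
                 X2 = rnorm(10),
                 X3 = rep(c("A", "B"), each = 5))

res <- df %>%
  group_by(X3) %>%
  mutate(SPX2 = min(X2[X2 > 0]),  # smallest positive X2 within the group
         X4 = SPX2 - X2) %>%      # mutate() can use SPX2 created just above
  ungroup()                       # drop grouping before further work

res
```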
Upvotes: 2