Reputation: 542
I have a large data frame and I am trying to assign values to a particular data column for specific subsets.
subset(P2Y12R_binding_summary,(SYSTEM=="4NTJ")&(VARIANT=="D294N"))
SYSTEM VARIANT MODEL EPSIN INP dE_water_free dE_ERR_water_free dE_water_periodic dE_ERR_water_periodic
1 4NTJ D294N LVLSET 1 1 -42.155 29.28460 -42.205 29.52604
2 4NTJ D294N LVLSET 1 2 -34.225 29.75176 -34.235 29.96571
3 4NTJ D294N LVLSET 20 1 -65.163 40.62241 -65.163 40.52564
4 4NTJ D294N LVLSET 20 2 -57.454 41.04459 -57.454 41.26962
5 4NTJ D294N SES 1 1 -23.406 30.56636 -23.335 30.75794
6 4NTJ D294N SES 1 2 -15.434 30.70035 -15.414 30.85944
7 4NTJ D294N SES 20 1 -64.351 40.65919 -64.350 40.51345
8 4NTJ D294N SES 20 2 -56.342 41.23456 -56.542 41.21865
Now suppose I add a new column (Ki_expt) to the frame using:
P2Y12R_binding_summary$Ki_expt <- 0
And I want to update values for this column for only the rows corresponding to the subset above.
Trying the naive approach fails:
>subset(P2Y12R_binding_summary,(SYSTEM=="4NTJ")&(VARIANT=="D294N"))$Ki_expt = 42.2
or
>subset(P2Y12R_binding_summary,(SYSTEM=="4NTJ")&(VARIANT=="D294N"))$Ki_expt <- 42.2
Both yield the error message:
Error in subset(P2Y12R_binding_summary, (SYSTEM == "4NTJ") & (VARIANT == :
could not find function "subset<-"
Does anyone know the appropriate way to do this? Obviously it would be possible with a for loop, but that seems rather clunky and, judging by previous experience, would probably be quite slow.
Upvotes: 2
Views: 4219
Reputation: 320
If speed is a concern, I would look to data.table (I normally look there anyway):
library(data.table)
setDT(P2Y12R_binding_summary)[SYSTEM=="4NTJ" & VARIANT=="D294N", Ki_expt := 42.2 ]
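An example using the diamonds dataset from ggplot2:
library(data.table)
library(ggplot2)  # provides the diamonds dataset
# copy() so that setDT() works on a separate object rather than on ggplot2's own data
dummydf <- copy(diamonds)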
setDT(dummydf)[cut =="Premium" & color =="J", example := 42.2 ]
dummydf[!is.na(example)]
carat cut color clarity depth table price x y z example
1: 0.30 Premium J SI2 59.3 61 405 4.43 4.38 2.61 42.2
2: 1.00 Premium J SI2 62.3 58 2801 6.45 6.34 3.98 42.2
3: 0.93 Premium J SI2 61.9 57 2807 6.21 6.19 3.84 42.2
4: 1.17 Premium J I1 60.2 61 2825 6.90 6.83 4.13 42.2
5: 0.33 Premium J VS1 62.8 58 557 4.41 4.38 2.76 42.2
---
804: 1.01 Premium J I1 60.7 59 2602 6.42 6.39 3.89 42.2
805: 1.01 Premium J SI2 58.3 62 2683 6.49 6.43 3.77 42.2
806: 1.01 Premium J SI2 59.3 56 2683 6.51 6.45 3.84 42.2
807: 0.90 Premium J SI2 62.7 57 2717 6.09 6.06 3.80 42.2
808: 0.90 Premium J SI2 63.0 59 2717 6.14 6.11 3.86 42.2
Note that you only need to call setDT() once. After that, just index the data.table directly: dummydf[subset condition, column name := new value]
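To make the "setDT() once" point concrete, here is a minimal sketch on a toy stand-in for P2Y12R_binding_summary. The "4PXZ" rows and the value 7.5 are invented purely for illustration; only the column names and the 42.2 update come from the question.

```r
library(data.table)

# Toy stand-in for P2Y12R_binding_summary (hypothetical rows, made-up values)
toy <- data.frame(
  SYSTEM  = c("4NTJ", "4NTJ", "4PXZ", "4PXZ"),
  VARIANT = c("D294N", "WT", "D294N", "WT"),
  Ki_expt = 0,
  stringsAsFactors = FALSE
)

# Convert to a data.table once, by reference
setDT(toy)

# Later updates index the data.table directly; no further setDT() call is needed
toy[SYSTEM == "4NTJ" & VARIANT == "D294N", Ki_expt := 42.2]
toy[SYSTEM == "4PXZ" & VARIANT == "WT",    Ki_expt := 7.5]

print(toy)
```

Each := call updates only the rows matched by the condition and leaves the other rows at 0. If you would rather stay in base R, the same update can be written with a logical index on the column: df$Ki_expt[df$SYSTEM == "4NTJ" & df$VARIANT == "D294N"] <- 42.2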
Upvotes: 2