lnNoam

Reputation: 1055

Conditional replacement of one column by another (R)

The col and prob columns come in pairs that belong together, e.g., col2 "goes with" prob2, col3 with prob3, and so on.

How can I make it so that if a cell in col is <NA>, the prob that corresponds to it will also be replaced with an <NA>?

Data:

   col2          prob2    col3            prob3     col4            prob4    col5              prob5
    2  0.126269620610401  <NA>  0.979143074247986   <NA>  0.150689669651911  <NA>   0.11148908524774
    3  0.730431054253131  <NA>  0.826114872703329   <NA>  0.368350319797173  <NA>   0.299717969959602
    2  0.320544729940593    3   0.0434977798722684   4    0.859434255165979   11    0.150506388396025
    2  0.0354198240675032   3   0.240764779038727    5    0.276169682852924  <NA>   0.0449998050462455

Goal:

   col2          prob2    col3         prob3        col4            prob4      col5          prob5
    2  0.126269620610401  <NA>          <NA>        <NA>           <NA>        <NA>           <NA>
    3  0.730431054253131  <NA>          <NA>        <NA>           <NA>        <NA>           <NA>
    2  0.320544729940593    3   0.0434977798722684    4    0.859434255165979    11     0.150506388396025
    2  0.0354198240675032   3   0.240764779038727     5    0.276169682852924   <NA>           <NA>

Upvotes: 2

Views: 630

Answers (3)

IRTFM

Reputation: 263481

You can use is.na.data.frame to build a logical matrix from the "col" columns, which is.na<- then uses to "NA-out" the corresponding values in the "prob" columns:

 is.na(dat[ , grep("prob", colnames(dat)) ]) <- is.na(dat[ , grep("col", colnames(dat)) ])

 #------------------
> dat
  col2      prob2 col3      prob3 col4     prob4 col5     prob5
1    2 0.12626962 <NA>         NA <NA>        NA <NA>        NA
2    3 0.73043105 <NA>         NA <NA>        NA <NA>        NA
3    2 0.32054473    3 0.04349778    4 0.8594343   11 0.1505064
4    2 0.03541982    3 0.24076478    5 0.2761697 <NA>        NA

Note that I built dat from your console output, so every column containing "<NA>" came in as a factor column. This may point to a problem with your "col" variables, since a true numeric vector with NAs would be displayed with NA rather than <NA>.
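To illustrate the point about <NA>, here is a small sketch. The vectors f and n are made up for illustration and are not the asker's data. In a printed data frame a missing factor value shows as <NA>, while a missing numeric value shows as NA. If the "col" columns are meant to be numeric, converting through character is one common way to recover the numbers:

```r
# Hypothetical vectors: the same values stored as a factor and as a numeric
f <- factor(c("2", NA, "3"))
n <- c(2, NA, 3)

# In the printed data frame the factor column shows <NA>, the numeric one NA
print(data.frame(f = f, n = n))

# Convert the factor to numeric via its labels, not its internal integer codes
f_num <- as.numeric(as.character(f))
print(f_num)
```

Calling as.numeric(f) directly would return the internal level codes instead of the labels, which is why the sketch goes through as.character first.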

Upvotes: 2

Chase

Reputation: 69241

Building off the previous answer, here's one way to do this programmatically across all of your col-prob pairs:

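# Example data; each prob column sits immediately to the right of its col column
x <- data.frame(col2 = c(NA, NA, 1, 2), prob2 = runif(4), col3 = c(3, 4, NA, NA), prob3 = rnorm(4))

# Positions of the "col" columns
colColumns <- grep("col", names(x))

for (j in colColumns) {
  # Set the neighbouring prob value to NA wherever the col value is NA
  x[j + 1] <- ifelse(is.na(x[, j]), NA, x[, j + 1])
}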

Resulting in:

  col2     prob2 col3    prob3
1   NA        NA    3 1.359170
2   NA        NA    4 1.165798
3    1 0.2701173   NA       NA
4    2 0.6411366   NA       NA
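The loop above relies on each prob column sitting directly after its col column. If the column order cannot be guaranteed, a possible variation is to pair the columns by name instead. This is only a sketch: the data frame below is made up, and it assumes every colN has a matching probN with the same suffix.

```r
# Hypothetical data with the columns deliberately out of order
x <- data.frame(col2  = c(NA, NA, 1, 2),
                col3  = c(3, 4, NA, NA),
                prob3 = c(0.5, 0.6, 0.7, 0.8),
                prob2 = c(0.1, 0.2, 0.3, 0.4))

# Names of the "col" columns and the "prob" names derived from them
col_names  <- grep("^col", names(x), value = TRUE)
prob_names <- sub("^col", "prob", col_names)

# Stop early if a col column has no prob partner (assumed naming scheme)
stopifnot(all(prob_names %in% names(x)))

for (i in seq_along(col_names)) {
  # Wherever the col value is missing, set the paired prob value to NA
  x[[prob_names[i]]][is.na(x[[col_names[i]]])] <- NA
}

print(x)
```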

Upvotes: 1

user2034412

Reputation: 4282

This will replace prob3 values with NA if the col3 value in the same row is NA:

dat$prob3 <- ifelse(is.na(dat$col3), NA, dat$prob3)
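As a quick check of the single-pair approach, here is a minimal sketch on made-up data (this dat is hypothetical, not the asker's). It compares the ifelse() form with plain logical-index assignment, which should give the same result for a numeric prob column:

```r
# Hypothetical single col/prob pair
dat <- data.frame(col3 = c(NA, 3, NA), prob3 = c(0.9, 0.4, 0.2))

# Form used in this answer
via_ifelse <- ifelse(is.na(dat$col3), NA, dat$prob3)

# Equivalent form: assign NA at the positions where col3 is missing
via_index <- dat$prob3
via_index[is.na(dat$col3)] <- NA

# TRUE when both forms agree
print(identical(via_ifelse, via_index))
```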

Upvotes: 1
