user113156

Reputation: 7107

Obtaining predictions specific to "NA" values from a column

My data looks like this:

     v1 v2 v3 v4     pred1    pred2
1  1908  5 10  2 10.000000  2.00000
2  1908  4 15  5 15.000000  5.00000
3  1908  8 14  4 14.000000  4.00000
4  1908  1 NA  9 11.230271  9.00000
5  1908  9 NA 14  9.942911 14.00000
6  1908  7  8 17  8.000000 17.00000
7  1908  6 NA  8  7.881931  8.00000

I want to extract the predictions that correspond to NA values in the column they were built from. For example, pred1 was obtained using the v3 column, so I would like to extract the values 11.230271, 9.942911 and 7.881931, which sit in the rows where v3 is NA.

So I would end up with something like 11.230271, 9.942911, 7.881931, 7.08919, where the last value, 7.08919, comes from pred2, which was built on the v4 data and has an NA value at row 12.

Data:

data <- structure(list(v1 = c(1908L, 1908L, 1908L, 1908L, 1908L, 1908L, 
1908L, 1908L, 1908L, 1908L, 1908L, 1908L, 1909L, 1909L, 1909L, 
1909L, 1909L, 1909L, 1909L, 1909L), v2 = c(5L, 4L, 8L, 1L, 9L, 
7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L
), v3 = c(10L, 15L, 14L, NA, NA, 8L, NA, 7L, 5L, 2L, 16L, 13L, 
10L, 11L, 12L, 1L, 3L, 4L, 6L, 9L), v4 = c(2L, 5L, 4L, 9L, 14L, 
17L, 8L, 18L, 16L, 15L, 11L, NA, 3L, 1L, 1L, 10L, 12L, 13L, 7L, 
6L), pred1 = c(10, 15, 14, 11.2302713507484, 9.94291143314257, 
8, 7.88193139599341, 7, 5, 2, 16, 13, 10, 11, 12, 1, 3, 4, 6, 
9), pred2 = c(2, 5, 4, 9, 14, 17, 8, 18, 16, 15, 11, 7.0891904140478, 
3, 1, 1, 10, 12, 13, 7, 6)), class = "data.frame", row.names = c(NA, 
-20L))

Upvotes: 1


Answers (1)

akrun

Reputation: 887221

We can subset the columns of interest by filtering on another column. This returns both pred1 and pred2 for the rows where v3 is NA:

subset(data, is.na(v3), select = c("pred1", "pred2"))

If we need only the values that correspond to the NA elements in 'v3' and 'v4' (pairing v3 with pred1 and v4 with pred2):

out <- Map(function(x, y) y[is.na(x)], data[3:4], data[5:6])
out
#$v3
#[1] 11.230271  9.942911  7.881931

#$v4
#[1] 7.08919

Then, get the values as a single vector by unlisting:

unlist(out, use.names = FALSE)

Or using map2 from purrr, which returns the same named list:

library(purrr)
map2(data[3:4], data[5:6], ~ keep(.y, is.na(.x)))
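
If the column positions might change, pairing the columns by name may be more robust than indexing with 3:4 and 5:6. Here is a minimal base R sketch of that idea; the `toy` data frame and the `pairs` lookup vector are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data: pred1 fills the gaps in v3, pred2 fills the gaps in v4.
toy <- data.frame(
  v3    = c(10, NA, 14, NA),
  v4    = c(2, 5, NA, 9),
  pred1 = c(10, 11.5, 14, 9.25),
  pred2 = c(2, 5, 4.75, 9)
)

# Assumed mapping: names are the source columns, values are the
# matching prediction columns.
pairs <- c(v3 = "pred1", v4 = "pred2")

# For each source column, keep the predictions in the rows where
# that source column is NA, then flatten into one vector.
imputed <- unlist(
  lapply(names(pairs), function(src) {
    pred <- pairs[[src]]
    toy[[pred]][is.na(toy[[src]])]
  }),
  use.names = FALSE
)

imputed
```

With the question's data, the same approach should work by swapping `toy` for `data` and keeping the `pairs` vector as it is.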

Upvotes: 1
