jslefche

Reputation: 4599

Select last non-NA value in a row, by row

I have a data frame in which each row holds a varying number of values, with NA filling the remaining positions. I would like to create a vector of the last non-NA value in each row.

Here is an example data frame:

df <- read.table(tc <- textConnection("
   var1    var2    var3    var4
     1       2       NA      NA
     4       4       NA      6
     2       NA      3       NA
     4       4       4       4
     1       NA      NA      NA"), header = TRUE); close(tc)

The vector of values I want would therefore be c(2,6,3,4,1).

I just can't figure out how to get R to identify the last value.

Any help is appreciated!

Upvotes: 11

Views: 9369

Answers (4)

Maël

Reputation: 51914

A dplyr alternative is to use coalesce and reverse the order of the selected columns:

library(dplyr)
df |> 
  mutate(var5 = coalesce(var4, var3, var2, var1))

#   var1 var2 var3 var4 var5
# 1    1    2   NA   NA    2
# 2    4    4   NA    6    6
# 3    2   NA    3   NA    3
# 4    4    4    4    4    4
# 5    1   NA   NA   NA    1

To make use of tidyselection, one can create an auxiliary function coacross to use coalesce with across, and use rev to reverse the order of the names:

coacross <- function(...) {
  coalesce(!!!across(...))
}

df |> 
  mutate(var5 = coacross(rev(everything())))
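
For readers who want to see what the reversed coalesce does without dplyr, here is a minimal base R sketch. It rebuilds the example data with data.frame() and folds the columns from right to left with Reduce. The helper name fill_from is hypothetical and not part of any package.

```r
# Example data from the question, rebuilt so the block is self-contained
df <- data.frame(
  var1 = c(1, 4, 2, 4, 1),
  var2 = c(2, 4, NA, 4, NA),
  var3 = c(NA, NA, 3, 4, NA),
  var4 = c(NA, 6, NA, 4, NA)
)

# Hypothetical helper: keep `a` where it is present, otherwise fall back to `b`
fill_from <- function(a, b) ifelse(is.na(a), b, a)

# Reverse the column order so var4 comes first; the rightmost
# non-NA value in each row then takes priority
reversed_cols <- as.list(df[rev(names(df))])
var5 <- Reduce(fill_from, reversed_cols)
var5  # values: 2 6 3 4 1
```

This works column by column rather than row by row, which is the same idea coalesce relies on.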

Upvotes: 0

IRTFM

Reputation: 263301

Here's another version that drops all non-finite values (NA, NaN, and infinities) before taking the first element of the reversed row:

apply(df, 1, function(x) rev(x[is.finite(x)])[1] )
# [1] 2 6 3 4 1
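
A small sketch of how filtering with is.finite differs from filtering with !is.na may help here. The data frame df2 is made up for illustration and is not from the question; it contains an Inf and a row that is entirely NA.

```r
# Made-up data: row 1 ends in Inf, row 2 ends in NA, row 3 is all NA
df2 <- data.frame(a = c(1, 2, NA), b = c(Inf, NA, NA))

# Keep only finite values, then take the last one
finite_last <- apply(df2, 1, function(x) rev(x[is.finite(x)])[1])

# Keep everything that is not NA/NaN (Inf survives), then take the last one
non_na_last <- apply(df2, 1, function(x) rev(x[!is.na(x)])[1])

finite_last  # values: 1 2 NA
non_na_last  # values: Inf 2 NA
```

Indexing an empty vector with [1] gives NA, so both variants return NA for the all-NA row instead of a zero-length result.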

Upvotes: 0

Frank

Reputation: 66819

Here's an answer using matrix subsetting:

df[cbind(1:nrow(df), max.col(!is.na(df), "last"))]

This max.col call returns the column position of the last non-NA value in each row. If a row is entirely NA, every position ties and ties.method = "last" picks the final column, so the result for that row is NA.
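
To make the one-liner easier to follow, here is the same approach split into steps on the question's data. The intermediate names present, last_col, and idx are only for illustration.

```r
# Example data from the question, rebuilt so the block is self-contained
df <- data.frame(
  var1 = c(1, 4, 2, 4, 1),
  var2 = c(2, 4, NA, 4, NA),
  var3 = c(NA, NA, 3, 4, NA),
  var4 = c(NA, 6, NA, 4, NA)
)

# Logical matrix: TRUE wherever a value is present
present <- !is.na(df)

# For each row, the column index of the last TRUE
last_col <- max.col(present, ties.method = "last")
last_col  # positions: 2 4 3 4 1

# Two-column (row, column) index matrix picks one cell per row
idx <- cbind(seq_len(nrow(df)), last_col)
result <- df[idx]
result  # values: 2 6 3 4 1
```

Because nothing is evaluated row by row in R code, this form tends to scale better than apply on large data frames, though that is worth measuring on your own data.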

Upvotes: 2

Andrie

Reputation: 179398

Do this by combining three things:

  • Identify NA values with is.na
  • Find the last value in a vector with tail
  • Use apply to apply this function to each row in the data.frame

The code:

lastValue <- function(x) tail(x[!is.na(x)], 1)

apply(df, 1, lastValue)
# [1] 2 6 3 4 1
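
One case the example data does not cover is a row with no values at all: tail() then returns a zero-length vector and apply can no longer simplify to a plain vector. A hedged variant that returns NA for such rows might look like this. The name last_non_na and the extra all-NA row are assumptions added for illustration.

```r
# Example data from the question plus one all-NA row (added for illustration)
df_extra <- data.frame(
  var1 = c(1, 4, 2, 4, 1, NA),
  var2 = c(2, 4, NA, 4, NA, NA),
  var3 = c(NA, NA, 3, 4, NA, NA),
  var4 = c(NA, 6, NA, 4, NA, NA)
)

# Hypothetical helper: last non-NA value, or NA when the row has none
last_non_na <- function(x) {
  keep <- x[!is.na(x)]
  if (length(keep) == 0L) NA_real_ else keep[[length(keep)]]
}

res <- unname(apply(df_extra, 1, last_non_na))
res  # values: 2 6 3 4 1 NA
```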

Upvotes: 19
