Vasile

Reputation: 1017

How to select row's last values in a data frame and arrange them in a separate column?

I have a data frame of the following type:

[Image: screenshot of the data frame]

I need to create a separate column containing the last non-empty value from each row, considering only the columns from V9 onward, i.e. 15:32, 13:44, 16:37, 15:31, NULL, NULL, 16:10, 16:22, etc. If it is easier, I can live with removing the empty rows (here, rows 5 and 6). I tried a combination of which.max, length and apply, but the output did not make sense, and I am not sure what to try next. Thanks for any help.

Upvotes: 0

Views: 72

Answers (2)

akrun

Reputation: 887038

We could use max.col. First, subset the columns 'V9' to 'V11'. Comparing the subset with '' gives a logical matrix, and max.col returns, for each row, the column index of a TRUE (non-blank) element. When a row has ties, the optional ties.method argument of max.col chooses among them: 'first', 'last' or 'random' (the default). Here I use 'last', so the last non-blank column is picked; in a row that is entirely blank every column ties, so the last column is returned and the extracted value is ''. Then we cbind the row sequence with these column indices to create a row/column index matrix and extract the values from 'dfN'.

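# keep only the columns V9 to V11
dfN <- df1[paste0('V', 9:11)]
# pair each row number with the index of its last non-blank column
new <- dfN[cbind(seq_len(nrow(dfN)), max.col(dfN != '', 'last'))]
new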
#[1] "15:32" "13:44" "16:37" "15:31" ""      ""      "16:10" "16:22" "16:21"
#[10] "15:34" "16:26"

cbind(dfN, new)
#     V9   V10   V11   new
#1  15:32             15:32
#2        13:44       13:44
#3  16:37             16:37
#4  15:31             15:31
#5                         
#6                         
#7  12:07 12:32 16:10 16:10
#8  12:09 12:36 16:22 16:22
#9  12:06 12:35 16:21 16:21
#10 12:08 12:26 15:34 15:34
#11 12:35 13:00 16:26 16:26
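
Since the question's data is only available as a screenshot, here is a minimal self-contained sketch of the same steps. The data frame `toy` and its values are made up for illustration and are not the asker's data; the point is only to show what each intermediate object looks like.

```r
# Hypothetical toy data; blanks are empty strings, as in the question
toy <- data.frame(
  V9  = c("15:32", "",      "", "12:07"),
  V10 = c("",      "13:44", "", "12:32"),
  V11 = c("",      "",      "", "16:10"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a cell holds a value
filled <- toy != ""

# Column index of the last TRUE in each row; in an all-FALSE row every
# column ties, so 'last' returns the final column (which is blank)
idx <- max.col(filled, ties.method = "last")
idx
# [1] 1 2 3 3

# Matrix indexing with (row, column) pairs extracts one value per row
last_val <- toy[cbind(seq_len(nrow(toy)), idx)]
last_val
# [1] "15:32" "13:44" ""      "16:10"
```

Note that the third row comes back as '' only because the last column happens to be blank there, which is always the case for a fully blank row.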

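Alternatively, we can use apply to take the last non-blank element of each row, returning '' when the whole row is blank: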

apply(dfN, 1, function(x) if(any(x!='')) tail(x[x!=''],1) else '')
#[1] "15:32" "13:44" "16:37" "15:31" ""      ""      "16:10" "16:22" "16:21"
#[10] "15:34" "16:26"
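
The question also mentions that dropping the empty rows would be acceptable. A possible sketch of that variant follows, assuming blanks are stored as empty strings. The small `dfN` below is a made-up stand-in for the asker's columns, and `last_filled` is a hypothetical helper name, not part of base R.

```r
# Hypothetical stand-in for the asker's columns V9 to V11
dfN <- data.frame(
  V9  = c("15:32", "", "12:07"),
  V10 = c("",      "", "12:32"),
  V11 = c("",      "", "16:10"),
  stringsAsFactors = FALSE
)

# Last non-blank entry of a character vector, or NA if there is none
last_filled <- function(x) {
  x <- x[x != ""]
  if (length(x) > 0) x[length(x)] else NA_character_
}

# One value per row, stored as a new column
dfN$new <- unname(apply(dfN, 1, last_filled))

# Keep only the rows that had at least one value
kept <- dfN[!is.na(dfN$new), ]
kept
```

Returning NA instead of '' for blank rows makes the filtering step a plain is.na() check.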

Upvotes: 2

ulfelder

Reputation: 5335

This is not elegant, but it should work:

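# one slot per row, filled in by the loop
output <- rep(NA, nrow(df))
# last non-missing value of each row (NA if the whole row is missing)
for (i in seq_len(nrow(df))) output[i] <- rev(na.omit(t(df[i, ])))[1]
unlist(output)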

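For each row, you transpose it into a column, omit missing values, reverse it, and then return the first value. This relies on missing entries being NA; blank strings such as '' are not dropped by na.omit and would have to be recoded to NA first.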

I used this for test data:

a <- seq(7)
b <- c(1, NA, 1, NA, 2, NA, 2)
c <- c(2, 3, NA, NA, 4, NA, NA)
df <- data.frame(rbind(a, b, c))

And here is the output of that process:

> unlist(output)
[1] 7 2 4
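
The test data above is numeric with NA, whereas the question's columns appear to hold time strings with blanks. A hedged sketch of how the same idea might be adapted is shown here, assuming the blanks are empty strings; `df_chr` and its values are made up for illustration.

```r
# Hypothetical character data with blanks, similar in shape to the question
df_chr <- data.frame(
  V9  = c("15:32", "", "12:07"),
  V10 = c("",      "", "12:32"),
  V11 = c("",      "", "16:10"),
  stringsAsFactors = FALSE
)

# Recode empty strings as NA so that na.omit() can drop them
df_chr[df_chr == ""] <- NA

# Same idea as the loop: drop NAs, reverse, take the first element
last_by_row <- unname(vapply(
  seq_len(nrow(df_chr)),
  function(i) rev(na.omit(unlist(df_chr[i, ])))[1],
  character(1)
))
last_by_row
# [1] "15:32" NA      "16:10"
```

Rows that are entirely blank come back as NA, so they can be filtered out afterwards with is.na() if desired.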

Upvotes: 0
