Reputation: 1017
I have a data frame of the following type:
I need to create a separate column that contains the last non-empty value in each row, starting from column V9, i.e. 15:32, 13:44, 16:37, 15:31, NULL, NULL, 16:10, 16:22, etc. If it is easier, I can live with removing the empty rows (in this case rows 5 and 6).
I tried a combination of which.max, length and apply, but the output did not make sense, so I have no idea what to do next. Thanks for any help.
Upvotes: 0
Reputation: 887038
We could use max.col. Subset the columns 'V9' to 'V11', then apply max.col to the logical matrix dfN != '' to get, for each row, the column index of the last non-blank element. In case of ties, the optional ties.method argument of max.col selects 'random', 'first' or 'last'; the default is 'random'. Here I am using 'last'. For a row that is entirely blank every column ties, so 'last' picks 'V11' and the extracted value is the empty string. Then we cbind the row sequence with these column indices to build a row/column index and extract the values from 'dfN'.
dfN <- df1[paste0('V', 9:11)]
new <- dfN[cbind(1:nrow(dfN),max.col(dfN!='', 'last'))]
new
#[1] "15:32" "13:44" "16:37" "15:31" "" "" "16:10" "16:22" "16:21"
#[10] "15:34" "16:26"
cbind(dfN, new)
# V9 V10 V11 new
#1 15:32 15:32
#2 13:44 13:44
#3 16:37 16:37
#4 15:31 15:31
#5
#6
#7 12:07 12:32 16:10 16:10
#8 12:09 12:36 16:22 16:22
#9 12:06 12:35 16:21 16:21
#10 12:08 12:26 15:34 15:34
#11 12:35 13:00 16:26 16:26
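To see why ties.method = 'last' does the work here, the following is a minimal sketch on made-up data (the data frame `toy` and the names `filled`, `idx` and `last_val` are illustrative assumptions, not part of the original question). It also shows one way to turn the blank results into NA, which is closer to the NULL the question asks for.

```r
# Hypothetical data: blank strings mark missing times.
toy <- data.frame(
  V9  = c("15:32", "", "12:07"),
  V10 = c("",      "", "12:32"),
  V11 = c("",      "", "16:10"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a value is present.
filled <- toy != ""

# TRUE counts as 1 and FALSE as 0, so the row maximum sits on the filled
# cells; ties.method = "last" then picks the right-most of them.
idx <- max.col(filled, ties.method = "last")

# Pair each row number with its chosen column and extract in one step.
last_val <- toy[cbind(seq_len(nrow(toy)), idx)]

# Rows with no filled cell tie across all columns and return "";
# replace those with NA.
last_val[rowSums(filled) == 0] <- NA
last_val
```

The all-blank row is worth checking explicitly: max.col still returns a column index for it, so the blank has to be detected separately, here with rowSums.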
Or we can use apply to take the last non-blank element of each row, returning '' when the row is entirely blank:
apply(dfN, 1, function(x) if(any(x!='')) tail(x[x!=''],1) else '')
#[1] "15:32" "13:44" "16:37" "15:31" "" "" "16:10" "16:22" "16:21"
#[10] "15:34" "16:26"
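The question also allows dropping the all-blank rows. A possible sketch of that variant, again on made-up data (`toy`, `keep` and `kept` are hypothetical names), filters the rows first so the apply call no longer needs the empty-row branch:

```r
# Hypothetical data: the second row is entirely blank.
toy <- data.frame(
  V9  = c("15:32", "", "12:07"),
  V10 = c("",      "", "12:32"),
  V11 = c("",      "", "16:10"),
  stringsAsFactors = FALSE
)

# Keep only rows that have at least one non-blank cell.
keep <- rowSums(toy != "") > 0
kept <- toy[keep, , drop = FALSE]

# Every remaining row has a non-blank value, so tail() always finds one.
# unname() drops the row names that apply() attaches to its result.
kept$new <- unname(apply(kept, 1, function(x) tail(x[x != ""], 1)))
kept
```

The original row names are preserved in `kept`, which makes it easy to see which rows were removed.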
Upvotes: 2
Reputation: 5335
This is not elegant, but it should work:
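# Pre-allocate one result slot per row
output <- rep(NA, nrow(df))
# Transpose the row, drop NAs, reverse, and keep the first (i.e. last non-missing) value
for (i in seq_len(nrow(df))) output[i] <- rev(na.omit(t(df[i, ])))[1]
unlist(output)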
For each row, you transpose it into a column, omit missing values, reverse it, and then return the first value.
I used this for test data:
a <- seq(7)
b <- c(1, NA, 1, NA, 2, NA, 2)
c <- c(2, 3, NA, NA, 4, NA, NA)
df <- data.frame(rbind(a, b, c))
And here is the output of that process:
> unlist(output)
[1] 7 2 4
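Note that this approach relies on missing values being NA, whereas the data in the question appears to use blank strings. A minimal sketch of one way to bridge that gap, assuming character columns and using made-up data (`toy` and `last_value` are hypothetical names), is to recode the blanks to NA first:

```r
# Hypothetical data with blank strings instead of NA.
toy <- data.frame(
  V9  = c("15:32", "", "12:07"),
  V10 = c("",      "", "12:32"),
  V11 = c("",      "", "16:10"),
  stringsAsFactors = FALSE
)

# Recode blank strings as NA so that na.omit() can drop them.
toy[toy == ""] <- NA

# For each row, drop the NAs and take the last remaining value;
# rows with nothing left yield NA.
last_value <- vapply(seq_len(nrow(toy)), function(i) {
  vals <- na.omit(unlist(toy[i, ]))
  if (length(vals) > 0) vals[[length(vals)]] else NA_character_
}, character(1))
last_value
```

vapply is used here instead of a for loop only so that the result type is checked; the per-row logic is the same idea as above.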
Upvotes: 0