Reputation: 2443
To continue on a previous topic: Finding non-missing values between missing values
I would also like to find whether the value before the missing value is smaller than, equal to, or larger than the one after the missing value.
To use the same example from before:
df = structure(list(FirstYStage = c(NA, 3.2, 3.1, NA, NA, 2, 1, 3.2,
3.1, 1, 2, 5, 2, NA, NA, NA, NA, 2, 3.1, 1), SecondYStage = c(NA,
3.1, 3.1, NA, NA, 2, 1, 4, 3.1, 1, NA, 5, 3.1, 3.2, 2, 3.1, NA,
2, 3.1, 1), ThirdYStage = c(NA, NA, 3.1, NA, NA, 3.2, 1, 4, NA,
1, NA, NA, 3.2, NA, 2, 3.2, NA, NA, 2, 1), FourthYStage = c(NA,
NA, 3.1, NA, NA, NA, 1, 4, NA, 1, NA, NA, NA, 4, 2, NA, NA, NA,
2, 1), FifthYStage = c(NA, NA, 2, NA, NA, NA, 1, 5, NA, NA, NA,
NA, 3.2, NA, 2, 3.2, NA, NA, 2, 1)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -20L))
Rows 13, 14 and 16 have non-missing values in between missing values. The output this time should be "same", "larger" and "same" for rows 13, 14 and 16, and "N/A" for the other rows.
Upvotes: 1
Views: 66
Reputation: 51592
A straightforward approach would be to split each pasted row, convert the pieces to numeric, take the last 2 values and compare them with an ifelse statement, i.e.
sapply(strsplit(do.call(paste, df)[c(13, 14, 16)], 'NA| '), function(i) {
  v1 <- as.numeric(tail(i[i != ''], 2))
  ifelse(v1[1] > v1[2], 'greater',
         ifelse(v1[1] == v1[2], 'same', 'smaller'))
})
#[1] "same" "smaller" "same"
NOTE
I took the previous answer as a given, i.e. the rows of interest are selected with do.call(paste, df)[c(13, 14, 16)].
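To make the split step more concrete, here is a small sketch of what it produces for a single pasted row. The helper name last_two is hypothetical, and the second input is a made-up row that is not in df; it only illustrates a case where the last two values are not the ones around the NA.

```r
# Hypothetical helper wrapping the split/tail step (the name is an assumption).
last_two <- function(s) {
  parts <- strsplit(s, 'NA| ')[[1]]          # split on "NA" or a space
  as.numeric(tail(parts[parts != ''], 2))    # drop empty pieces, keep the last two
}

# Row 14 of df pastes to "NA 3.2 NA 4 NA": the last two values straddle the NA.
last_two("NA 3.2 NA 4 NA")

# Made-up row: the values around the NA are 2 and 3,
# but the last two values are 3 and 4.
last_two("2 NA 3 4")
```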
A more generic approach (as noted by Ronak, taking the last 2 values will fail in some cases) would be,
sapply(strsplit(gsub("([[:digit:]])+\\s+[NA]+\\s+([[:digit:]])", '\\1_\\2',
                     do.call(paste, df)[c(13, 14, 16)]), ' '), function(i) {
  v1 <- i[grepl('_', i)]                      # the "before_after" token
  v2 <- as.numeric(strsplit(v1, '_')[[1]])    # compare as numbers, not as strings
  ifelse(v2[1] > v2[2], 'greater',
         ifelse(v2[1] == v2[2], 'same', 'smaller'))
})
#[1] "same" "smaller" "same"
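As an alternative to string manipulation, the comparison could also be sketched directly on the numeric values, which also gives "N/A" for rows without such a gap. This is only a sketch under some assumptions: compare_gap is a hypothetical helper, df_small is a copy of rows 12 to 16 of df so that the block runs on its own, and only the first gap in a row is considered. The labels follow the code above ("greater" when the value before the gap is the larger one).

```r
# Rows 12 to 16 of df, copied here so the example is self-contained.
df_small <- data.frame(
  FirstYStage  = c(5, 2, NA, NA, NA),
  SecondYStage = c(5, 3.1, 3.2, 2, 3.1),
  ThirdYStage  = c(NA, 3.2, NA, 2, 3.2),
  FourthYStage = c(NA, NA, 4, 2, NA),
  FifthYStage  = c(NA, 3.2, NA, 2, 3.2)
)

# Hypothetical helper: compare the values on either side of the first NA gap.
compare_gap <- function(x) {
  ok <- which(!is.na(x))                # positions of non-missing values
  if (length(ok) < 2) return("N/A")
  jump <- which(diff(ok) > 1)           # non-missing neighbours separated by NAs
  if (length(jump) == 0) return("N/A")  # no NA between two non-missing values
  before <- x[ok[jump[1]]]              # value just before the first gap
  after  <- x[ok[jump[1] + 1]]          # value just after the first gap
  if (before > after) "greater" else if (before == after) "same" else "smaller"
}

res <- unname(apply(as.matrix(df_small), 1, compare_gap))
res
#[1] "N/A"     "same"    "smaller" "N/A"     "same"
```

Applied to the full df, apply(as.matrix(df), 1, compare_gap) would return one label per row.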
Upvotes: 2