Reputation: 355
I have the following data
df <- data.frame(val1=c(1.2,0.5,3.8,2.5,7.4),
val2=c(1.2,2.5,3.8,2.5,2.4),
val3=c(1.2,2.5,3.6,2.5,7.4),
val4=c(1.2,2.5,3.8,2.5,4.4),
val5=c(1.2,2.5,3.8,2.9,7.4))
I'd like to find which field is different to the rest. Expecting a result like below to be added to the data frame
cbind(df,results = c("all_equal", "val1","val3","val5","morethan1"))
Is there any way to do this in an easy way? I have an extensive loop to get this result which I won't post here. I'm looking for a quick solution that I've missed to see (maybe using dplyr)
Upvotes: 1
Views: 66
Reputation: 887098
Here is one option with tidyverse
by reshaping the data into 'long' format, apply the conditions to create the 'result' column and then bind the column with the original dataset
library(tidyverse)
rownames_to_column(df, 'rn') %>%
gather(key, val, matches('^val')) %>%
group_by(rn) %>%
mutate(Mode = Modes(val)) %>%
summarise(result = case_when(all(val == Mode) ~ "all_equal",
sum(val != Mode) > 1 ~ "morethan1",
TRUE ~ paste0("val", which(val != Mode), collapse=","))) %>%
select(result) %>%
bind_cols(df, .)
# val1 val2 val3 val4 val5 result
#1 1.2 1.2 1.2 1.2 1.2 all_equal
#2 0.5 2.5 2.5 2.5 2.5 val1
#3 3.8 3.8 3.6 3.8 3.8 val3
#4 2.5 2.5 2.5 2.5 2.9 val5
#5 7.4 2.4 7.4 4.4 7.4 morethan1
The Modes
function
Modes <- function(x) {
ux <- unique(x)
tab <- tabulate(match(x, ux))
ux[tab == max(tab)]
}
Upvotes: 1
Reputation: 39154
First, define a function to calculate the mode. I used the function found here: https://stackoverflow.com/a/8189441/7669809
Modes <- function(x) {
ux <- unique(x)
tab <- tabulate(match(x, ux))
ux[tab == max(tab)]
}
After that, we can use the following code to get the desired output.
apply(df, 1, function(x){
x_mode <- Modes(x)
if (all(x == x_mode)){
return("all_equal")
} else if (sum(x != x_mode) > 1){
return("morethan1")
} else {
ind <- which(x != x_mode)
return(paste0("val", ind))
}
})
# [1] "all_equal" "val1" "val3" "val5" "morethan1"
Upvotes: 1