Reputation: 361
I am trying to find the two first names ("First Name" column) with the biggest difference in values ("Action" column). Does anyone know how to do that? Thanks in advance!
data = structure(list(`First Name` = c("Till", "Roland", "Otmar", "Christoph",
"Bianca"), Action = c(2, 1, 2, 1, 5), Reflection = c(6, 7, 6,
7, 3), Flexibility_Thinking = c(2, 3, 3, 1, 6), Structure = c(6,
4, 4, 7, 2)), row.names = c(NA, -5L), class = c("tbl_df", "tbl",
"data.frame"))
Upvotes: 1
Views: 252
Reputation: 39667
You can use dist, where the method argument selects how the distance is calculated (euclidean, maximum, manhattan, canberra, binary or minkowski).
x <- as.matrix(dist(data$Action)) * lower.tri(diag(data$Action))
matrix(data$"First Name"[which(x == max(x), TRUE)], ncol=2)
# [,1] [,2]
#[1,] "Bianca" "Roland"
#[2,] "Bianca" "Christoph"
Or, to compare across multiple columns at the same time:
x <- as.matrix(dist(data[-1])) * lower.tri(diag(data$Action))
matrix(data$"First Name"[which(x == max(x), TRUE)], ncol=2)
#[1,] "Bianca" "Christoph"
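As a rough sketch of the method argument mentioned above, the same lookup can be run with Manhattan distance instead of the default Euclidean one. The data from the question is rebuilt here as a plain data frame so the block runs on its own; d, idx and pair are illustrative names and not part of the original answer.

```r
# Same values as in the question, rebuilt as a plain data.frame
data <- data.frame(
  `First Name` = c("Till", "Roland", "Otmar", "Christoph", "Bianca"),
  Action = c(2, 1, 2, 1, 5),
  Reflection = c(6, 7, 6, 7, 3),
  Flexibility_Thinking = c(2, 3, 3, 1, 6),
  Structure = c(6, 4, 4, 7, 2),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Manhattan distance over all numeric columns
d <- as.matrix(dist(data[-1], method = "manhattan"))
# Keep each pair once by zeroing the upper triangle and the diagonal
d[upper.tri(d, diag = TRUE)] <- 0
# Row/column positions of the largest remaining distance
idx <- which(d == max(d), arr.ind = TRUE)
# Translate the positions into names, one pair per row
pair <- matrix(data$`First Name`[idx], ncol = 2)
pair
```

With this data the Manhattan distance picks the same pair as the Euclidean one, but on other data the choice of method can change which pair comes out on top.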
Upvotes: 3
Reputation: 389012
Here is one base R approach:
#Get pairwise differences for all names
mat <- abs(outer(data$Action, data$Action, `-`))
#get the max difference
max_values <- apply(mat, 1, max)
#get the index where the max difference is present
max_index <- apply(mat, 1, which.max)
#Create a dataframe with first name, name of biggest difference person
#and the difference value
result <- cbind(data[1],
                biggest_diff = data$`First Name`[max_index],
                diff = max_values)
result
# First Name biggest_diff diff
#1 Till Bianca 3
#2 Roland Bianca 4
#3 Otmar Bianca 3
#4 Christoph Bianca 4
#5 Bianca Roland 4
#get top 2 results
head(result[order(-result$diff), ], 2)
# First Name biggest_diff diff
#2 Roland Bianca 4
#4 Christoph Bianca 4
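The result table above has one row per person, so the same pair can show up from both sides. As an alternative sketch (not the answerer's method), combn can list each unordered pair exactly once; first_name, action, pairs, pair_diff, top and top_pairs are illustrative names, and the values are copied from the question.

```r
# Values from the question
first_name <- c("Till", "Roland", "Otmar", "Christoph", "Bianca")
action <- c(2, 1, 2, 1, 5)

# All unordered index pairs, one pair per column
pairs <- combn(seq_along(action), 2)
# Absolute difference in Action for each pair
pair_diff <- abs(action[pairs[1, ]] - action[pairs[2, ]])
# Keep only the pairs that reach the maximum difference
top <- pairs[, pair_diff == max(pair_diff), drop = FALSE]
top_pairs <- data.frame(
  name_1 = first_name[top[1, ]],
  name_2 = first_name[top[2, ]],
  diff = max(pair_diff),
  stringsAsFactors = FALSE
)
top_pairs
```

This keeps ties, so every pair that shares the largest difference is returned rather than only the first one found.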
Upvotes: 1