Reputation: 1332
For each row, I would like to find the value in the other columns (x1, x2) that is closest to the value in column x3 below.
data=data.frame(x1=c(24,12,76),x2=c(15,30,20),x3=c(45,27,15))
data
  x1 x2 x3
1 24 15 45
2 12 30 27
3 76 20 15
So the desired output is:
Closest_Value_to_x3
24
30
20
Please help. Thank you
Upvotes: 14
Views: 3987
Reputation: 26343
Use max.col(-abs(data[, 3] - data[, -3])) to find, for each row, the column position of the closest value. Then combine these positions with the row numbers via cbind into a two-column index matrix, and use that matrix to extract the desired values from your data:
col <- 3
data[, -col][cbind(1:nrow(data),
max.col(-abs(data[, col] - data[, -col])))]
#[1] 24 30 20
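To make the indexing step easier to follow, here is a minimal sketch that spells out the intermediate objects. The names others, dist, idx and pick are illustrative and not part of the original answer. It also sets ties.method = "first", an assumption on my part: max.col breaks ties at random by default, which may be undesirable if two columns are equally close.

```r
# Sample data from the question
data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

col <- 3
others <- as.matrix(data[, -col])   # candidate columns x1 and x2
dist <- abs(others - data[, col])   # row-wise distance to x3

# Column position of the smallest distance in each row;
# ties.method = "first" avoids max.col's default random tie-breaking
idx <- max.col(-dist, ties.method = "first")

# Two-column (row, column) index matrix: one picked cell per row
pick <- cbind(seq_len(nrow(data)), idx)
closest <- others[pick]
closest
```

Indexing a matrix with a two-column matrix picks exactly one cell per (row, column) pair, which is why the result has one value per row.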
Upvotes: 12
Reputation: 39858
A tidyverse solution:
library(tidyverse)  # provides %>%, rowid_to_column(), gather() and the dplyr verbs
data %>%
rowid_to_column() %>%
gather(var, val, -c(x3, rowid)) %>%
mutate(temp = x3 - val) %>%
group_by(rowid) %>%
filter(abs(temp) == min(abs(temp))) %>%
ungroup() %>%
select(val)
val
<dbl>
1 24
2 30
3 20
First, it adds a row ID. Second, it transforms the data from wide to long. Third, it calculates the difference between "x3" and the other variables. Finally, it groups by the row ID and keeps the rows where the absolute difference is the smallest.
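The same steps can also be sketched with newer tidyverse verbs. This is a variant under stated assumptions, not the answer's own code: it swaps gather() for pivot_longer() (gather() is superseded in current tidyr) and replaces the filter() on the minimum with slice_min(), using with_ties = FALSE so that exactly one value per row is returned even when two columns are equally close. It assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Sample data from the question
data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

closest <- data %>%
  rowid_to_column() %>%                                  # keep track of the original row
  pivot_longer(-c(x3, rowid),
               names_to = "var", values_to = "val") %>%  # wide to long
  group_by(rowid) %>%
  slice_min(abs(x3 - val), n = 1, with_ties = FALSE) %>% # smallest distance per row
  ungroup() %>%
  pull(val)

closest
```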
Or:
data %>%
rowid_to_column() %>%
gather(var, val, -c(x3, rowid)) %>%
mutate(temp = x3 - val) %>%
group_by(rowid) %>%
filter(abs(temp) == min(abs(temp))) %>%
ungroup() %>%
pull(val)
[1] 24 30 20
Or using an approach originally proposed by @markus (it assumes that your columns are named "x" followed by their position, e.g. x1, x2):
data %>%
mutate(temp = paste0("x", max.col(-abs(.[, -3] - .[, 3])))) %>%
rowwise() %>%
summarise(val = eval(as.symbol(temp)))
val
<dbl>
1 24.
2 30.
3 20.
First, it finds the column index of the variable with the smallest absolute difference from "x3" and pastes it onto "x" to build a column name. Then, it evaluates that name as a variable and returns the corresponding value.
Also borrowing the idea from @markus (not assuming that your columns are named "x"):
data %>%
mutate(temp = max.col(-abs(.[, -3] - .[, 3]))) %>%
rowwise() %>%
mutate(temp = names(.)[[temp]]) %>%
summarise(val = eval(as.symbol(temp)))
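First, it finds the column index of the variable with the smallest absolute difference from "x3". Second, it looks up the column name at that index (this works here because "x3" comes after the columns it is compared with, so the index among the remaining columns matches the index among all columns). Finally, it evaluates that name as a variable and returns the corresponding value.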
Or a variant where you can reference the "x3" variable by its name and not by column index (the basic idea still from @markus):
data %>%
mutate(temp = max.col(-abs(.[, !grepl("x3", colnames(.))] - .[, grepl("x3", colnames(.))]))) %>%
rowwise() %>%
mutate(temp = names(.)[[temp]]) %>%
summarise(val = eval(as.symbol(temp)))
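If the target column might not be the last one, a position-independent variant can be sketched in base R. The helper closest_by_name and its argument names are hypothetical, introduced here for illustration only. It looks up the chosen index in the names of the candidate columns rather than in the names of the full data frame, and it uses ties.method = "first" to keep the result deterministic.

```r
# Hypothetical helper: closest value to `target`, with the target chosen by name
closest_by_name <- function(df, target) {
  candidates <- setdiff(names(df), target)   # every column except the target
  m <- as.matrix(df[candidates])
  # Index of the nearest candidate column in each row
  idx <- max.col(-abs(m - df[[target]]), ties.method = "first")
  data.frame(closest_col = candidates[idx],
             val = m[cbind(seq_len(nrow(df)), idx)])
}

# Sample data from the question
data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

closest_by_name(data, "x3")
# Same values when x3 is moved to the front
closest_by_name(data[c("x3", "x1", "x2")], "x3")
```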
Upvotes: 4
Reputation: 531
Define a function closest_to_3 that operates on a vector and returns the value in the vector that's closest to the third member:
closest_to_3 <- function(v) v[-3][which.min(abs( v[-3]-v[3] ))]
(The idiom v[-3] deletes the 3rd member from v.) Then apply this function to each row of your data frame:
apply(data, 1, closest_to_3)
#[1] 24 30 20
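The same idea can be generalized so that the reference position is not hard-coded. The function closest_to and its parameter k are hypothetical names used for this sketch. Note that which.min returns the first minimum, so if two values are equally close the leftmost one is chosen.

```r
# Hypothetical generalization: closest value to the k-th member of v
closest_to <- function(v, k) {
  rest <- v[-k]                       # all members except the k-th
  rest[which.min(abs(rest - v[k]))]   # first member with the smallest distance
}

# Sample data from the question
data <- data.frame(x1 = c(24, 12, 76), x2 = c(15, 30, 20), x3 = c(45, 27, 15))

# Extra arguments to apply() are passed on to the function
res <- apply(data, 1, closest_to, k = 3)
res
```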
Upvotes: 2
Reputation: 5281
Here is another approach using matrixStats
x <- as.matrix(data[,-3L])
y <- abs(x - .subset2(data, 3L))
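# Logical matrix indexing runs column by column, so transpose to get the matches in row order
t(x)[t(matrixStats::rowMins(y) == y)]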
# [1] 24 30 20
Or in base R using vapply:
x <- as.matrix(data[,-3L])
y <- abs(x - .subset2(data, 3L))
vapply(seq_len(nrow(data)),
       function(k) x[k, ][which.min(y[k, ])],
       numeric(1))
# [1] 24 30 20
Upvotes: 3