Find the minimum difference between dataframe column and a vector in R

Question

Suppose I have a data frame as follows,

library(dplyr)  # provides the %>% pipe used below
a = c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100) %>% data.frame()
colnames(a) = c("column1")

and a vector,

b = c( 46, 90, 75, 15)

For each value of column1 in a, I want to find the closest element of b.

Here is what I have tried.

I tried adding row names to a and b, doing a full join, computing the difference for every combination, and taking the minimum difference. However, joining on row names only matches the first four rows,

a %>% add_rownames('rowname') %>% full_join(b %>% add_rownames(rowname), by = c("rowname" = "rowname"))

This doesn't work. Can anybody help me in solving this problem?

alistaire · Accepted Answer

One option is to use outer with - to compute the difference for every combination of elements from the two vectors, producing a matrix with one row per element of a$column1 and one column per element of b. Negating the absolute value of that matrix lets you use max.col to find, for each row, the index of the element of b with the smallest absolute difference. Subsetting b by those indices returns the closest values, so

a$b <- b[max.col(-abs(outer(a$column1, b, `-`)))]

returns

a
##    column1  b
## 1       10 15
## 2       20 15
## 3       30 15
## 4       40 46
## 5       50 46
## 6       60 46
## 7       70 75
## 8       80 75
## 9       90 90
## 10     100 90
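To make the one-liner easier to follow, here is a minimal sketch that breaks it into steps, using the data from the question. The intermediate names d, idx and nearest are only illustrative. One assumption beyond the original answer: ties.method = "first" is passed to max.col, because its default is "random", which would pick arbitrarily between two equally close elements of b.

```r
# Same data as in the question, built without the pipe
a <- data.frame(column1 = seq(10, 100, by = 10))
b <- c(46, 90, 75, 15)

# 10 x 4 matrix: row i, column j holds a$column1[i] - b[j]
d <- outer(a$column1, b, `-`)

# Negating the absolute differences turns "smallest distance" into
# "largest value", which is what max.col looks for in each row.
# ties.method = "first" is an addition here to keep the result deterministic.
idx <- max.col(-abs(d), ties.method = "first")

# Index into b to get the closest value for each row of a
nearest <- b[idx]
nearest
```

With this data there are no ties, so the result matches the output above regardless of the tie-breaking method.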

You could equally work element-wise, if you prefer. In dplyr, grouping rowwise makes such an approach fairly straightforward:

library(dplyr)

a %>% rowwise() %>% mutate(b = b[which.min(abs(column1 - b))])

## Source: local data frame [10 x 2]
## Groups: <by row>
## 
## # A tibble: 10 × 2
##    column1     b
##      <dbl> <dbl>
## 1       10    15
## 2       20    15
## 3       30    15
## 4       40    46
## 5       50    46
## 6       60    46
## 7       70    75
## 8       80    75
## 9       90    90
## 10     100    90
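The same element-wise idea can also be written in base R without dplyr. This is only a sketch: nearest_in is a hypothetical helper name introduced here, not something from the original answer. It relies on which.min returning the first minimum, so ties resolve to the earlier element of b.

```r
a <- data.frame(column1 = seq(10, 100, by = 10))
b <- c(46, 90, 75, 15)

# Hypothetical helper: return the element of `candidates` closest to a single value x
nearest_in <- function(x, candidates) {
  candidates[which.min(abs(x - candidates))]
}

# Apply the helper to each value of column1; numeric(1) tells vapply
# that every call returns exactly one number
a$b <- vapply(a$column1, nearest_in, numeric(1), candidates = b)
a
```

Working element by element avoids building the full length(a$column1) by length(b) matrix that outer creates, at the cost of one function call per row.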
