R - for each observation in a column, find the closest one in another column

Question

I'm trying to filter my dataframe to keep only the rows that meet the following condition:

For each day AND each price_1, keep only the row where price_2 is closest to price_1. If two rows are at equal distance, take the mean of the two prices and of the two volatilities. For example:

 Date              price_2        price_1   Volat
 2011-07-15        215            200.0     5
 2011-07-15        217            200.0     6
 2011-07-15        235            200.0     5.5
 2011-07-15        240            200.0     5.3
 2011-07-15        200            201.5     6.2
 2011-07-16        203            205.0     6.4
 2011-07-16        207            205.0     5.1


Expected output:

 Date              price_2        price_1  Volat
 2011-07-15        215            200.0      5
 2011-07-15        200            201.5     6.2
 2011-07-16        205            205.0     5.75

I started like this, but I don't know how to continue:

group_by(Date)  %>% 
which(df,abs(df$price_1-df$price_2)==min(abs(df$price_1-df$price_2)))

Thanks a lot in advance!

tmfmnk · Accepted Answer

One dplyr option could be:

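library(dplyr)

df %>%
 group_by(Date, price_1) %>%                 # one group per day and price_1
 mutate(diff = abs(price_2 - price_1)) %>%   # distance between price_2 and price_1
 filter(diff == min(diff)) %>%               # keep the closest row(s), ties included
 summarise_at(vars(price_2, Volat), mean)    # average the tied rows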

  Date       price_1 price_2 Volat
1 2011-07-15    200      215  5   
2 2011-07-15    202.     200  6.2 
3 2011-07-16    205      205  5.75
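
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question (storing Date as a character column is an assumption, since the question does not give the column types) and uses `across()` in place of `summarise_at()`, which assumes dplyr 1.0.0 or later. The name `result` is only illustrative.

```r
library(dplyr)

# Sample data copied from the question; Date kept as character (assumption)
df <- data.frame(
  Date    = c(rep("2011-07-15", 5), rep("2011-07-16", 2)),
  price_2 = c(215, 217, 235, 240, 200, 203, 207),
  price_1 = c(200, 200, 200, 200, 201.5, 205, 205),
  Volat   = c(5, 6, 5.5, 5.3, 6.2, 6.4, 5.1)
)

result <- df %>%
  group_by(Date, price_1) %>%
  # keep the row(s) whose price_2 is nearest to price_1; tied rows all survive
  filter(abs(price_2 - price_1) == min(abs(price_2 - price_1))) %>%
  # average the surviving rows; this only changes values when there was a tie
  summarise(across(c(price_2, Volat), mean), .groups = "drop")

print(result)
```

Note that the tie check relies on exact equality of the distances. That works for the values in the question, but with prices that are not exactly representable as doubles, two distances that look equal may differ slightly, so a tolerance-based comparison such as `dplyr::near()` may be the safer choice.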
