Florent Rumiano
Florent Rumiano

Reputation: 43

How to apply several functions on every possible row combinations within a dataframe in R?

I've got a dataframe with coordinates (lon, lat)

    lon <- list(505997.627175236, 505997.627175236, 505997.627175236, 505997.627175236)   
    lon <- do.call(rbind.data.frame, lon)

    lat <- list(7941821.025438220, 7941821.025438220, 7941821.025438220, 7941821.025438220)
    lat <- do.call(rbind.data.frame, lat)

    coord <- cbind(lon, lat)
    colnames(coord) <- c("lon", "lat")

I'm trying to calculate the euclidian distance and the angle between all the possible row combinations within the dataframe.

     lon   lat       apply function on every possible combinations such as v1-v2, v1-v3, v1-v4,
v1   x1    y1        v2-v3 and so on...
v2   x2    y2         
v3   x3    y3        here are the two functions applied beetween v1 and v2 :
v4   x4    y4        **euclidian distance**    sqrt((x1-x2)^2 + (y1-y2)^2)
                     **angle**                 atan2((y1-y2),(x1-x2))*(180/pi)

How to apply several functions on every possible row combinations and get the results in respective lists ? My goal is to use these calculations at every iteration whatever the number of row in input.

Thanks in advance for your answers and sorry if the question seems silly. I've looked at so many posts but couldn't find a solution that I could understand and replicate.

Upvotes: 4

Views: 127

Answers (4)

Rui Barradas
Rui Barradas

Reputation: 76402

Base R function combn generates the combinations of a vector's elements taken m at a time and it can, optionally, apply a function FUN to those combinations. Since the input data is a "data.frame", I will combine the rownames 2 by 2.

euclidean <- function(k){
  f <- function(x, y) sqrt((x[1] - y[1])^2 + (x[2] - y[2])^2)
  x <- unlist(coord[k[1], 1:2])
  y <- unlist(coord[k[2], 1:2])
  f(x, y)
}

angle <- function(k){ 
  f <- function(x, y) atan2(x[2] - y[2], x[1] - y[1])*(180/pi)
  x <- unlist(coord[k[1], 1:2])
  y <- unlist(coord[k[2], 1:2])
  f(x, y)
}

combn(rownames(coord), 2, euclidean)
#[1]   4019.95 800062.50  20012.25 804067.26  24001.87 780073.39

combn(rownames(coord), 2, angle)
#[1] -84.28941  90.71616  87.99547  90.74110  89.28384 -89.21407

Data.

This is the data in the OP's answer but without the id column.

lon <- c(505997.627175236, 505597.627175236,
         515997.627175236, 505297.627175236)   
lat <- c(7941821.025438220, 7945821.025438220,
         7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)

Upvotes: 3

Florent Rumiano
Florent Rumiano

Reputation: 43

In the end, I've adapted the code that Georgery provided but I used "combn" instead of "expand.grid" in order to avoid repetition among the row conbinations when applying the functions to the final dataframe. I also had to use the function "convert" from the package "hablar" in order to properly convert the factors of my dataframe "coord_combn" to numeric values.

Here's the code :

lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)   
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)

# dataframe creation + adding of an id column
coord <- data.frame(lon, lat) %>% 
                 mutate(id = 1:nrow(.))

coord_combn <- combn(rownames(coord), 2) # all the possible row combinations
coord_combn <- as.data.frame(t(coord_combn)) # transpose columns into rows
coord_combn <- coord_combn %>% 
                 convert(num(V1, V2)) # factor to numeric

#join our dataframe first on one index then on the other
coord_final <- coord_combn %>%
  left_join(coord, by = c("V1" = "id")) %>%
  left_join(coord, by = c("V2" = "id"))

eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)
eAngle <- function(x1, x2, y1, y2) atan2((y1-y2),(x1-x2))*(180/3.14159265359)

# euclidean distance calculation
coord_final <- coord_final %>% 
                 mutate(distance = eDistance(lon.x, lon.y, lat.x, lat.y)) 
# angle calculation
coord_final <- coord_final %>% 
                 mutate(angle = eAngle(lon.x, lon.y, lat.x, lat.y)) 

Thank you everyone, you've been a great help.

Upvotes: 0

chinsoon12
chinsoon12

Reputation: 25225

For fast Euclidean calculations, you can look at this

For the other function, you can do something like

atan2(outer(coord$lat, coord$lat, `-`), outer(coord$lon, coord$lon, `-`))*180/pi

Upvotes: 0

Georgery
Georgery

Reputation: 8117

# two vectors (I changed them a little bit)
lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)   
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)

# a function for the euclidean distance
eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)

# now we create a dataframe...
df <- data.frame(lon, lat) %>%
    mutate(joinIndex = 1:nrow(.)) # and we add an index column

# ...that looks like this
#        lon     lat joinIndex
# 1 505997.6 7941821         1
# 2 505597.6 7945821         2
# 3 515997.6 7141821         3
# 4 505297.6 7921821         4

# create all combinations of the join indeces
df_combinations <- expand.grid(1:nrow(df), 1:nrow(df))

#    Var1 Var2
# 1     1    1
# 2     2    1
# 3     3    1
# 4     4    1
# 5     1    2
# 6     2    2
# 7     3    2
# 8     4    2
# 9     1    3
# 10    2    3
# 11    3    3
# 12    4    3
# 13    1    4
# 14    2    4
# 15    3    4
# 16    4    4

# and join our dataframe first on one index then on the other
df_final <- df_combinations %>%
    left_join(df, by = c("Var1" = "joinIndex")) %>%
    left_join(df, by = c("Var2" = "joinIndex"))

# and then finally calculate the euclidean distance
df_final %>%
    mutate(distance = eDistance(lon.x, lon.y, lat.x, lat.y))

   Var1 Var2    lon.x   lat.x    lon.y   lat.y  distance
1     1    1 505997.6 7941821 505997.6 7941821      0.00
2     2    1 505597.6 7945821 505997.6 7941821   4019.95
3     3    1 515997.6 7141821 505997.6 7941821 800062.50
4     4    1 505297.6 7921821 505997.6 7941821  20012.25
5     1    2 505997.6 7941821 505597.6 7945821   4019.95
6     2    2 505597.6 7945821 505597.6 7945821      0.00
7     3    2 515997.6 7141821 505597.6 7945821 804067.26
8     4    2 505297.6 7921821 505597.6 7945821  24001.87
9     1    3 505997.6 7941821 515997.6 7141821 800062.50
10    2    3 505597.6 7945821 515997.6 7141821 804067.26
11    3    3 515997.6 7141821 515997.6 7141821      0.00
12    4    3 505297.6 7921821 515997.6 7141821 780073.39
13    1    4 505997.6 7941821 505297.6 7921821  20012.25
14    2    4 505597.6 7945821 505297.6 7921821  24001.87
15    3    4 515997.6 7141821 505297.6 7921821 780073.39
16    4    4 505297.6 7921821 505297.6 7921821      0.00

Upvotes: 2

Related Questions