Reputation: 1867
How can I calculate a matrix of distances between observations, computed separately within each treatment group?
Example of data:
set.seed(1212)
df <- data.frame(yta = c(rep("A", 3), rep("B", 3)), x = rnorm(6, 10, 2), y = rnorm(6, 40, 12))
The output I want to obtain is something like this:
          1         2         3        4         5         6
1            4.234690 25.858459        0         0         0
2  4.234690           23.503327        0         0         0
3 25.858459 23.503327                  0         0         0
4         0         0         0           9.330203  9.277692
5         0         0         0 9.330203           18.371015
6         0         0         0 9.277692 18.371015
Upvotes: 3
Views: 615
Reputation: 2584
Using split, lapply and bind_rows, we can obtain something like this:
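library(dplyr)  # provides %>%, select(), bind_rows() and mutate()
# Split by treatment, compute distances within each group, then stack the results
df %>% split(., .$yta) %>%
  lapply(function(df_part) df_part %>% select(-yta) %>% as.matrix %>%
    dist(upper = TRUE) %>% as.matrix %>% as.data.frame) %>%
  bind_rows %>% mutate(yta = df$yta)  # assumes df is already ordered by yta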
# Source: local data frame [6 x 7]
#
#           1        2         3        4         5         6 yta
# 1  0.000000 28.43909  4.350409       NA        NA        NA   A
# 2 28.439090  0.00000 32.038712       NA        NA        NA   A
# 3  4.350409 32.03871  0.000000       NA        NA        NA   A
# 4        NA       NA        NA  0.00000 20.267301 29.106135   B
# 5        NA       NA        NA 20.26730  0.000000  9.116934   B
# 6        NA       NA        NA 29.10614  9.116934  0.000000   B
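If you would rather have 0 than NA in the cross-group cells, as in the question's desired output, here is one possible base R sketch. It is a different technique from the pipeline above: it computes the full distance matrix once and then masks every pair of rows that do not share a treatment. The names d and same_group are just illustrative, and it assumes Euclidean distance on the x and y columns only.

```r
set.seed(1212)
df <- data.frame(yta = c(rep("A", 3), rep("B", 3)),
                 x = rnorm(6, 10, 2), y = rnorm(6, 40, 12))

# Full Euclidean distance matrix over all observations (x and y only)
d <- as.matrix(dist(df[, c("x", "y")]))

# same_group[i, j] is TRUE when rows i and j belong to the same treatment
grp <- as.character(df$yta)
same_group <- outer(grp, grp, "==")

# Set every distance between observations from different groups to 0
d[!same_group] <- 0
d
```

Because the mask is built from yta itself, this should not depend on the rows being sorted by group. The diagonal stays 0, since that is the distance of each observation to itself.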
Upvotes: 2