quartzl

Reputation: 57

Form a destination-arrival pair from location

I am working with a transportation dataset that records the departure and arrival locations of each trip. From that, I can just use paste(arrival_to, departure_from, sep = "-") to create a route, A-B for example. However, I also want to group round trips as one route. For example, both "from A to B" and "from B to A" should give A-B.

The dataset looks like this:

df <- data.frame(id = c(1, 1, 1, 2, 3),
                 departure_from = c("A", "B", "A", "B", "C"),
                 arrival_to = c("A", "A", "B", "A", "A"))

  id departure_from arrival_to
1  1              A          A
2  1              B          A
3  1              A          B
4  2              B          A
5  3              C          A

What I want is this:

  id departure_from arrival_to route
1  1              A          A   A-A
2  1              B          A   A-B
3  1              A          B   A-B
4  2              B          A   A-B
5  3              C          A   A-C

What I am doing for now is to "exploit" a fact about my dataset: the two directions of a route, "A-B" and "B-A", usually have similar observation counts. So I wrote a lengthy summarize and arrange on the counts and used lag to match each row with the previous one. This is error-prone, so I am looking for a more robust, code-based solution.

Thank you!

Upvotes: 0

Views: 53

Answers (2)

ThomasIsCoding

Reputation: 101247

You can try pmin and pmax, as below:

transform(
  df,
  route = paste0(pmin(departure_from,arrival_to),"-",pmax(departure_from,arrival_to))
)

which gives

  id departure_from arrival_to route
1  1              A          A   A-A
2  1              B          A   A-B
3  1              A          B   A-B
4  2              B          A   A-B
5  3              C          A   A-C
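
As a side note, here is a minimal sketch of the same pmin/pmax idea wrapped in a small function. The name make_route is hypothetical (it is not part of the answer above), and the as.character() calls are an added precaution: data.frame() created factor columns by default before R 4.0.0, and element-wise comparison of factors may not behave like string comparison.

```r
# Hypothetical helper (name is an assumption, not from the answer above):
# builds a direction-independent route label from two location vectors.
make_route <- function(from, to) {
  from <- as.character(from)  # precaution in case the columns are factors
  to   <- as.character(to)
  # pmin/pmax pick the element-wise smaller/larger string,
  # so "B","A" and "A","B" both yield "A-B"
  paste(pmin(from, to), pmax(from, to), sep = "-")
}

df <- data.frame(id = c(1, 1, 1, 2, 3),
                 departure_from = c("A", "B", "A", "B", "C"),
                 arrival_to = c("A", "A", "B", "A", "A"))
df$route <- make_route(df$departure_from, df$arrival_to)
```

With the route column in place, something like table(df$route) should count trips per route regardless of direction.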

Upvotes: 2

user438383

Reputation: 6206

How about this solution, sorting the pairs beforehand?

library(dplyr)
df %>%
    rowwise() %>%
    mutate(route = paste(sort(c(departure_from, arrival_to)), collapse="-")) %>%
    ungroup()  # drop the row-wise grouping so later verbs act on the whole table

     id departure_from arrival_to route
  <dbl> <chr>          <chr>      <chr>
1     1 A              A          A-A  
2     1 B              A          A-B  
3     1 A              B          A-B  
4     2 B              A          A-B  
5     3 C              A          A-C  
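
If you would rather avoid the dplyr dependency, the same sort-then-paste idea can probably be written in base R with mapply. This is only a sketch under that assumption, not a replacement for the answer above; the as.character() calls are an added guard in case the columns are factors.

```r
df <- data.frame(id = c(1, 1, 1, 2, 3),
                 departure_from = c("A", "B", "A", "B", "C"),
                 arrival_to = c("A", "A", "B", "A", "A"))

# For each row, sort the two locations and join them with "-".
# USE.NAMES = FALSE keeps the result an unnamed character vector.
df$route <- mapply(
  function(from, to) paste(sort(c(from, to)), collapse = "-"),
  as.character(df$departure_from),
  as.character(df$arrival_to),
  USE.NAMES = FALSE
)
```

Like the rowwise() version, this works one row at a time, so it may be slower than a fully vectorized approach on large data.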

Upvotes: 1
