Reputation: 519
Below is a data frame of four columns, the requirement is that for all common "a1" and "a2" column value pairs, I want to study the corresponding ascending order of the a3 columns and set the corresponding column a4 values in ascending order. E.g. See the value "A" and "M" in "a1" and "a2", if you check the order of corresponding a3 values for "A" and "M" i.e.(10,32,13), I want the corresponding a4 values to be arranged in the order (5,55,23) Similarly for all similar pairs. Thanks and please help.
a1 = c("A", "B", "C", "A", "B", "B", "A", "C", "A", "C", "B")
a2 = c("M", "O", "N" , "N", "O", "N", "M", "M", "M", "N", "O")
a3 = c(10, 34, 35 , 56 , 44 , 37 , 32 , 15 , 13 , 12, 37)
a4 = c(55, 6, 3, 213, 478, 233, 23, 14, 5, 7, 74)
a123 = data.frame(a1, a2, a3, a4)
Upvotes: 1
Views: 2829
Reputation: 16121
Here's a dplyr
solution using the rank/order of a3
values within each a1
,a2
combination:
a1 = c("A", "B", "C", "A", "B", "B", "A", "C", "A", "C", "B")
a2 = c("M", "O", "N" , "N", "O", "N", "M", "M", "M", "N", "O")
a3 = c(10, 34, 35 , 56 , 44 , 37 , 32 , 15 , 13 , 12, 37)
a4 = c(55, 6, 3, 213, 478, 233, 23, 14, 5, 7, 74)
# datasets
a123 = data.frame(a1, a2, a3)
a124 = data.frame(a1, a2, a4)
library(dplyr)
# add the rank to each dataset
# based on the order of the values in a3 and a4
a123 = a123 %>%
group_by(a1,a2) %>%
mutate(r = rank(a3)) %>%
ungroup()
a124 = a124 %>%
group_by(a1,a2) %>%
mutate(r = rank(a4)) %>%
ungroup()
# join datasets based on a1, a2 combinations and the rank
inner_join(a123, a124, by=c("a1","a2","r"))
# # A tibble: 11 x 5
# a1 a2 a3 r a4
# <fctr> <fctr> <dbl> <dbl> <dbl>
# 1 A M 10.0 1.00 5.00
# 2 B O 34.0 1.00 6.00
# 3 C N 35.0 2.00 7.00
# 4 A N 56.0 1.00 213
# 5 B O 44.0 3.00 478
# 6 B N 37.0 1.00 233
# 7 A M 32.0 3.00 55.0
# 8 C M 15.0 1.00 14.0
# 9 A M 13.0 2.00 23.0
# 10 C N 12.0 1.00 3.00
# 11 B O 37.0 2.00 74.0
You can remove column r
if you want. I left it there so you can easily see how it works.
Based on the above philosophy/approach you can also use purrr
and dplyr
like this:
a1 = c("A", "B", "C", "A", "B", "B", "A", "C", "A", "C", "B")
a2 = c("M", "O", "N" , "N", "O", "N", "M", "M", "M", "N", "O")
a3 = c(10, 34, 35 , 56 , 44 , 37 , 32 , 15 , 13 , 12, 37)
a4 = c(55, 6, 3, 213, 478, 233, 23, 14, 5, 7, 74)
# datasets
a123 = data.frame(a1, a2, v=a3)
a124 = data.frame(a1, a2, v=a4)
library(dplyr)
library(purrr)
list(a123, a124) %>%
map(. %>%
group_by(a1,a2) %>%
mutate(r = rank(v)) %>%
ungroup()) %>%
reduce(inner_join, by=c("a1","a2","r"))
# # A tibble: 11 x 5
# a1 a2 v.x r v.y
# <fctr> <fctr> <dbl> <dbl> <dbl>
# 1 A M 10.0 1.00 5.00
# 2 B O 34.0 1.00 6.00
# 3 C N 35.0 2.00 7.00
# 4 A N 56.0 1.00 213
# 5 B O 44.0 3.00 478
# 6 B N 37.0 1.00 233
# 7 A M 32.0 3.00 55.0
# 8 C M 15.0 1.00 14.0
# 9 A M 13.0 2.00 23.0
# 10 C N 12.0 1.00 3.00
# 11 B O 37.0 2.00 74.0
Upvotes: 3
Reputation: 39174
This is almost the same as your previous question (Sorting a column based on the order of another column in R). The difference is that there are two grouping variables, a1
and a2
. Fortunately, the group_by
function can take more than one columns. So the solution is below. Notice that I set ties.method = "first"
. You may what to change this to other tie method if you want a different rules to break the ties.
library(dplyr)
a123_r <- a123 %>%
group_by(a1, a2) %>%
mutate(a4 = sort(a4, decreasing = TRUE)[rank(-a3, ties.method = "first")]) %>%
ungroup() %>%
as.data.frame()
a123_r
# a1 a2 a3 a4
# 1 A M 10 5
# 2 B O 34 6
# 3 C N 35 7
# 4 A N 56 213
# 5 B O 44 478
# 6 B N 37 233
# 7 A M 32 55
# 8 C M 15 14
# 9 A M 13 23
# 10 C N 12 3
# 11 B O 37 74
Upvotes: 0