Reputation: 1476
I have the following two data frames:
dataframe 1:
Class Total AC
A 1000 0.6
A 965 0.34
B 1025 0.9
B 1002 0.37
B 684 0.55
C 896 0.77
C 927 0.86
C 1000 0.61
C 955 0.69
dataframe 2:
Class Total Coverage
A 925 0.6
A 744 0.94
A 1000 0.38
B 581 0.68
B 488 0.25
B 698 0.66
C 1020 0.33
C 845 0.18
C 1555 0.66
What I want is to take the AC value from the first two rows of each class in dataframe 1, and the Coverage value from the first two rows of the same class in dataframe 2, and combine them as follows:
Class AC Coverage
A 0.6 0.6
A 0.34 0.94
B 0.9 0.68
B 0.37 0.25
C 0.77 0.33
C 0.86 0.18
Note that it is always guaranteed that there are at least two rows for each class in both data frames.
Do you know how I can do that?
Upvotes: 1
Views: 34
Reputation: 11150
Here's a way using dplyr:
library(dplyr)

# Number the rows within each Class, keep the first two, then pair them up by Class and row number
df1 %>%
  group_by(Class) %>%
  mutate(rn = row_number()) %>%
  ungroup() %>%
  filter(rn %in% 1:2) %>%
  inner_join(
    df2 %>%
      group_by(Class) %>%
      mutate(rn = row_number()) %>%
      ungroup() %>%
      filter(rn %in% 1:2),
    by = c("Class", "rn")
  ) %>%
  select(Class, AC, Coverage)
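To try this out end to end, here is a minimal, self-contained sketch that rebuilds the two sample data frames from the question and applies the same idea (number the rows within each Class, then join on Class and that number). The helper name first_two is made up for illustration, and slice_head() assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample data copied from the question
df1 <- data.frame(
  Class = c("A", "A", "B", "B", "B", "C", "C", "C", "C"),
  Total = c(1000, 965, 1025, 1002, 684, 896, 927, 1000, 955),
  AC    = c(0.6, 0.34, 0.9, 0.37, 0.55, 0.77, 0.86, 0.61, 0.69)
)
df2 <- data.frame(
  Class    = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Total    = c(925, 744, 1000, 581, 488, 698, 1020, 845, 1555),
  Coverage = c(0.6, 0.94, 0.38, 0.68, 0.25, 0.66, 0.33, 0.18, 0.66)
)

# Hypothetical helper: keep the first two rows per Class and
# number them within the class so they can be matched one-to-one.
first_two <- function(d) {
  d %>%
    group_by(Class) %>%
    slice_head(n = 2) %>%
    mutate(rn = row_number()) %>%
    ungroup()
}

result <- first_two(df1) %>%
  inner_join(first_two(df2), by = c("Class", "rn")) %>%
  select(Class, AC, Coverage)

print(result)
```

Joining on both Class and rn is what keeps the pairing one-to-one; joining on Class alone would return every AC/Coverage combination within a class.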
Upvotes: 2
Reputation: 7147
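Will this work?
First, number the rows within each Class in both data frames, then merge on Class and that row number so the rows pair up one-to-one (merging on Class alone would return every AC/Coverage combination within a class):
df1$rn <- ave(seq_along(df1$Class), df1$Class, FUN = seq_along)
df2$rn <- ave(seq_along(df2$Class), df2$Class, FUN = seq_along)
df <- merge(df1[c("Class", "rn", "AC")], df2[c("Class", "rn", "Coverage")], by = c("Class", "rn"))
Secondly, slice the top two results from each group within a Class and drop the helper column:
library(dplyr)
df <- df %>%
  group_by(Class) %>%
  slice(1:2) %>%
  ungroup() %>%
  select(-rn)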
Upvotes: 2