Reputation: 885
I am trying to find out how to compare the rows of my df1 with df2 and count their frequency. My df1 and df2 look like this:
var1 = c(1, 2, 3, 4, 5, 6, 7)
var2 = c(1, 1, 2, 3, 4, 5, 6)
value = c(0, 0.75, 0.51, 0.42, 0.31, 0.22, 0.11)
freq = c(1,1,1,1,1,1,1)
df1 = data.frame(var1, var2, value, freq)
var1 = c(1, 2, 3, 4, 5, 6, 7)
var2 = c(1, 2, 3, 5, 4, 6, 8)
value = c(0, 0.75, 0.42, 0.41, 0.31, 0, 0)
freq = c(1,1,1,1,1,1,1)
df2 = data.frame(var1, var2, value, freq)
So I would like a df3 containing the rows that are identical in df1 and df2. From the above example, df3 would be:
var1=c(1,5)
var2=c(1,4)
value=c(0,0.31)
freq=c(1,1)
df3=data.frame(var1, var2, value, freq)
Upvotes: 0
Views: 163
Reputation: 28705
Without the frequency part, this is just a merge with default settings (i.e. an inner join on all shared columns). To get the frequency part, you can use count after grouping by all variables, then inner_join (the dplyr equivalent of merge) and add the individual frequencies.
For the dplyr part, I added extra copies of the first row to df1 just to check that the count works as intended.
merge(df1, df2)
# var1 var2 value
# 1: 1 1 0.00
# 2: 5 4 0.31
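library(dplyr)
# Add two extra copies of row 1 to df1 so the counts are not all 1
df1 <- df1[c(1, 1, seq(nrow(df1))), ]
df1 %>%
  group_by_all() %>%
  count(name = 'n1') %>%        # occurrences of each distinct row in df1
  inner_join(
    df2 %>%
      group_by_all() %>%
      count(name = 'n2')        # occurrences of each distinct row in df2
  ) %>%                         # joins on all shared columns
  mutate(n = n1 + n2) %>%       # combined frequency
  select(-n1, -n2)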
# # A tibble: 2 x 4
# # Groups: var1, var2, value [2]
# var1 var2 value n
# <dbl> <dbl> <dbl> <int>
# 1 1 1 0 4
# 2 5 4 0.31 2
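For anyone who wants to avoid the dplyr dependency, here is a minimal base R sketch of the same idea (count per frame, inner join, add the counts). It swaps in aggregate and merge for count and inner_join. The helper count_rows and the names keys, n1 and n2 are my own, and I am assuming the freq column is left out of the inputs, which is what the grouping shown in the output above suggests. It uses the unmodified df1 from the question.

```r
# Example data from the question, without the freq column (assumption)
df1 <- data.frame(
  var1  = c(1, 2, 3, 4, 5, 6, 7),
  var2  = c(1, 1, 2, 3, 4, 5, 6),
  value = c(0, 0.75, 0.51, 0.42, 0.31, 0.22, 0.11)
)
df2 <- data.frame(
  var1  = c(1, 2, 3, 4, 5, 6, 7),
  var2  = c(1, 2, 3, 5, 4, 6, 8),
  value = c(0, 0.75, 0.42, 0.41, 0.31, 0, 0)
)

keys <- c("var1", "var2", "value")

# Hypothetical helper: count how often each distinct row occurs in df
count_rows <- function(df, name) {
  out <- aggregate(rep(1L, nrow(df)), by = df[keys], FUN = sum)
  names(out)[names(out) == "x"] <- name
  out
}

n1 <- count_rows(df1, "n1")
n2 <- count_rows(df2, "n2")

# Inner join on all key columns, then add the per-frame counts
df3 <- merge(n1, n2, by = keys)
df3$freq <- df3$n1 + df3$n2
df3 <- df3[c(keys, "freq")]
df3
```

Here freq is the combined count across both data frames, following the n1 + n2 step above. If you only want the count from one frame, keep n1 or n2 instead.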
Upvotes: 1
Reputation: 122
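Like this?
library(dplyr)
# Keep rows where every column matches the row at the same position in df2
df3 = df1[apply(df1 == df2, 1, all), ]
# Drop the old freq column before grouping, then count each distinct row
df3 %>% select(-freq) %>% group_by_all() %>% summarise(freq = n())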
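One caveat with this approach: df1 == df2 compares the two data frames position by position, so it needs equal dimensions and only finds matches that sit in the same row of both. A small sketch with made-up data (a and b are hypothetical, not from the question) shows how that differs from a value-based merge:

```r
# Two data frames holding the same rows in a different order
a <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))
b <- a[c(2, 1, 3), ]
rownames(b) <- NULL

# Position-wise comparison: only the third row lines up
positional <- a[apply(a == b, 1, all), ]

# Value-based inner join: finds all three shared rows
by_value <- merge(a, b)

nrow(positional)
nrow(by_value)
```

For the data in the question the two give the same rows, because the matching rows happen to be at the same positions in df1 and df2.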
Upvotes: 0