user197410

Reputation: 31

anti-join not working - giving 0 rows, why?

I am trying to use anti_join(), exactly as I have done many times before, to find which rows in two datasets have no match on two specific columns. For some reason I keep getting 0 rows in the result, and I can't understand why.

Below are two dummy data frames containing the two columns I am trying to compare. One is missing an entry (df1 has no row for SITE 2, PLOT 8), so when I use anti_join() to compare the two data frames, this entry should be returned, but I just get a result with 0 rows.

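library(dplyr)

a <- 1:3
# df1: SITE 2 has only 15 plots (PLOT 8 is missing)
SITE <- rep(a, times = c(16, 15, 1))
PLOT <- c(1:16, 1:7, 9:16, 1)
df1 <- data.frame(SITE, PLOT)
# df2: SITE 1 and SITE 2 each have all 16 plots
SITE <- rep(a, times = c(16, 16, 1))
PLOT <- c(rep(1:16, 2), 1)
df2 <- data.frame(SITE, PLOT)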

df1                 df2
SITE    PLOT        SITE    PLOT
1       1           1       1
1       2           1       2
1       3           1       3
1       4           1       4
1       5           1       5
1       6           1       6
1       7           1       7
1       8           1       8
1       9           1       9
1       10          1       10
1       11          1       11
1       12          1       12
1       13          1       13
1       14          1       14
1       15          1       15
1       16          1       16
2       1           2       1
2       2           2       2
2       3           2       3
2       4           2       4
2       5           2       5
2       6           2       6
2       7           2       7
2       9           2       8
2       10          2       9
2       11          2       10
2       12          2       11
2       13          2       12
2       14          2       13
2       15          2       14
2       16          2       15
3       1           2       16
                    3       1


a <- anti_join(df1, df2, by=c('SITE', 'PLOT'))
a

<0 rows> (or 0-length row.names)

I'm sure the answer is obvious but I can't see it.

Upvotes: 1

Views: 1020

Answers (1)

user10917479


The answer can be found in the help file.

anti_join() return all rows from x without a match in y.

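Every row of df1 has a match in df2, so anti_join(df1, df2) correctly returns 0 rows. The unmatched row (SITE 2, PLOT 8) exists only in df2, so reversing the inputs will give you what you expect.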

anti_join(df2, df1, by=c('SITE', 'PLOT'))

#   SITE PLOT
# 1    2    8
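
If it is not known in advance which data frame holds the extra rows, one option is to run the anti-join in both directions and stack the results. The following is only a minimal sketch, assuming dplyr is installed and reusing the dummy data from the question; the names only_in_df1, only_in_df2, unmatched and the "source" label column are illustrative choices, not part of the original code.

```r
library(dplyr)

# Dummy data from the question: df1 lacks the row SITE 2, PLOT 8.
df1 <- data.frame(SITE = rep(1:3, times = c(16, 15, 1)),
                  PLOT = c(1:16, 1:7, 9:16, 1))
df2 <- data.frame(SITE = rep(1:3, times = c(16, 16, 1)),
                  PLOT = c(rep(1:16, 2), 1))

# Rows of df1 with no match in df2: none, because every df1 row is also in df2.
only_in_df1 <- anti_join(df1, df2, by = c("SITE", "PLOT"))

# Rows of df2 with no match in df1: the plot that df1 is missing.
only_in_df2 <- anti_join(df2, df1, by = c("SITE", "PLOT"))

# Stack both directions and record which data frame each unmatched row came from.
unmatched <- bind_rows(list(df1 = only_in_df1, df2 = only_in_df2),
                       .id = "source")
unmatched
```

With this data, unmatched should contain a single row that comes from df2 (SITE 2, PLOT 8), which makes the direction of the mismatch explicit.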

Upvotes: 3
