Don Hessey
Don Hessey

Reputation: 67

Delete rows if 2 values exists in another data frame

I've got 2 data frames that contain the same columns. One is called newGames, the other is called existingGames. What I would like to do is check comapare the newGames game_pk and sv_id columns between the two data frames. If the game_pk and sv_id exist in both table I would like to remove that row from the newGames data frame.

        N.game_pk   N.sv_id           E.game_pk   E.sv_id
 1      440697      123456_789012     NA          NA
 2      440698      123456_789012     440698      NA
 3      440699      123456_789012     440699      123456_789012

The e. stands for the existingGames data frame and N. stands for the newGames data frame. What I would like to have in the end is a newGames data frame that shows all the columns where e.game_pk AND e.sv_id (WHOLE GAME) or e.sv_id (single pitch from game) do not exist in the existingGames data frame. Sv_id is a time stamp so it is possible to have the same value for different game_pk's, so "combining" the fields makes the row unique.

this works for missing games (440697)

 newGames2 <- newGames[!(newGames$game_pk %in% existingGames$game_pk),]

but I tried

 newGames2 <- newGames[!(newGames$game_pk %in% existingGames$game_pk) | (!(newGames$game_pk %in% existingGames$game_pk) & !(newGames$sv_id %in% existingGames$sv_id)),]

and it just selects the full data frame instead of the single missing pitch.

Upvotes: 0

Views: 87

Answers (1)

Pierre Lapointe
Pierre Lapointe

Reputation: 16277

Here's how to do that with anti_join from dplyr:

new_game <- read.table(text="game_pk   sv_id
440697      123456_789012
440698      123456_789012
440699      123456_789012",header=TRUE,stringsAsFactors=FALSE)

existing_game <- read.table(text="game_pk   sv_id
NA          NA
440698      NA
440699      123456_789012",header=TRUE,stringsAsFactors=FALSE)

library(dplyr)
anti_join(new_game,existing_game)

  game_pk         sv_id
1  440698 123456_789012
2  440697 123456_789012

Upvotes: 2

Related Questions