Reputation: 919
I have one data.frame (Data) and a subset of this data.frame (Data2):
set.seed(1)
Data <- data.frame(id = seq(1, 10),
Diag1 = sample(c("A123", "B123", "C123"), 10, replace = TRUE),
Diag2 = sample(c("D123", "E123", "F123"), 10, replace = TRUE),
Diag3 = sample(c("G123", "H123", "I123"), 10, replace = TRUE),
Diag4 = sample(c("A123", "B123", "C123"), 10, replace = TRUE),
Diag5 = sample(c("J123", "K123", "L123"), 10, replace = TRUE),
Diag6 = sample(c("M123", "N123", "O123"), 10, replace = TRUE),
Diag7 = sample(c("P123", "Q123", "R123"), 10, replace = TRUE))
Data2 <- Data[1:4,]
How do I get the "difference" of both data.frames? I am looking for the rows which are in Data but not in Data2.
I thought something like Data[!Data2] would work, but it didn't.
Thank you!
Upvotes: 1
Views: 139
Reputation: 3622
This will solve your exact problem here, where Data2 is a subset of Data and every row is unique. It uses the count function from plyr and can probably be generalized.
library(plyr)
df <- as.data.frame(rbind(Data, Data2)) # rbind data sets
df <- count(df, vars = names(df)) # count frequency of rows
subset(df, freq < 2) # subset the data.frame when freq < 2
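The same "stack both data sets and keep the rows that occur only once" idea can also be sketched in base R with duplicated(), without plyr. This is only a sketch under the same assumptions as above (Data2 is a subset of Data, and rows are unique within each data set); the column set is trimmed for brevity, and the names combined, has_twin and only_once are illustrative.

```r
# Rebuild a trimmed version of the example data from the question
set.seed(1)
Data <- data.frame(id = seq(1, 10),
                   Diag1 = sample(c("A123", "B123", "C123"), 10, replace = TRUE),
                   Diag2 = sample(c("D123", "E123", "F123"), 10, replace = TRUE))
Data2 <- Data[1:4, ]

# Stack both data sets
combined <- rbind(Data, Data2)

# Flag every row that has an identical copy anywhere in the stacked data:
# duplicated() marks later copies, fromLast = TRUE marks earlier ones
has_twin <- duplicated(combined) | duplicated(combined, fromLast = TRUE)

# Rows that occur exactly once are the ones missing from Data2
only_once <- combined[!has_twin, ]
```

Unlike the count approach, this keeps the original columns and adds no freq column.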
Upvotes: 1
Reputation: 55340
data.table keys are your (best!) friend
library(data.table)
Data <- as.data.table(Data)
Data2 <- as.data.table(Data2)
## set whichever cols make sense as keys
setkey(Data, Diag1, Diag2, Diag3)
## or to set all columns as key, use
# setkey(Data)
## Same key for Data2
setkey(Data2, Diag1, Diag2, Diag3)
## or
# setkeyv(Data2, key(Data)) # <~ Note: Use setkeyv for strings
Data[!Data2] # anti-join: rows of Data whose key values have no match in Data2
id Diag1 Diag2 Diag3 Diag4 Diag5 Diag6 Diag7
1: 5 A123 F123 G123 C123 K123 M123 Q123
2: 10 A123 F123 H123 B123 L123 N123 R123
3: 9 B123 E123 I123 C123 L123 N123 P123
4: 6 C123 E123 H123 C123 L123 M123 P123
5: 7 C123 F123 G123 C123 J123 M123 Q123
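Note that the output above lists five rows, although six rows of Data lie outside Data2. With only Diag1, Diag2 and Diag3 as the key, a row is dropped whenever those three values match a row of Data2, even if its id differs, which is presumably what happened to id 8. The sketch below illustrates the effect on hypothetical toy data (full, part, by_diag and by_all are made-up names), using the on argument of data.table so that no key has to be set.

```r
library(data.table)

# Hypothetical toy data: ids 1 and 3 share the same Diag1/Diag2 values
full <- data.table(id = 1:4,
                   Diag1 = c("A123", "B123", "A123", "C123"),
                   Diag2 = c("D123", "E123", "D123", "F123"))
part <- full[1:2]

# Anti-join on a subset of the columns: id 3 is dropped as well,
# because its Diag1/Diag2 values match those of id 1
by_diag <- full[!part, on = c("Diag1", "Diag2")]

# Anti-join on all columns: only exact row matches are dropped
by_all <- full[!part, on = names(full)]
```

If the goal is "rows of Data that are not in Data2" in the strict sense, joining on all columns (or on a column that uniquely identifies a row, such as id) is the safer choice.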
Upvotes: 4
Reputation: 12875
I think you're using data.table constructs on a data.frame. This should work instead:
library(data.table)
Data <- data.table(Data)
Data2 <- data.table(Data2)
setkeyv(Data, colnames(Data))
setkeyv(Data2, colnames(Data2))
Data[!Data2]
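If converting to data.table is not an option, a plain data.frame can be filtered with %in% instead. This is a minimal sketch that assumes id uniquely identifies a row (true for the example data, but an assumption in general); the column set is trimmed for brevity and only_in_Data is an illustrative name.

```r
# Rebuild a trimmed version of the example data from the question
set.seed(1)
Data <- data.frame(id = seq(1, 10),
                   Diag1 = sample(c("A123", "B123", "C123"), 10, replace = TRUE),
                   Diag2 = sample(c("D123", "E123", "F123"), 10, replace = TRUE))
Data2 <- Data[1:4, ]

# Keep the rows of Data whose id does not appear in Data2
# (assumption: id is a unique row identifier)
only_in_Data <- Data[!(Data$id %in% Data2$id), ]
```

Here the logical vector inside the brackets has one entry per row of Data, which is what data.frame row indexing expects.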
Upvotes: 5