Reputation: 3266
I have a data frame with a large number of observations and no ID column, but I believe that three of its columns together uniquely identify each observation/row (in database terminology, these columns would form a superkey). How can I check this?
I know that for a single column I could use a function such as duplicated and look at the frequencies, but how can I handle multiple columns and find rows that are duplicated across the combination of those columns?
Thanks in advance!
Upvotes: 1
Views: 93
Reputation: 13135
Or you can use distinct from dplyr:
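library(dplyr)
# The columns form a superkey only if this comparison is TRUE
# (for the example data it is FALSE: 2 distinct rows vs. 3 rows):
# nrow(distinct(df, x, y, z)) == nrow(df)
distinct(df, x, y, z)
#   x y z
# 1 1 1 1
# 2 2 4 5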
Data:
df <- data.frame(x=c(1,2,1),y=c(1,4,1), z=c(1,5,1))
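If you would rather stay in base R, something along these lines should also work. It is only a sketch: the names key_cols, is_superkey and dup_rows are made up for illustration, and it assumes the three candidate columns are called x, y and z as in the example data. It relies on the data frame methods of anyDuplicated and duplicated, which compare whole rows of the selected columns.

```r
# Example data from the answer: rows 1 and 3 share the same (x, y, z)
df <- data.frame(x = c(1, 2, 1), y = c(1, 4, 1), z = c(1, 5, 1))

# Candidate superkey columns (hypothetical name, adjust to your data)
key_cols <- c("x", "y", "z")
key <- df[key_cols]

# anyDuplicated() returns 0 when no key combination occurs more than once
is_superkey <- anyDuplicated(key) == 0
is_superkey

# Every row whose key combination is repeated, including the first occurrence
dup_rows <- df[duplicated(key) | duplicated(key, fromLast = TRUE), ]
dup_rows
```

If is_superkey comes back FALSE, dup_rows lets you inspect the offending rows directly instead of only learning that duplicates exist.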
Upvotes: 1