Elsybelsy

Reputation: 71

How to detect duplicates of single values across all rows and columns in an R data.frame

I have a large dataset in which each column has a header followed by a series of values. I want to detect whether any value is duplicated anywhere in the dataset, and how many times it occurs.

1     2     3     4     5     6     7
734  456   346   545   874   734   455
734  783   482   545   456   948   483

So, for example, it would report 734 three times, 456 twice, and so on.

I've tried using the duplicated() function in R, but it seems to work only on rows as a whole or columns as a whole. Using

duplicated(df)

doesn't pick up any duplicates, though I know there are two duplicates in the first row.

So I'm asking how to detect duplicates both within and between columns/rows.

Cheers

Upvotes: 0

Views: 57

Answers (2)

Carles

Reputation: 2829

You can flatten the data frame into a vector with unlist() and then count each value with table(), as follows:

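library(data.table)  # provides fread()
library(dplyr)       # provides the %>% pipe

# Read the two example rows; fread() treats a string containing a newline as the data itself
df <- fread("734  456   346   545   874   734   455
734  783   482   545   456   948   483")

# Flatten all columns into one vector, then count each distinct value
df %>% unlist() %>% table()
# 346 455 456 482 483 545 734 783 874 948
#   1   1   2   1   1   2   3   1   1   1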
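
If only the repeated values are of interest, the table can be filtered afterwards. The following is a minimal base R sketch, not part of the answer above; the column names V1 to V7 and the object names counts and dups are illustrative assumptions, and the data are the two rows from the question.

```r
# Example rows from the question; V1..V7 are placeholder column names
df <- data.frame(
  V1 = c(734, 734), V2 = c(456, 783), V3 = c(346, 482), V4 = c(545, 545),
  V5 = c(874, 456), V6 = c(734, 948), V7 = c(455, 483)
)

counts <- table(unlist(df))  # frequency of every distinct value
dups <- counts[counts > 1]   # keep only values seen more than once
print(dups)

# Row/column positions of one repeated value, here 734
pos <- which(as.matrix(df) == 734, arr.ind = TRUE)
print(pos)
```

Here dups should hold 456, 545 and 734, and pos lists the cells where 734 occurs, which covers duplicates both within and between columns.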

Upvotes: 1

ThomasIsCoding

Reputation: 101044

You can combine table() and data.frame() to list how often each value occurs:

data.frame(table(v))

which gives

     v Freq
1    1    1
2    2    1
3    3    1
4    4    1
5    5    1
6    6    1
7    7    1
8  346    1
9  455    1
10 456    2
11 482    1
12 483    1
13 545    2
14 734    3
15 783    1
16 874    1
17 948    1

DATA

v <- c(1, 2, 3, 4, 5, 6, 7, 734, 456, 346, 545, 874, 734, 455, 734, 
783, 482, 545, 456, 948, 483)
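
To narrow this down to the repeated values only, the frequency table can be subset on Freq, or duplicated() can be applied to the flat vector rather than to the data frame. This is a rough sketch using the same v; the names freq, repeated and dup_values are illustrative and not from the answer.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 734, 456, 346, 545, 874, 734, 455, 734,
       783, 482, 545, 456, 948, 483)

# Frequency table as a data frame, keeping only rows with more than one occurrence
freq <- data.frame(table(v))
repeated <- freq[freq$Freq > 1, ]
print(repeated)

# duplicated() flags the second and later occurrences of each value in the vector
dup_values <- unique(v[duplicated(v)])
print(dup_values)
```

The first approach keeps the counts; the second only reports which values repeat, in order of their first repeat.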

Upvotes: 2
