user3412205

Reputation: 1445

Remove duplicates from list elements

I am trying to remove rows that have duplicate entries, as defined by two columns, from multiple dataframes located in a single list.

Sample data:

aa <- data.frame(a=rnorm(100),b=rnorm(100),x=rnorm(100),y=rnorm(100),Z=rep(1:4, each=25))
split.aa<-split(aa, aa$Z)

For each df in the list 'split.aa' I am trying to remove rows with duplicated x,y pairs.

I could do this one df at a time with:

split.aa[[z]][!duplicated(split.aa[[z]][,c('x','y')]),]

where z is the name of each df within 'split.aa'.

How would I write this into lapply so that the action is performed on each element?

I am having a hard time wrapping my head around how to refer to the specific list elements within the lapply function.

Upvotes: 2

Views: 3092

Answers (2)

infominer

Reputation: 2001

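Just define an anonymous function inside lapply. It is called once per list element, and its argument x is the current data frame (not the column 'x'):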

lapply(split.aa, function(x) x[!duplicated(x[c("x", "y")]), ])
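
The same idea can be sketched with a named function, which may make it easier to see how lapply refers to each list element. The helper name drop_dup_xy and the result name dedup are hypothetical, chosen only for illustration; the data is the sample data from the question.

```r
# Hypothetical helper: 'df' is whichever data frame lapply passes in
drop_dup_xy <- function(df) {
  # keep only the first occurrence of each (x, y) pair
  df[!duplicated(df[c("x", "y")]), ]
}

# Sample data from the question
aa <- data.frame(a = rnorm(100), b = rnorm(100), x = rnorm(100),
                 y = rnorm(100), Z = rep(1:4, each = 25))
split.aa <- split(aa, aa$Z)

# lapply calls drop_dup_xy once per element and returns a list
dedup <- lapply(split.aa, drop_dup_xy)
names(dedup)  # the element names of split.aa are preserved
```

Since lapply hands each element to the function directly, there is no need to index the list by name or position inside the function.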

Upvotes: 1

Sven Hohenstein

Reputation: 81683

lapply(split.aa, function(x) x[!duplicated(x[c("x", "y")]), ])

will do the trick.
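
Because rnorm will almost never produce repeated x,y pairs, the effect is hard to see on the sample data. Here is a small check on made-up data (the names toy, split.toy and dedup.toy are assumptions for illustration only) where each group contains one repeated pair:

```r
# Made-up data: each Z group has three rows, one of which repeats an (x, y) pair
toy <- data.frame(
  x = c(1, 1, 2, 2, 3, 3),
  y = c(5, 5, 6, 7, 8, 8),
  Z = c(1, 1, 1, 2, 2, 2)
)
split.toy <- split(toy, toy$Z)

# Same call as above, applied to the toy list
dedup.toy <- lapply(split.toy, function(d) d[!duplicated(d[c("x", "y")]), ])

# Each group keeps 2 of its 3 rows
sapply(dedup.toy, nrow)
```

Note that duplicated() is evaluated within each data frame separately, so a pair that appears in two different groups is kept in both.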

Upvotes: 3
