Reputation: 1445
I am trying to remove rows that have duplicate entries, as defined by two columns, from multiple dataframes located in a single list.
Sample data:
aa <- data.frame(a=rnorm(100),b=rnorm(100),x=rnorm(100),y=rnorm(100),Z=rep(1:4, each=25))
split.aa<-split(aa, aa$Z)
For each df in the list 'split.aa' I am trying to remove rows with duplicated x,y pairs.
I could do this one df at a time with:
split.aa[[z]][!duplicated(split.aa[[z]][,c('x','y')]),]
where z is the name of each df within 'split.aa'.
How would I write this into lapply so that the action is performed on each element?
I am having a hard time wrapping my head around how to refer to the specific list elements within the lapply function.
Upvotes: 2
Views: 3092
Reputation: 2001
Just define an anonymous function inside lapply. Each data frame in the list is passed to it in turn as the argument x:
lapply(split.aa, function(x) x[!duplicated(x[c("x", "y")]), ])
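To check that this behaves as expected, here is a small sketch on made-up data. The rnorm() sample in the question will almost never contain duplicated x,y pairs, so this example plants some deliberately. The values and the names `df` and `dedup` are only illustrative and are not from the question.

```r
# Illustrative data with deliberate duplicate (x, y) pairs in each group
aa <- data.frame(
  a = 1:8,
  x = c(1, 1, 2, 3, 5, 5, 5, 6),
  y = c(1, 1, 2, 4, 7, 7, 8, 6),
  Z = rep(1:2, each = 4)
)
split.aa <- split(aa, aa$Z)

# lapply hands each data frame in the list to the function as `df`;
# duplicated() flags the second and later occurrences of an (x, y) pair
dedup <- lapply(split.aa, function(df) df[!duplicated(df[c("x", "y")]), ])

# Row counts per list element after removing duplicates
sapply(dedup, nrow)
```

Each group starts with 4 rows and one repeated pair, so each element should come back with 3 rows. The list names ("1" and "2") are carried through by lapply.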
Upvotes: 1
Reputation: 81683
lapply(split.aa, function(x) x[!duplicated(x[c("x", "y")]), ])
will do the trick.
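If you would rather refer to the elements by name, as in the `[[z]]` form from the question, one possible variant is to iterate over names(split.aa) instead of the list itself. This is only a sketch on made-up data; `by_name` and `df` are illustrative names. Note that lapply over a character vector does not name its result, so the names are restored by hand.

```r
# Illustrative data: both rows within each group share the same (x, y) pair
aa <- data.frame(x = c(1, 1, 2, 2), y = c(1, 1, 3, 3), Z = c(1, 1, 2, 2))
split.aa <- split(aa, aa$Z)

# Iterate over the element names and index the list with each name
by_name <- lapply(names(split.aa), function(z) {
  df <- split.aa[[z]]
  df[!duplicated(df[, c("x", "y")]), ]
})

# Restore the list names, which lapply does not set for a character input
names(by_name) <- names(split.aa)
```

Passing the list directly, as above, is shorter and keeps the names automatically, so the name-based form is mainly useful when the function also needs to know which element it is working on.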
Upvotes: 3