Reputation: 625
How do I delete the first row of each group in a data frame? For example, here is some data:
m <- c("a","a","a","a","a","b","b","b","b","b")
n <- c('x','y','x','y','x','y',"x","y",'x',"y")
o <- c(1:10)
z <- data.frame(m,n,o)
I want to delete the first row for each value (a, b, and so on) in column m. My real data frame is very large, so I want to do this based on the change from a to b, and so on.
Here is what I want the data frame to look like.
m n o
1 a y 2
2 a x 3
3 a y 4
4 a x 5
5 b x 7
6 b y 8
7 b x 9
8 b y 10
Thanks.
Upvotes: 1
Views: 134
Reputation: 93813
Just use duplicated:
z[duplicated(z$m),]
# m n o
#2 a y 2
#3 a x 3
#4 a y 4
#5 a x 5
#7 b x 7
#8 b y 8
#9 b x 9
#10 b y 10
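Why does this work? duplicated() returns FALSE for the first occurrence of each value and TRUE for every later one, so indexing with it keeps everything except the first row of each group. Consider: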
duplicated("a")
#[1] FALSE
duplicated(c("a","a"))
#[1] FALSE TRUE
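One caveat: duplicated() looks at the whole column, so if a value of m comes back later in a separate run (a, then b, then a again), the first row of that second run is kept. If the goal really is "drop the row at every change", comparing each row with the one above it is one way to do it. This is a minimal base R sketch; the data frame z2 and the helper vector same_as_prev are made up for illustration and are not from the question.

```r
# Hypothetical data in which "a" appears in two separate runs
z2 <- data.frame(
  m = c("a", "a", "a", "b", "b", "a", "a"),
  o = 1:7,
  stringsAsFactors = FALSE
)

# TRUE where a row has the same m as the row directly above it.
# The first row has no predecessor, so it is always dropped.
same_as_prev <- c(FALSE, z2$m[-1] == z2$m[-nrow(z2)])

res <- z2[same_as_prev, ]
res  # keeps rows 2, 3, 5 and 7

# For comparison, duplicated() would also keep row 6 here
which(duplicated(z2$m))
```

For data like the example in the question, where each value of m forms a single contiguous block, both approaches give the same rows.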
Upvotes: 6
Reputation: 19544
Using group_by and row_number from package dplyr:
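library(dplyr)
z %>%
  group_by(m) %>%
  # row_number() with no argument is the position within the group;
  # row_number(o) would rank by o instead
  filter(row_number() != 1)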
Upvotes: 1
Reputation: 12703
data.table is often faster for large datasets in R. setDT converts the data frame z to a data.table by reference. Then group by m and remove the first row of each group.
library('data.table')
setDT(z)[, .SD[-1], by = "m"]
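If subsetting .SD per group turns out to be slow on a very large table, a commonly used alternative is to collect the row numbers to keep with .I and subset once. The sketch below rebuilds the question's data so it runs on its own; keep and res are names chosen here for illustration, and whether this is actually faster for your data is worth benchmarking.

```r
library(data.table)

# Same example data as in the question
z <- data.frame(
  m = rep(c("a", "b"), each = 5),
  n = c("x", "y", "x", "y", "x", "y", "x", "y", "x", "y"),
  o = 1:10
)
setDT(z)  # convert to data.table by reference

# .I holds the row numbers of the current group within z;
# .I[-1] drops the first one, and the result lands in column V1
keep <- z[, .I[-1], by = m]$V1

res <- z[keep]
res
```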
Upvotes: 4