Jdan

Reputation: 149

Delete tail of data by group in R

I have a data frame similar to

df <- data.frame(group=c("a", "b"), value=1:16,trim=rep(1:2))

I am trying to figure out how I can remove the last rows of each group. The number of rows to remove from each group is defined in the "trim" variable.
I have figured out how to remove a specified number of rows from all groups using

x<-do.call("rbind", lapply(split(df, df$group), head,-2))

However, I can't seem to figure out how to remove the number of rows specified in the "trim" column from each group. In other words, I would like group a to have its last row trimmed and group b its last 2 rows trimmed.

Upvotes: 3

Views: 1314

Answers (3)

eipi10

Reputation: 93861

Using dplyr:

library(dplyr)

df %>% group_by(group) %>% slice(1:(n() - trim[1]))  # Per @42-, this is faster than unique(trim)
    group value  trim
1       a     1     1
2       a     3     1
3       a     5     1
4       a     7     1
5       a     9     1
6       a    11     1
7       a    13     1
8       b     2     2
9       b     4     2
10      b     6     2
11      b     8     2
12      b    10     2
13      b    12     2

Upvotes: 2

IRTFM

Reputation: 263451

Try pulling the first trim value within each group:

x<-do.call("rbind", lapply(split(df, df$group), function(d) head(d,-d$trim[1]) ) )

Normally I test my answers, but I'm doing this from an iPhone on a bouncing train.
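
For reference, here is a minimal sketch that runs the same split/apply/rbind idea against the sample data from the question. The helper name trim_tail is hypothetical (not part of base R), and it swaps head() for explicit row indexing on the assumption that a trim of 0 should keep every row; head(d, -0) is the same as head(d, 0), which would return no rows at all.

```r
# Sample data from the question: 8 rows per group; trim is 1 for "a", 2 for "b"
df <- data.frame(group = c("a", "b"), value = 1:16, trim = rep(1:2))

# trim_tail is a hypothetical helper name, not part of base R.
# It keeps the first (nrow - trim[1]) rows of one group's data frame,
# so trim = 0 keeps everything and an oversized trim leaves zero rows.
trim_tail <- function(d) {
  keep <- max(nrow(d) - d$trim[1], 0)
  d[seq_len(keep), , drop = FALSE]
}

# Split by group, trim each piece, then stack the pieces back together
trimmed <- do.call("rbind", lapply(split(df, df$group), trim_tail))

# Rows remaining per group: a has 7, b has 6
table(trimmed$group)
```

With the question's data this should leave values 1 to 13 (odd) in group a and 2 to 12 (even) in group b, the same 13 rows the other answers show.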

Upvotes: 5

lmo

Reputation: 38510

Here is a method using data.table (borrowing from @42's method):

library(data.table)
setDT(df)
df[, head(.SD, -trim[1]), by=group]

Which outputs:

    group value trim
 1:     a     1    1
 2:     a     3    1
 3:     a     5    1
 4:     a     7    1
 5:     a     9    1
 6:     a    11    1
 7:     a    13    1
 8:     b     2    2
 9:     b     4    2
10:     b     6    2
11:     b     8    2
12:     b    10    2
13:     b    12    2

Upvotes: 2
