Mark Bower

Reputation: 589

Combining data frame rows in R based on common values

Given a data frame:

    > df <- data.frame( L=c('a','b','b'), t0=c(1,10,20), t1=c(9,19,39))
    > df
      L t0 t1
    1 a  1  9
    2 b 10 19
    3 b 20 39

I want:

    > df
      L t0 t1
    1 a  1  9
    2 b 10 39

Rows with identical values of df$L (here "b") should be merged: the 't0' of the first 'b' row becomes the new 't0', and the 't1' of the last (contiguous) 'b' row becomes the new 't1'. In effect, if t0 and t1 are times, I want to merge the time intervals of adjacent rows that have the same value of 'L'.

Upvotes: 3

Views: 61

Answers (4)

akrun

Reputation: 887108

After grouping by 'L', use summarise to take the first value of 't0' and the last value of 't1' (or the min and max):

library(dplyr)
df %>%
    group_by(L) %>%
    summarise(t0 = first(t0), t1 = last(t1))
# A tibble: 2 x 3
#  L        t0    t1
#  <fct> <dbl> <dbl>
#1 a         1     9
#2 b        10    39

Based on the OP's comments, if we also need to group by runs of adjacent identical elements in 'L', use rleid from data.table:

library(data.table)  # for rleid()
library(dplyr)
df1 %>%
    group_by(grp = rleid(L), L) %>%
    summarise(t0 = first(t0), t1 = last(t1))

data

df1 <- data.frame( L=c('a','b','b','a','b','b'), 
        t0=c(1,10,20,40,60,70), t1=c(9,19,39,49,69,79))
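
To make the effect of rleid concrete, here is a minimal sketch on the same `df1`. The name `res` and the `.groups = "drop"` argument (which assumes dplyr 1.0.0 or later) are my additions for illustration and are not part of the original answer.

```r
library(data.table)  # for rleid()
library(dplyr)

df1 <- data.frame(L  = c('a','b','b','a','b','b'),
                  t0 = c(1,10,20,40,60,70),
                  t1 = c(9,19,39,49,69,79))

# rleid() gives every run of identical adjacent values its own id,
# so the second run of 'b' is kept separate from the first one.
rleid(df1$L)
# [1] 1 2 2 3 4 4

# 'res' is an illustrative name; .groups = "drop" assumes dplyr >= 1.0.0
res <- df1 %>%
    group_by(grp = rleid(L), L) %>%
    summarise(t0 = first(t0), t1 = last(t1), .groups = "drop")
res
```

With this data the result should have four rows, one per run, rather than the two rows that grouping by 'L' alone would give.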

Upvotes: 4

s_baldur

Reputation: 33488

Using data.table:

library(data.table)
setDT(df)
df[, .(t0 = t0[1], t1 = t1[.N]), by = L]

#    L t0 t1
# 1: a  1  9
# 2: b 10 39
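
Note that `by = L` merges all rows sharing a value of L, adjacent or not. If only contiguous runs should be merged, as the question's wording suggests, one possible variation is to group by `rleid(L)` as well. This is a sketch under that assumption; `dt1` and `res` are illustrative names and the data is a longer example with repeated runs.

```r
library(data.table)

# Illustrative data: 'a' and 'b' each occur in two separate runs
dt1 <- data.table(L  = c('a','b','b','a','b','b'),
                  t0 = c(1,10,20,40,60,70),
                  t1 = c(9,19,39,49,69,79))

# Group by run id (and L), then keep the first t0 and the last t1 of each run
res <- dt1[, .(t0 = t0[1], t1 = t1[.N]), by = .(grp = rleid(L), L)]
res
```

For the three-row `df` in the question both versions should give the same two rows, since each value of L forms a single run there.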

Upvotes: 0

ThomasIsCoding

Reputation: 101335

Maybe you can try aggregate and merge

res <- merge(aggregate(t0 ~ L, df, min), aggregate(t1 ~ L, df, max))

such that

> res
  L t0 t1
1 a  1  9
2 b 10 39

Upvotes: 0

GKi

Reputation: 39657

You can split by L and return the range (the minimum and maximum over t0 and t1) of each group.

df <- do.call(rbind, lapply(split(df[-1], df[1]), range))
df
#  [,1] [,2]
#a    1    9
#b   10   39

df <- data.frame(L=rownames(df), t0=df[,1], t1=df[,2])
df
#  L t0 t1
#a a  1  9
#b b 10 39
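
Splitting by L pools every row with the same label, whether adjacent or not. If only adjacent rows should be merged, a base R sketch using `rle` could look like the following. The names `df1`, `runs`, `starts`, `ends`, and `res` are illustrative, and it assumes the rows are already in time order so that the first t0 and the last t1 of a run bound it.

```r
# Illustrative data: 'a' and 'b' each occur in two separate runs
df1 <- data.frame(L  = c('a','b','b','a','b','b'),
                  t0 = c(1,10,20,40,60,70),
                  t1 = c(9,19,39,49,69,79))

# Run-length encode L: one entry per run of identical adjacent values
runs <- rle(as.character(df1$L))

# Row index of the last and first row of each run
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

# One output row per run: t0 from its first row, t1 from its last row
res <- data.frame(L = runs$values, t0 = df1$t0[starts], t1 = df1$t1[ends])
res
```

This avoids any package dependency and keeps the runs in their original order.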

Upvotes: 3
