Arthur Spoon

Reputation: 462

Vectorising function on subset of dataframe based on other columns

I have a dataframe from a psychology experiment that records, for each subject, the time since the beginning of the experiment. From that I want to compute the time since the beginning of each trial for each subject. To do so, I'm subtracting the minimum time value for each trial/subject combination from all the values for that same trial/subject.

I'm currently doing it with two for loops, and I was wondering whether there's a way to vectorise it. What I have at the moment:

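for (s in unique(df$Subject)) {
  for (t in unique(df$TrialId[df$Subject == s])) {
    # Rows belonging to this subject/trial combination
    rows <- df$Subject == s & df$TrialId == t
    # Subtract the trial's earliest timestamp so each trial starts at 0.
    # The subtraction must stay on the same line as the assignment: a line
    # starting with "- start_offset" is parsed as a separate expression.
    df$timestamp[rows] <- df$timestamp[rows] - min(df$timestamp[rows])
  }
}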

And what I would like is something like:

df$timestamp <- df$timestamp - min_per_trial_per_subject(df$timestamp)

Upvotes: 0

Views: 31

Answers (1)

Gregor Thomas

Reputation: 145765

With dplyr

library(dplyr)
df %>% group_by(Subject, TrialId) %>%
  mutate(modified_timestamp = timestamp - min(timestamp))

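This should work. Note that it adds a new modified_timestamp column rather than overwriting timestamp, and the result needs to be assigned back to df if you want to keep it. If it doesn't work, please share a reproducible example so we can test.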
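
If you'd rather stay in base R, a minimal sketch of the same idea uses ave(), which returns a group-wise summary aligned with the original rows. The column names below follow the question; the toy values are made up purely for illustration.

```r
# Toy data: column names follow the question, the values are invented.
df <- data.frame(
  Subject   = c(1, 1, 1, 1, 2, 2, 2, 2),
  TrialId   = c(1, 1, 2, 2, 1, 1, 2, 2),
  timestamp = c(10, 12, 20, 25, 3, 4, 9, 15)
)

# ave() applies FUN within each Subject/TrialId group and returns a vector
# as long as the input, so every row gets its own trial's minimum.
# FUN must be passed by name, because unnamed arguments are treated as
# grouping variables.
trial_start <- ave(df$timestamp, df$Subject, df$TrialId, FUN = min)

# Vectorised subtraction: each trial now starts at 0.
df$timestamp <- df$timestamp - trial_start
```

Unlike the dplyr version above, this overwrites timestamp in place, which matches the df$timestamp <- ... form asked for in the question.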

Upvotes: 3
