bsg

Reputation: 835

Dplyr add column to data frame based on specific value of grouped data

I have a data frame containing the number of page views per week for various users. It looks like this:

Userid week views
eerr   24   1
dd     24   2
dd     25   1
...

I want to plot average page views per week. However, I want to group users by the number of page views they had in the first week so that I can plot separate trajectories for users with different activity levels. I can get the first week for each user by doing

weekdf = df %>% group_by(Userid) %>% mutate(firstweek = min(week))

But I can't figure out how to group by the value of views in the row for that first week. I tried calling a user-defined function within summarise, which seemed to work, but it never terminated. I can see why: it has to look up the first week again for every group.

getoffset <- function(week, Userid, minweekdf) {
  minweek = minweekdf[minweekdf$Userid == Userid, 2]
  offsetweek = week - minweek
  return(offsetweek)
}

offsetdf = df %>% group_by(Userid, week) %>% summarise(offsetweek = getoffset(week, Userid, minweek)) 

How can I do this, preferably in dplyr?

Upvotes: 1

Views: 2462

Answers (1)

iugrina

Reputation: 605

Something like this:

df %>% group_by(Userid) %>% arrange(week) %>% mutate(fv = first(views) )

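Here fv holds each user's views from their earliest week, so you can then group by fv (for example, group_by(fv, week)) to get one trajectory per activity level.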
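To make the whole pipeline concrete, here is a minimal sketch on made-up data. The sample rows and the names with_fv, offsetweek, avg_views and trajectories are assumptions for illustration and are not from the question. It uses views[which.min(week)] in place of arrange() plus first(), which should give the same value without relying on row order, and it computes the week offset with a grouped mutate instead of a per-row lookup function.

```r
library(dplyr)

# Hypothetical sample data in the shape described in the question
df <- data.frame(
  Userid = c("eerr", "dd", "dd", "eerr"),
  week   = c(24, 24, 25, 25),
  views  = c(1, 2, 1, 3),
  stringsAsFactors = FALSE
)

# Per user: views in the earliest week (fv) and weeks elapsed since then
with_fv <- df %>%
  group_by(Userid) %>%
  mutate(
    fv         = views[which.min(week)],
    offsetweek = week - min(week)
  ) %>%
  ungroup()

# Average views per week, one trajectory per first-week activity level
trajectories <- with_fv %>%
  group_by(fv, week) %>%
  summarise(avg_views = mean(views)) %>%
  ungroup()

print(trajectories)
```

From there, something like ggplot(trajectories, aes(week, avg_views, colour = factor(fv))) + geom_line() should draw one line per group, assuming ggplot2 is installed.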

Upvotes: 2
