UIyer

Reputation: 21

Can I use dplyr::mutate to operate on different rows?

I have a dataset; a small subset of the relevant columns is shown below:

year ID type result  
2003 1   new        closed  
2003 2   new        transferred  
2003 3   subsequent closed  
2003 4   subsequent diverted  
....  
2015 1000 new       closed

What I want to calculate is the fraction of subsequents, i.e. (no. of subsequents) / (no. of subsequents + no. of news), grouped by year and result, like so:

year result subsequent_frac  
2003 closed 0.10  
2003 transferred 0.05  
2003 ....  
....  
2015 closed 0.05  
2015 transferred 0.1  

I know I can do it in steps, with a group_by and summarise to get the counts and then handle each result separately. I was wondering if there is a neater/faster way to do this.

Upvotes: 2

Views: 142

Answers (1)

Valter Beaković

Reputation: 3250

Is this what you are looking for? Applying summarise removes one level of grouping, hence the second group_by (commented out, since it may not be needed).

library(dplyr)

dfSummarized <- group_by(df, year, type) %>%
            summarise(n = n()) %>%  # row count per year and type
            #group_by(type) %>% # maybe you don't need this?
            mutate(freq = n / sum(n))  # share of each type within its year
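
If the fraction is needed per year and result, as in the desired output in the question, one possible sketch is to take the mean of a logical comparison inside summarise. This assumes type only ever takes the values "new" and "subsequent" and that dplyr is at least version 1.0.0 (for the .groups argument). The df below is made-up toy data with the same columns as the question, not the real dataset.

```r
library(dplyr)

# Hypothetical toy data with the same columns as in the question
df <- data.frame(
  year   = c(2003, 2003, 2003, 2003, 2003),
  ID     = 1:5,
  type   = c("new", "new", "subsequent", "subsequent", "new"),
  result = c("closed", "transferred", "closed", "diverted", "closed")
)

# mean() of a logical vector is the proportion of TRUE values, so this is
# (no. of subsequents) / (no. of subsequents + no. of news) in each group,
# assuming type is only ever "new" or "subsequent".
subsequent_by_group <- df %>%
  group_by(year, result) %>%
  summarise(subsequent_frac = mean(type == "subsequent"), .groups = "drop")
```

This gives one row per year/result combination in a single pass, so there is no need to compute the counts for each result separately.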

Upvotes: 1
