Reputation: 972
I'm trying to do some data summarization using R and dplyr. My data frame has many rows of the following form:
color year score
<fctr> <int> <int>
I have the same number of year rows for each of N different colors, and each row has a score. Within each color (group), I'd like to compute the ratio of every score to the score from one particular year. For example:
color year score
<fctr> <int> <int>
1 blue 1980 43
2 blue 1982 13
3 red 1980 330
4 red 1998 89
I'd like to augment this frame with a new column called "ratio": each row's score divided by the score of the row in the same color group (e.g., blue or red) whose year is a fixed value, 1980. For example:
color year score ratio
<fctr> <int> <int> <dbl>
1 blue 1980 43 1
2 blue 1982 13 0.302325581
3 red 1980 330 1
4 red 1998 89 0.269696969
I know how to use mutate and summarize, but it's not clear to me how to pick out the score value for the row that meets a certain condition within each group (in this case, the row with the year 1980, of which we are guaranteed exactly one per group).
What's a clean way to do this?
Upvotes: 1
Views: 2036
Reputation: 972
akrun's comment answered my question. With the data frame grouped by color via group_by(color),
mutate(ratio = score/score[year==1980])
is exactly what I needed here.
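For anyone landing here later, this is a minimal sketch of the full pipeline on the sample data from the question. The names df and result are just placeholders I picked, and it assumes every color has exactly one 1980 row (as stated above), so that score[year == 1980] yields a single value per group.

```r
library(dplyr)

# Sample data from the question (df is a placeholder name)
df <- data.frame(
  color = factor(c("blue", "blue", "red", "red")),
  year  = c(1980L, 1982L, 1980L, 1998L),
  score = c(43L, 13L, 330L, 89L)
)

result <- df %>%
  group_by(color) %>%                              # evaluate mutate() within each color
  mutate(ratio = score / score[year == 1980]) %>%  # divide by that group's 1980 score
  ungroup()                                        # drop grouping for later steps

print(result)
```

The group_by(color) step is what makes score[year == 1980] refer to the 1980 row of the current color rather than to every 1980 row in the whole frame.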
Upvotes: 4