Subtract values within groups in R

Question

I have a dataset containing variables that give information about the voteshare of a party in a given year and district and whether or not the respective party sent a candidate to parliament, like this:

year district party voteshare candidate
2000 A        P1    50%       1
2000 A        P2    30%       0
2000 A        P3    20%       0
2000 B        P1    43%       1
2000 B        P2    21%       0
2000 B        P3    34%       0
...

Now I want to calculate each party's margin of victory or loss (i.e. how "close" the election was for that party). For a losing party, this is its voteshare minus the voteshare of the winning party (the party that sent a candidate to parliament); for the winning party, it is its voteshare minus the voteshare of the runner-up, such that:

year district party voteshare candidate margin
2000 A        P1    50%       1         +20%
2000 A        P2    30%       0         -20%
2000 A        P3    20%       0         -30%
2000 B        P1    43%       1         +9%
2000 B        P2    21%       0         -22%
2000 B        P3    34%       0         -9%
...

I don't know how to do that with dplyr...

Ronak Shah · Accepted Answer

You can do:

library(dplyr)

df1 %>%
  # Convert voteshare from a string such as "50%" to a number
  mutate(voteshare = readr::parse_number(voteshare)) %>%
  group_by(year, district) %>%
  mutate(margin = case_when(
    # Winner: own voteshare minus the second-highest voteshare in the district
    # (this assumes the winner also has the highest voteshare)
    candidate == 1 ~ voteshare - sort(voteshare, decreasing = TRUE)[2],
    # Everyone else: own voteshare minus the winner's voteshare
    TRUE ~ voteshare - voteshare[candidate == 1]))

#   year district party voteshare candidate margin
#  <int> <chr>    <chr>     <dbl>     <int>  <dbl>
#1  2000 A        P1           50         1     20
#2  2000 A        P2           30         0    -20
#3  2000 A        P3           20         0    -30
#4  2000 B        P1           43         1      9
#5  2000 B        P2           21         0    -22
#6  2000 B        P3           34         0     -9

data

df1 <- structure(list(year = c(2000L, 2000L, 2000L, 2000L, 2000L, 2000L
), district = c("A", "A", "A", "B", "B", "B"), party = c("P1", 
"P2", "P3", "P1", "P2", "P3"), voteshare = c("50%", "30%", "20%", 
"43%", "21%", "34%"), candidate = c(1L, 0L, 0L, 1L, 0L, 0L)), 
class = "data.frame", row.names = c(NA, -6L))
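
As a variation on the pipeline above, here is a minimal sketch that does not rely on the winner having the highest voteshare: it compares the winner against the best-placed party among those with candidate == 0. It assumes exactly one row with candidate == 1 and at least one with candidate == 0 per year and district; the helper columns winner_share and best_loser and the result name res are made up for illustration, and the "%" sign is stripped with base R instead of readr.

```r
library(dplyr)

# Same sample data as df1 above, rebuilt so this block runs on its own
df1 <- data.frame(
  year = 2000L,
  district = rep(c("A", "B"), each = 3),
  party = rep(c("P1", "P2", "P3"), times = 2),
  voteshare = c("50%", "30%", "20%", "43%", "21%", "34%"),
  candidate = c(1L, 0L, 0L, 1L, 0L, 0L)
)

res <- df1 %>%
  # Strip the percent sign and convert to numeric
  mutate(voteshare = as.numeric(sub("%", "", voteshare, fixed = TRUE))) %>%
  group_by(year, district) %>%
  mutate(
    # Assumption: exactly one winner (candidate == 1) per group
    winner_share = voteshare[candidate == 1],
    # Highest voteshare among the parties that did not win
    best_loser = max(voteshare[candidate == 0]),
    # Winner: lead over the best loser; others: gap to the winner
    margin = if_else(candidate == 1,
                     winner_share - best_loser,
                     voteshare - winner_share)
  ) %>%
  ungroup() %>%
  select(-winner_share, -best_loser)

res
```

On the sample data both approaches give the same margins; they would differ only in a district where the party with candidate == 1 does not hold the highest voteshare.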
