Danilo Imbimbo
Danilo Imbimbo

Reputation: 207

Difference between two higher numbers in a column in R

I have a data frame like these:

  NUM_TURNO CODIGO_MUNICIPIO SIGLA_PARTIDO     SHARE
1         1            81825           PPB 38.713318
2         1            81825          PMDB 61.286682
3         1            09717          PMDB 48.025900
4         1            09717            PL  1.279217
5         1            09717           PFL 50.694883
6         1            61921          PMDB 51.793868

This is a data.frame of elections in Brazil. Grouping by NUM_TURNO and CODGIDO_MUNICIPIO I want to compare the SHARE of the FIRST and SECOND most votted politics in each city and round (1 or 2) and create a new column.
What am I having problem to do? I don't know how to calculate the difference only for the two biggest SHARES of votes.

For the first case, for example, I want to create something that gives me the difference between 61.286682 and 38.713318 = 22.573364 and so on.

Something like this:

df %>%
    group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
    mutate(Diff = HIGHER SHARE - 2º HIGHER SHARE))

Upvotes: 1

Views: 31

Answers (2)

Jake Kaupp
Jake Kaupp

Reputation: 8072

You can also use top_n from dplyr with grouping and summarizing. Keep in mind that in the data you provided, you will get an error in summarize if you use diff with a single value, hence the use of ifelse.

df %>%
  group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
  top_n(2, SHARE) %>% 
  summarize(Diff = ifelse(n() == 1, NA, diff(SHARE)))

# A tibble: 3 x 3
# Groups:   NUM_TURNO [?]
  NUM_TURNO CODIGO_MUNICIPIO  Diff
      <dbl>            <dbl> <dbl>
1         1             9717  2.67
2         1            61921 NA   
3         1            81825 22.6 

Upvotes: 2

FloSchmo
FloSchmo

Reputation: 748

You could arrange your dataframe by Share and then slice the first two values. Then you could use summarise to get the diff between the values for every group:

library(dplyr)
df %>%
    group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
    arrange(desc(Share)) %>%
    slice(1:2) %>%
    summarise(Diff = -diff(Share))

Upvotes: 1

Related Questions