Reputation: 2081
I want to normalize using min/max the values of two indicators. Is it possible to do it keeping the tibble in long format? (Below I use left join to do it in wide format).
library(tidyverse)
df <- tibble(ind =c(1, 2),
`2015` = c(3,10),
`2016` = c(7,18),
`2017` = c(1,4))
# long format
df2 <- df %>%
gather("year", "value", 2:4)
df3 <- df2 %>%
group_by(ind) %>%
summarise(mn = min(value),
mx = max(value))
# wide format?
df4 <- left_join(df2, df3, by = c("ind"="ind"))
df5 <- df4 %>%
mutate(value2 = (value-mn)/(mx-mn))
Created on 2019-10-07 by the reprex package (v0.3.0)
Upvotes: 1
Views: 169
Reputation: 887128
Instead of doing the left_join
, can create the columns with mutate
and avoid the summarise
step
library(dplyr)
df2 %>%
group_by(ind) %>%
mutate(mn = min(value), mx = max(value)) %>%
ungroup %>%
mutate(value2 = (value - mn)/(mx-mn))
NOTE: Here, we assumed the OP wanted the columns 'mx', 'mn' in the final output. But, if the intention is to get only 'value2', there is no need for creating additional columns as @Gregor mentioned in the comments
df2 %>%
group_by(ind) %>%
mutate(value2 = (value - min(value))/(max(value) - min(value)))
Also, with the tidyr_1.0.0
, instead of gather
, can use pivot_longer
which is more generalized as it can deal with multiple sets of columns to reshape from 'wide' to 'long'
library(tidyr)
df %>%
pivot_longer(cols = -ind) %>%
group_by(ind) %>%
mutate(mn = min(value), mx = max(value)) %>%
ungroup %>%
mutate(value2 = (value - mn)/(mx-mn))
Upvotes: 2