Reputation: 197
I have a raw data as below
raw_data <- data.frame(
name=c("Ronak","Bob","Moh"),
l_name=c("Shah","Marly","Salah"),
R_programming=c(1,5,2),
football=c(2,4,6),
snooker=c(6,3,2),
Python=c(3,2,6),
location=c("Maly","US","Maly")
)
I want to group by location and then summarize to takE mean of two columns in new column say programming and sport i.e
programing = mean(c(R_programming, python), na.rm=TRUE)
sport= mean(c(football,snooker),na.rm=TRUE)
then I calculate a score and scale based on above averages and reshape data
My expected output is
output <- data.frame(
location=c("US","Maly"),
indicator=rep(c("programming","sport"),each=2),
average=c(3,3.5,3.5,4),
score=c(0.6,0.7,0.7,0.8),
scale=c("Moderate high","high","high","high")
)
Thanks from your help in advance
Upvotes: 0
Views: 198
Reputation: 388817
group_by
location
and take average of programming and sports. Get the data in long format and create new columns for score
and scale
.
library(dplyr)
raw_data %>%
group_by(location) %>%
summarise(programming_average = mean(c(R_programming, Python), na.rm = TRUE),
sport_average = mean(c(football,snooker),na.rm=TRUE)) %>%
tidyr::pivot_longer(cols = -location,
names_to = c('indicator', '.value'),
names_sep = '_') %>%
mutate(score = average/5,
scale = case_when(score <= 0.6 ~ 'moderate high',
score > 0.6 ~ 'high'))
# location indicator average score scale
# <chr> <chr> <dbl> <dbl> <chr>
#1 Maly programming 3 0.6 moderate high
#2 Maly sport 4 0.8 high
#3 US programming 3.5 0.7 high
#4 US sport 3.5 0.7 high
Upvotes: 1