Tim Palmer

Reputation: 11

I'm trying to get average rides per hour, weekday vs. weekend

https://www.kaggle.com/timothypalmer/divvy-capstone-project/edit

I know there is a better way, but I'm new at this. I have hourly output aggregated, but I need to

  1. separate data by weekday/weekend
  2. get average rides per day
    • divide weekday output by 5
    • divide weekend output by 2

I've looked at ways to do this separately, but I think there should be a way I can do this side by side. I think I need to make 2 separate new variables, average_rides_weekend, average_rides_weekday and then graph them somehow. Newbie issues.

rides_by_hour <- data_year3%>% 
  group_by(start_hour, member_casual, is_weekend) %>% 
  summarise(number_of_rides = n(), .groups = 'drop') 
  
ggplot(data = rides_by_hour) + 
  geom_bar(mapping = aes(x = start_hour, y = number_of_rides, fill = member_casual), stat='identity', position = "dodge") + 
  facet_wrap(~is_weekend, labeller = labeller(is_weekend = days.labs)) + 
  labs(title="Number of rides for casual and member riders during a day",
       x="Hours during a day",
       y="Number of rides", subtitle="Aug2022 to July2023") + 
  theme(legend.position="top") + 
  scale_fill_discrete(name = "Rider: ")

Upvotes: 1

Views: 40

Answers (1)

Qasim Alhammad

Reputation: 61

Welcome to the community. As r2evans mentioned, a reproducible example will help us see your data, try your code, and then provide you with working code.

Also, your Kaggle notebook is private, so we can't see it. If you go to settings and change it to public, we will be able to have a look at it.

However, from what I understand, the code below should help and give you the output you need. I didn't check the rest of the ggplot code, so I kept it as it is.

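rides_by_hour <- data_year3 %>% 
  # member_casual stays in the grouping because the plot below uses it for fill
  group_by(start_hour, member_casual, is_weekend) %>% 
  # rides_column is a placeholder: replace it with the name of your rides column
  summarise(avg_rides = mean(rides_column), .groups = 'drop') 

## this will create an avg rides column for weekends and weekdays for each start_hour and rider type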
  
ggplot(data = rides_by_hour) + 
  geom_bar(mapping = aes(x = start_hour, y = avg_rides, fill = member_casual), stat='identity', position = "dodge") + 
  facet_wrap(~is_weekend, labeller = labeller(is_weekend = days.labs)) + 
  labs(title="Number of rides for casual and member riders during a day",
       x="Hours during a day",
       y="Number of rides", subtitle="Aug2022 to July2023") + 
  theme(legend.position="top") + 
  scale_fill_discrete(name = "Rider: ")
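
If your data has one row per ride (which the n() in your question suggests), there is no rides column to take a mean of. In that case, one possible approach is to count rides per hour and divide by the number of distinct calendar dates of each day type, rather than by a fixed 5 and 2. The sketch below is only an illustration on made-up data: the started_at and ride_date columns, the logical coding of is_weekend, and the names days_per_type and avg_rides_by_hour are my assumptions, not something taken from your notebook.

```r
library(dplyr)

# Toy stand-in for data_year3: one row per ride over two full weeks.
# `started_at` is an ASSUMED column name; adjust it to your data.
set.seed(1)
data_year3 <- tibble(
  started_at = as.POSIXct("2023-07-03 00:00:00", tz = "UTC") +
    sample(0:(14 * 24 * 3600 - 1), 2000, replace = TRUE),
  member_casual = sample(c("member", "casual"), 2000, replace = TRUE)
) %>%
  mutate(
    ride_date  = as.Date(started_at),                        # calendar date of the ride
    start_hour = as.integer(format(started_at, "%H")),       # hour of day, 0 to 23
    is_weekend = format(started_at, "%u") %in% c("6", "7")   # %u: Monday = 1, Sunday = 7
  )

# How many distinct dates of each day type are in the data.
days_per_type <- data_year3 %>%
  distinct(ride_date, is_weekend) %>%
  count(is_weekend, name = "n_days")

# Total rides per hour, rider type and day type, divided by the
# number of days of that type, gives average rides per day.
avg_rides_by_hour <- data_year3 %>%
  count(start_hour, member_casual, is_weekend, name = "number_of_rides") %>%
  left_join(days_per_type, by = "is_weekend") %>%
  mutate(avg_rides = number_of_rides / n_days)
```

Dividing the totals by n_days, instead of taking the mean of per-date counts, also accounts for dates that had no rides in a given hour. The result keeps member_casual, so it should be usable in the ggplot call above with y = avg_rides, provided is_weekend is coded the way your days.labs labeller expects.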

Upvotes: 0
