FredPurnell

Reputation: 13

R - Share of Records by Day of Week?

For those familiar, I'm working on the Coursera bike share case study . . .

I'm working in R and have a data frame that is a log of all rides taken with a bicycle ride share company. A simplified version of the data frame is below.

started_at           ended_at             weekday   member_casual
2022-11-10 06:21:55  2022-11-10 06:31:27  Thursday  member
2022-11-04 07:31:55  2022-11-04 07:46:25  Friday    member
2022-11-21 17:20:29  2022-11-21 17:34:36  Monday    casual
2022-11-25 17:29:34  2022-11-25 17:45:15  Friday    member

I'd like to create a data frame (or better yet a visual - maybe a bar chart?) that shows share of rides by day. So for example:

weekday    member  casual
Monday     12%     6%
Tuesday    15%     9%
Wednesday  15%     10%
Thursday   13%     14%
Friday     14%     18%
Saturday   14%     20%
Sunday     17%     23%


What would be the easiest way to accomplish this?

Thanks!

I tried creating a data frame but wasn't sure how to get each weekday's percent share of the total within each member_casual group.

weekday_by_member <- p12m %>%
    filter(member_casual=="member") %>%
    count(weekday) 
sorted_weekday_by_member <- arrange(weekday_by_member, factor(weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")))

sorted_weekday_by_member


weekday_by_casual <- p12m %>%
    filter(member_casual=="casual") %>%
    count(weekday) 
sorted_weekday_by_casual <- arrange(weekday_by_casual, factor(weekday, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")))

sorted_weekday_by_casual


weekday_mem_cas <- p12m %>%
    group_by (member_casual) %>%
    count(weekday) 

weekday_mem_cas


Upvotes: 0

Views: 82

Answers (3)

AnilGoyal

Reputation: 26218

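Try the following. I have only tested it on one dataset, in which the columns are named starttime and usertype; swap in started_at and member_casual for your data.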


library(dplyr)    # the .by argument of mutate() needs dplyr >= 1.1.0
library(ggplot2)

# Bars show trip counts; labels show each weekday's share within its user type
df %>% 
  group_by(weekday = lubridate::wday(starttime, label = TRUE), usertype) %>% 
  summarise(trips = n(), .groups = 'drop') %>% 
  mutate(percent = scales::percent(trips/sum(trips)), .by = usertype) %>% 
  ggplot(aes(x = weekday, y = trips, fill = usertype)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = percent), position = position_dodge(width = 0.9), vjust = -0.9)



# Same summary, but the bars themselves show the share on a percent axis
df %>% 
  group_by(weekday = lubridate::wday(starttime, label = TRUE), usertype) %>% 
  summarise(trips = n(), .groups = 'drop') %>% 
  mutate(percent = trips/sum(trips), .by = usertype) %>% 
  ggplot(aes(x = weekday, y = percent, fill = usertype)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = scales::percent(percent)), position = position_dodge(width = 0.9), vjust = -0.9) +
  scale_y_continuous(labels = scales::percent)

Created on 2024-02-05 with reprex v2.0.2
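
If the goal is the table from the question (one row per weekday, one column per rider type) rather than a plot, the same per-group share can be reshaped with tidyr. This is only a sketch under assumptions: `rides` is a made-up stand-in for the asker's `p12m`, and `day_levels` and `share_by_day` are names chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for the asker's `p12m` (made-up rows).
rides <- data.frame(
  weekday = c("Monday", "Monday", "Friday", "Friday", "Friday", "Sunday"),
  member_casual = c("member", "casual", "member", "member", "casual", "casual")
)

# Weekday order to use for sorting (Sunday first, as in the question).
day_levels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday")

share_by_day <- rides %>%
  mutate(weekday = factor(weekday, levels = day_levels)) %>%
  count(member_casual, weekday) %>%       # rides per rider type and weekday
  group_by(member_casual) %>%
  mutate(share = n / sum(n)) %>%          # share within each rider type
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = member_casual, values_from = share,
              values_fill = 0) %>%        # weekdays with no rides get 0
  arrange(weekday)

share_by_day
```

Each of the `member` and `casual` columns sums to 1, so the values can be passed to `scales::percent()` for display.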

Upvotes: 0

Hoel

Reputation: 729

library(tidyverse)
library(scales)

df <- read_csv("202004-divvy-tripdata.csv") 

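# Convert started_at to a full weekday name (week starting on Monday), then plot counts.
# Note: proportions(count) is each bar's share of all rides, not of its rider type.
df %>% 
  mutate(across(started_at, ~ lubridate::wday(.x, 
                                              label = TRUE, 
                                              week_start = 1, 
                                              abbr = FALSE))) %>% 
  ggplot() + 
  aes(x = started_at, fill = member_casual) + 
  geom_bar(position = position_dodge()) +
  geom_text(aes(label = percent(after_stat(proportions(count)))),
            stat = "count", 
            position = position_dodge(width = 0.9), 
            vjust = -1)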


Upvotes: 0

jay.sf

Reputation: 72883

You can do this quite concisely using xtabs, barplot, and proportions for the percent labels.

> xtb <- xtabs(~ member_casual + weekday, dat)
> prp <- paste0(proportions(xtb)*100, '%')
> 
> b <- xtb |> 
+   barplot(beside=TRUE, col=c(4, 2), ylim=c(0, max(xtb) + 2), leg=rownames(xtb))
> text(b, xtb + 1, labels=prp, cex=.8)



Data:

> set.seed(42)
> n <- 100
> s <- sample(seq.POSIXt(as.POSIXct('2022-01-01'), as.POSIXct('2022-12-31'), 'secs'), n)
> dat <- data.frame(
+   started_at=s,
+   ended_at=s + sample.int(600, length(s))
+ ) |> 
+   transform(weekday=strftime(started_at, '%A'),
+             member_casual=c('member', 'casual'))
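
Note that `proportions(xtb)` with no margin gives each cell's share of all rides. If you want each weekday's share within its rider type, as in the question's example table, one option is `margin = 1`. A minimal sketch, assuming R >= 4.0.0 (for `proportions`) and using made-up toy data; `shr` and `lab` are names chosen here for illustration:

```r
# Hypothetical toy data; column names follow the question.
dat <- data.frame(
  weekday = c("Monday", "Monday", "Friday", "Friday", "Friday", "Sunday"),
  member_casual = c("member", "casual", "member", "member", "casual", "casual")
)

# Rows are rider types, columns are weekdays (alphabetical here,
# because weekday is a plain character column, not an ordered factor).
xtb <- xtabs(~ member_casual + weekday, dat)

# margin = 1 divides each row by its own total, so every row of shr sums to 1.
shr <- proportions(xtb, margin = 1)
lab <- paste0(round(shr * 100), "%")

# Plot the shares themselves and put the percent labels just above the bars.
b <- barplot(shr, beside = TRUE, col = c(4, 2),
             ylim = c(0, max(shr) * 1.2), legend.text = rownames(shr))
text(b, shr + 0.03, labels = lab, cex = 0.8)
```

To get the weekdays in calendar order, convert `weekday` to a factor with the desired levels before calling `xtabs`.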

Upvotes: 0
