Reputation: 13
For those familiar, I'm working on the Coursera bike share case study . . .
I'm working in R and have a data frame that is a log of all rides taken with a bicycle ride share company. A simplified version of the data frame is below.
started_at | ended_at | weekday | member_casual |
---|---|---|---|
2022-11-10 06:21:55 | 2022-11-10 06:31:27 | Thursday | member |
2022-11-04 07:31:55 | 2022-11-04 07:46:25 | Friday | member |
2022-11-21 17:20:29 | 2022-11-21 17:34:36 | Monday | casual |
2022-11-25 17:29:34 | 2022-11-25 17:45:15 | Friday | member |
I'd like to create a data frame (or better yet a visual - maybe a bar chart?) that shows share of rides by day. So for example:
weekday | member | casual |
---|---|---|
Monday | 12% | 6% |
Tuesday | 15% | 9% |
Wednesday | 15% | 10% |
Thursday | 13% | 14% |
Friday | 14% | 18% |
Saturday | 14% | 20% |
Sunday | 17% | 23% |
What would be the easiest way to accomplish this?
Thanks!
I tried creating a data frame but wasn't sure how to get the percent share of the whole by member_casual:
weekday_by_member <- p12m %>%
filter(member_casual=="member") %>%
count(weekday)
sorted_weekday_by_member <- arrange(weekday_by_member,levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday"))
sorted_weekday_by_member
weekday_by_casual <- p12m %>%
filter(member_casual=="casual") %>%
count(weekday)
sorted_weekday_by_casual <- arrange(weekday_by_casual,levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday"))
sorted_weekday_by_casual
weekday_mem_cas <- p12m %>%
group_by (member_casual) %>%
count(weekday)
weekday_mem_cas
Upvotes: 0
Views: 82
Reputation: 26218
Try this. Note that I have only run it on one dataset, in which the columns are named starttime and usertype rather than started_at and member_casual.
# Version 1: bar height is the trip count; labels show each day's share within its user type
df %>%
  group_by(weekday = lubridate::wday(starttime, label = TRUE), usertype) %>%
  summarise(trips = n(), .groups = 'drop') %>%
  mutate(percent = scales::percent(trips/sum(trips)), .by = usertype) %>%
  ggplot(aes(x = weekday, y = trips, fill = usertype)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = percent), position = position_dodge(width = 0.9), vjust = -0.9)
# Version 2: bar height is the share itself, with a percent-formatted y axis
df %>%
  group_by(weekday = lubridate::wday(starttime, label = TRUE), usertype) %>%
  summarise(trips = n(), .groups = 'drop') %>%
  mutate(percent = trips/sum(trips), .by = usertype) %>%
  ggplot(aes(x = weekday, y = percent, fill = usertype)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = scales::percent(percent)), position = position_dodge(width = 0.9), vjust = -0.9) +
  scale_y_continuous(labels = scales::percent)
Created on 2024-02-05 with reprex v2.0.2
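If a table like the one in the question is wanted instead of a chart, the same within-group share can be reshaped into one column per rider type. The following is only a minimal sketch: the rides data frame is toy data invented for illustration, the column names follow the question rather than the dataset used above, and it assumes dplyr and tidyr are installed.

```r
library(dplyr)
library(tidyr)

# Toy data, invented for illustration; column names follow the question.
rides <- data.frame(
  weekday = c("Monday", "Monday", "Friday", "Friday",
              "Friday", "Monday", "Friday", "Friday"),
  member_casual = c("member", "member", "member", "member",
                    "casual", "casual", "casual", "casual")
)

share_by_day <- rides %>%
  count(member_casual, weekday) %>%      # rides per rider type and day
  group_by(member_casual) %>%
  mutate(share = n / sum(n)) %>%         # share within each rider type
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = member_casual, values_from = share)

share_by_day
```

Each of the member and casual columns sums to 1, matching the layout of the example table in the question. Wrapping share in scales::percent() would turn the values into percent strings for display.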
Upvotes: 0
Reputation: 729
library(tidyverse)
library(scales)
df <- read_csv("202004-divvy-tripdata.csv")
df %>%
  mutate(across(started_at, ~ lubridate::wday(.x,
                                              label = TRUE,
                                              week_start = 1,
                                              abbr = FALSE))) %>%
  ggplot() +
  aes(x = started_at, fill = member_casual) +
  geom_bar(position = position_dodge()) +
  geom_text(aes(label = percent(after_stat(proportions(count)))),
            stat = "count",
            position = position_dodge(width = 0.9),
            vjust = -1)
Upvotes: 0
Reputation: 72883
You can do this quite concisely using xtabs, barplot, and proportions for the percent labels.
> xtb <- xtabs(~ member_casual + weekday, dat)
> prp <- paste0(proportions(xtb)*100, '%')
>
> b <- xtb |>
+ barplot(beside=TRUE, col=c(4, 2), ylim=c(0, max(xtb) + 2), leg=rownames(xtb))
> text(b, xtb + 1, labels=prp, cex=.8)
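Note that proportions(xtb) divides every cell by the grand total. The example table in the question has each of the member and casual columns summing to 100%, so a per-type share may be what is wanted; a minimal sketch of that variant, using toy data invented for illustration and an explicit weekday order, could look like this:

```r
# Toy data, invented for illustration; column names follow the question.
dat <- data.frame(
  weekday = c("Monday", "Sunday", "Sunday", "Monday",
              "Sunday", "Sunday", "Sunday", "Monday"),
  member_casual = c("member", "member", "member", "member",
                    "casual", "casual", "casual", "casual")
)

# Fix the weekday order explicitly; character values would sort alphabetically.
day_levels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday")
dat$weekday <- factor(dat$weekday, levels = day_levels)

xtb <- xtabs(~ member_casual + weekday, dat)

# margin = 1 divides each row (rider type) by its own total.
prp <- proportions(xtb, margin = 1)
round(100 * prp, 1)
```

The rounded matrix can then be pasted with '%' and passed to text() in the same way as prp above.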
Data:
> set.seed(42)
> n <- 100
> s <- sample(seq.POSIXt(as.POSIXct('2022-01-01'), as.POSIXct('2022-12-31'), 'secs'), n)
> dat <- data.frame(
+ started_at=s,
+ ended_at=s + sample.int(600, length(s))
+ ) |>
+ transform(weekday=strftime(started_at, '%A'),
+ member_casual=c('member', 'casual'))
Upvotes: 0