Rungchat Amnuay

Reputation: 11

How to limit the number of bars on x axis by count of values

I'm new to R programming for data analysis.

I'm trying to build a project with a dataset named "all_trips_v2", taken from public datasets.

Preview of my dataset

I want a bar chart, made with ggplot2 and geom_bar(), that shows only the 10 values of "start_station_name" with the highest total counts, with each bar split by member type (member_casual).

I ran this code:

ggplot(all_trips_v2, aes(start_station_name,
                         fill = member_casual)) + 
  geom_bar()

The result from the code

As you can see, the result has a lot of bars, one for every "start_station_name". I only need the 10 start stations with the highest counts. Please give me some advice. Thank you so much.

I expected to create a bar chart like this:

Expected bar chart.

Upvotes: 0

Views: 131

Answers (1)

chemdork123

Reputation: 13863

I don't know of a good way to do this directly in one step, but it is easier to follow in two steps anyway. Step 1: summarize your dataset by count and sort it. Step 2: filter the dataset to the categories in the first X rows of that summary.

Here's an example with the built-in chickwts dataset:

library(ggplot2)
df <- chickwts
ggplot(df, aes(feed)) + geom_bar() +
    theme_classic()


To only draw the top 3 bars, you could do the two-step process:

library(dplyr)   # count(), arrange(), filter() and %>% all come from dplyr

# STEP 1: summarize by feed count & arrange
df_counts <- df %>%
  count(feed) %>%      # creates column n with the number of rows per feed
  arrange(desc(n))     # sort descending by n

# STEP 2: plot only the rows whose feed is among the top 3
ggplot(df %>% dplyr::filter(feed %in% df_counts$feed[1:3]),
  aes(feed)) +
  geom_bar() + theme_classic()

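The two steps above can also be written as one short pipeline. This is only a sketch of one possible variant, not the only way to do it: it assumes dplyr 1.0.0 or later (for slice_head()), and the names top_feeds and df_top are made up for illustration. It also reorders the factor levels so the bars are drawn from most to least frequent, which the filter alone does not do.

```r
library(dplyr)
library(ggplot2)

# STEP 1: count each feed, most frequent first, and keep the first three rows.
# Ties at the cutoff are broken by row order here.
top_feeds <- chickwts %>%
  count(feed, sort = TRUE) %>%
  slice_head(n = 3)

# STEP 2: keep only rows whose feed is in top_feeds, then set the factor
# levels in order of frequency so the bars appear from highest to lowest.
df_top <- chickwts %>%
  semi_join(top_feeds, by = "feed") %>%
  mutate(feed = factor(feed, levels = as.character(top_feeds$feed)))

ggplot(df_top, aes(feed)) +
  geom_bar() +
  theme_classic()
```

In chickwts several feeds share the same count, so which of them makes the cut depends on row order. If every tied category should be kept, slice_max(n, n = 3) with its default with_ties = TRUE may be a better fit than slice_head().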

For OP's case, maybe the following would work?

# STEP 1
all_summary <- all_trips_v2 %>%
  count(start_station_name) %>% arrange(-n)

# STEP 2
ggplot(
  all_trips_v2 %>%
    dplyr::filter(start_station_name %in% all_summary$start_station_name[1:10]),
  aes(start_station_name, fill = member_casual)) + 
  geom_bar()
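Since I can't run this on the real data, here is a small self-contained sketch of the same idea on made-up data. Everything in toy_trips is hypothetical: the station names, the counts and the random rider types are stand-ins for all_trips_v2, and only the column names match the question. It assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for all_trips_v2: 12 made-up stations with
# 120, 110, ..., 10 trips each and a random rider type per trip.
set.seed(1)
stations <- paste("Station", LETTERS[1:12])
toy_trips <- data.frame(
  start_station_name = rep(stations, times = (12:1) * 10),
  stringsAsFactors = FALSE
)
toy_trips$member_casual <- sample(c("member", "casual"),
                                  nrow(toy_trips), replace = TRUE)

# STEP 1: the 10 stations with the most trips
top10 <- toy_trips %>%
  count(start_station_name, sort = TRUE) %>%
  slice_head(n = 10)

# STEP 2: keep only trips from those stations; reversing the levels puts
# the busiest station at the top once the axes are flipped
toy_top <- toy_trips %>%
  filter(start_station_name %in% top10$start_station_name) %>%
  mutate(start_station_name = factor(start_station_name,
                                     levels = rev(top10$start_station_name)))

# Stacked counts per station, split by rider type. Use
# geom_bar(position = "fill") instead to show shares rather than counts.
ggplot(toy_top, aes(start_station_name, fill = member_casual)) +
  geom_bar() +
  coord_flip()
```

coord_flip() is optional; it just keeps long station names readable by putting them on the vertical axis.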

Upvotes: 0
