Reputation: 307
I have a data frame that I have split using the splitstackshape package. After splitting, I am unable to proceed with grouping the resulting columns and plotting a bar chart. The code is as follows:
library(tidyverse)
library(splitstackshape)
df <- data.frame(countries=(c("England","Australia,Pakistan", "India,England","Denmark", "",
"Australia, Pakistan, New Zealand, England", "United States, England,Pakistan")))
data_split <- splitstackshape::cSplit(df, "countries", ",")
data_split
The output is as follows:
     countries_1 countries_2 countries_3 countries_4
1:       England        <NA>        <NA>        <NA>
2:     Australia    Pakistan        <NA>        <NA>
3:         India     England        <NA>        <NA>
4:       Denmark        <NA>        <NA>        <NA>
5:          <NA>        <NA>        <NA>        <NA>
6:     Australia    Pakistan New Zealand     England
7: United States     England    Pakistan        <NA>
With the above output, I want to plot a bar chart showing the frequency of each country in descending order. The sample output is as follows:
[image: bar chart showing frequency of countries in descending order]
Upvotes: 1
Views: 177
Reputation: 3876
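You don't need the wide split for this. separate_rows() gives one row per country, which count() can then tally. Like this:
library(tidyverse)  # attaches tidyr, dplyr, forcats and ggplot2
df %>%
  # split on commas plus any spaces after them, so " England" and "England" are counted together
  separate_rows(countries, sep = ",\\s*") %>%
  filter(countries != "") %>%  # drop the empty entry
  count(countries) %>%
  ggplot(aes(y = fct_reorder(countries, n), x = n)) +
  geom_col()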
Edit based on comment: plot only 10 most common countries:
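df %>%
  # split on commas plus any spaces after them
  separate_rows(countries, sep = ",\\s*") %>%
  filter(countries != "") %>%  # drop the empty entry
  count(countries) %>%
  slice_max(n, n = 10) %>%  # ties at the cutoff are kept, so this can return more than 10 rows
  ggplot(aes(y = fct_reorder(countries, n), x = n)) +
  geom_col()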
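If you would rather start from the wide cSplit() output in the question (data_split), the same counts can be reached by stacking the countries_* columns into one. This is only a minimal sketch, assuming the split columns are named countries_1, countries_2, and so on as in the printed output. The names country and country_counts are made up for the illustration.

```r
library(tidyverse)
library(splitstackshape)

# Sample data from the question
df <- data.frame(countries = c("England", "Australia,Pakistan", "India,England", "Denmark", "",
                               "Australia, Pakistan, New Zealand, England",
                               "United States, England,Pakistan"))

# Wide result from the question: one column per split position, padded with NA
data_split <- cSplit(df, "countries", ",")

country_counts <- data_split %>%
  as_tibble() %>%
  # make every split column plain character so they stack cleanly
  mutate(across(everything(), as.character)) %>%
  # stack countries_1, countries_2, ... into one column and drop the NA padding
  pivot_longer(starts_with("countries_"), values_to = "country", values_drop_na = TRUE) %>%
  # defensive trim in case any stray whitespace survived the split
  mutate(country = str_trim(country)) %>%
  count(country, sort = TRUE)

# Horizontal bars, most frequent country at the top
ggplot(country_counts, aes(x = n, y = fct_reorder(country, n))) +
  geom_col() +
  labs(x = "Frequency", y = "Country")
```

With the sample data, England should come out on top with a count of 4. Going through separate_rows() directly is shorter, so this route mainly makes sense if data_split is already needed for something else.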
Upvotes: 1