ChinookJargon

Reputation: 99

Trouble with ggplot2 position = "fill"

I am having trouble creating a position = "fill" graph and was wondering where I was going wrong. I found some data on a Wikipedia page and created a data frame out of it:

# Vitamin content (% DV) of common cheeses per 100 g
Cheese <- c("Swiss", "Feta", "Cheddar","Mozarella", "Cottage")
A <- c(17, 8, 20, 14, 3)
B1 <- c(4, 10, 2, 2, 2)
B2 <- c(17, 50, 22, 17,10)
B3 <- c(0,5,0,1,0)
B5 <- c(4, 10, 4, 1, 6)
B6 <- c(4, 21, 4, 2, 2)
B9 <- c(1, 8, 5, 2, 3)
B12 <- c(56, 28, 14, 38, 7)
Ch <- c(2.8, 2.2, 3, 2.8, 3.3)
C <- c(0, 0, 0, 0, 0)
D <- c(11, 0, 3, 0, 0)
E <- c(2, 1, 1, 1, 0)

Cheese_Vitamins <- data.frame(Cheese, A, B1, B2, B3, B5, B6, B9, B12, Ch, C, D, E)

I then converted the data.frame from wide to long format using the gather function from the tidyr package:

library(tidyr)

long.cheese.vit <- gather(Cheese_Vitamins, Vitamins, Percentage, c(A, B1, B2, B3, B5, B6, B9, B12, Ch, C, D, E))

However, when I try to plot a filled bar chart:

ggplot(long.cheese.vit, aes(Cheese, fill = Percentage)) +
  geom_bar(position = "fill")

I don't get the result that I am looking for:

[Image: Cheese Vitamin content]

What I am trying to create is a filled bar graph with the vitamin content broken down by percentage for each cheese. Any suggestions would be helpful. Thank you!

Upvotes: 0

Views: 996

Answers (2)

neilfws

Reputation: 33802

The values for each vitamin are percentages of daily value, per vitamin. So it is not appropriate to stack the values, or to fill them as proportions of 100%.
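
To make this concrete, here is a minimal sketch in base R that adds up the %DV values across vitamins for each cheese, using the data frame from the question. The name totals is introduced here only for illustration. The sums differ widely between cheeses and none of them is 100, so the values are not shares of a common whole.

```r
# Data copied from the question: % daily value per 100 g
Cheese_Vitamins <- data.frame(
  Cheese = c("Swiss", "Feta", "Cheddar", "Mozarella", "Cottage"),
  A   = c(17, 8, 20, 14, 3),
  B1  = c(4, 10, 2, 2, 2),
  B2  = c(17, 50, 22, 17, 10),
  B3  = c(0, 5, 0, 1, 0),
  B5  = c(4, 10, 4, 1, 6),
  B6  = c(4, 21, 4, 2, 2),
  B9  = c(1, 8, 5, 2, 3),
  B12 = c(56, 28, 14, 38, 7),
  Ch  = c(2.8, 2.2, 3, 2.8, 3.3),
  C   = c(0, 0, 0, 0, 0),
  D   = c(11, 0, 3, 0, 0),
  E   = c(2, 1, 1, 1, 0)
)

# Sum the %DV values of all vitamins per cheese (first column holds the names)
totals <- setNames(rowSums(Cheese_Vitamins[-1]), Cheese_Vitamins$Cheese)
print(totals)
```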

One option is to fill by vitamin but to dodge the bars. I'm using rvest to grab the data straight from the Wikipedia table:

library(rvest)
library(tidyr)
library(ggplot2)

read_html("https://en.wikipedia.org/wiki/Cheese#Nutrition_and_health") %>%
  html_node("#mw-content-text > div > table:nth-child(98)") %>% 
  html_table() %>%
  gather(vitamin, percentage, -Cheese) %>% 
  ggplot(aes(Cheese, percentage)) + 
    geom_col(aes(fill = vitamin), position = "dodge")
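
The selector table:nth-child(98) depends on the page layout at the time of writing, so it may stop matching if the article is edited. As a sketch that needs no network access, the same dodged plot can be built from the data frame in the question. This assumes those values match the Wikipedia table; the names cheese_long and p_dodge are made up for the example.

```r
library(tidyr)
library(ggplot2)

# Data copied from the question: % daily value per 100 g
Cheese_Vitamins <- data.frame(
  Cheese = c("Swiss", "Feta", "Cheddar", "Mozarella", "Cottage"),
  A   = c(17, 8, 20, 14, 3),
  B1  = c(4, 10, 2, 2, 2),
  B2  = c(17, 50, 22, 17, 10),
  B3  = c(0, 5, 0, 1, 0),
  B5  = c(4, 10, 4, 1, 6),
  B6  = c(4, 21, 4, 2, 2),
  B9  = c(1, 8, 5, 2, 3),
  B12 = c(56, 28, 14, 38, 7),
  Ch  = c(2.8, 2.2, 3, 2.8, 3.3),
  C   = c(0, 0, 0, 0, 0),
  D   = c(11, 0, 3, 0, 0),
  E   = c(2, 1, 1, 1, 0)
)

# Wide to long: one row per cheese/vitamin pair
cheese_long <- gather(Cheese_Vitamins, vitamin, percentage, -Cheese)

# One bar per vitamin, placed side by side within each cheese
p_dodge <- ggplot(cheese_long, aes(Cheese, percentage)) +
  geom_col(aes(fill = vitamin), position = "dodge")
print(p_dodge)
```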

This is OK but with 13 vitamins, the colours become difficult to distinguish. A better option might be to use facets for the vitamins.

read_html("https://en.wikipedia.org/wiki/Cheese#Nutrition_and_health") %>%
  html_node("#mw-content-text > div > table:nth-child(98)") %>% 
  html_table() %>%
  gather(vitamin, percentage, -Cheese) %>% 
  ggplot(aes(Cheese, percentage)) + 
    geom_col() + 
    facet_grid(vitamin ~ .)

You might also consider reversing the facets using facet_grid(. ~ vitamin). That results in rather cluttered x-axis labels but makes vitamin comparison easier. So perhaps fill by cheese and remove the labels:

read_html("https://en.wikipedia.org/wiki/Cheese#Nutrition_and_health") %>%
  html_node("#mw-content-text > div > table:nth-child(98)") %>% 
  html_table() %>%
  gather(vitamin, percentage, -Cheese) %>% 
  ggplot(aes(Cheese, percentage)) + 
    geom_col(aes(fill = Cheese)) + 
    facet_grid(. ~ vitamin) + 
    theme(axis.text.x = element_blank())

Upvotes: 3

ccapizzano

Reputation: 1616

You need to map the Percentage column to y and set stat = "identity" in geom_bar. Setting position alone is not enough, because the default stat counts rows instead of using your values. Note, however, that your percentages per cheese don't add up to 100%, so I will update my answer once I know more...

ggplot(long.cheese.vit, aes(x = Cheese, y = Percentage, fill = Vitamins)) +
  geom_bar(stat = "identity")
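
If the filled look from the question is still wanted, the two arguments can be combined: with y mapped, stat = "identity" together with position = "fill" rescales each cheese's stack to a total height of 1. This is only a sketch using the question's data, and p_fill is a name introduced here. The resulting shares are fractions of summed %DV values, which the other answer argues is not an appropriate quantity, so read the plot with care.

```r
library(tidyr)
library(ggplot2)

# Data copied from the question: % daily value per 100 g
Cheese_Vitamins <- data.frame(
  Cheese = c("Swiss", "Feta", "Cheddar", "Mozarella", "Cottage"),
  A   = c(17, 8, 20, 14, 3),
  B1  = c(4, 10, 2, 2, 2),
  B2  = c(17, 50, 22, 17, 10),
  B3  = c(0, 5, 0, 1, 0),
  B5  = c(4, 10, 4, 1, 6),
  B6  = c(4, 21, 4, 2, 2),
  B9  = c(1, 8, 5, 2, 3),
  B12 = c(56, 28, 14, 38, 7),
  Ch  = c(2.8, 2.2, 3, 2.8, 3.3),
  C   = c(0, 0, 0, 0, 0),
  D   = c(11, 0, 3, 0, 0),
  E   = c(2, 1, 1, 1, 0)
)

# Wide to long, using the column names from the question
long.cheese.vit <- gather(Cheese_Vitamins, Vitamins, Percentage, -Cheese)

# stat = "identity" uses Percentage as bar height;
# position = "fill" then rescales each cheese's stack to sum to 1
p_fill <- ggplot(long.cheese.vit, aes(x = Cheese, y = Percentage, fill = Vitamins)) +
  geom_bar(stat = "identity", position = "fill")
print(p_fill)
```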

Upvotes: 1
