ccamara

Reputation: 1225

Relative Y values in ggplot instead of absolute

Given this data frame, obtained from a questionnaire answered by people from different neighborhoods, I'd like to create a bar plot showing the degree of identification per neighborhood.

In fact I managed to do it with the following code:

library(ggplot2)
df = read.csv("http://pastebin.com/raw.php?i=77QPBc5T")

ggplot(df,
       aes(x = factor(Identificación.con.el.barrio),
           fill = Nombre.barrio)
) +
  geom_bar(position="dodge") +
  ggtitle("¿Te identificas con tu barrio?") +
  labs(x="Grado de identificación con el barrio", fill="Barrios")

This produces a dodged bar plot of absolute counts per answer, with one bar for each neighborhood.

However, since each neighborhood has a different population, the sample size per neighborhood also differs a lot (e.g. Arcosur has only 24 respondents whereas Arrabal has 69), and thus the results may be misleading (see below).

library(dplyr)

df = tbl_df(df)

df %>%
  group_by(Nombre.barrio) %>%
  summarise(Total = n())

Source: local data frame [10 x 2]

   Nombre.barrio Total
1       Almozara    68
2        Arcosur    24
3        Arrabal    69
4       Bombarda    20
5       Delicias    68
6          Jesús    69
7      La Bozada    32
8    Las fuentes    64
9         Oliver    68
10      Picarral    68

For this reason I'd like to have relative values on the y axis, displaying the percentage of respondents per neighborhood that gave each of the possible answers. Unfortunately I have no idea how to achieve this, since I am pretty new to R.

Upvotes: 2

Views: 2309

Answers (1)

scoa

Reputation: 19867

library(ggplot2)
library(dplyr)
df = read.csv("http://pastebin.com/raw.php?i=77QPBc5T")

df = tbl_df(df)

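d <- df %>%
  group_by(Nombre.barrio, Identificación.con.el.barrio) %>%
  # count respondents per neighborhood and answer
  summarise(Total = n()) %>%
  # the result is still grouped by Nombre.barrio, so sum(Total) is the
  # neighborhood's sample size and freq is the share within that neighborhood
  mutate(freq = Total / sum(Total))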

ggplot(d,
       aes(x = factor(Identificación.con.el.barrio),
           y=freq,
           fill = Nombre.barrio)
) +
  geom_bar(position="dodge",stat="identity") +
  ggtitle("¿Te identificas con tu barrio?") +
  labs(x="Grado de identificación con el barrio", fill="Barrios")

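To see why this gives per-neighborhood shares: summarise() drops only the last grouping variable, so the following mutate() divides each count by the total of its own neighborhood. Here is a minimal sketch of the same idea on a small made-up data frame (`toy`, `barrio` and `answer` are hypothetical names, not columns of the questionnaire). It uses count() and geom_col(), which should be equivalent to the summarise(Total = n()) and geom_bar(stat="identity") calls above, and it assumes you want the y axis labelled as percentages via scales::percent.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the questionnaire:
# neighborhood "A" has 4 respondents, neighborhood "B" has only 2.
toy <- data.frame(
  barrio = c("A", "A", "A", "A", "B", "B"),
  answer = c(1, 1, 2, 3, 1, 2)
)

rel <- toy %>%
  count(barrio, answer, name = "Total") %>%  # one row per barrio/answer pair
  group_by(barrio) %>%                       # normalize within each neighborhood
  mutate(freq = Total / sum(Total)) %>%
  ungroup()

# Same plot layout as above, with the y axis shown as percentages.
p <- ggplot(rel, aes(x = factor(answer), y = freq, fill = barrio)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent)
```

Within each neighborhood the freq values add up to 1, so neighborhoods with small samples are no longer dwarfed by the larger ones. Printing `p` draws the plot.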

Upvotes: 1
