Mengll

Reputation: 317

Plotting histogram with percentages in ggplot2

I am trying to plot a histogram using ggplot2 with percentage on the y-axis and numerical values on the x-axis.

A sample of my data and my script are shown below. The full data set has about 100,000 rows or more.

A      B
0.2    x
1      y
0.995  x
0.5    x
0.5    x
0.2    y

ggplot(data, aes(A, colour = B)) +
  geom_bar() +
  stat_bin(breaks = seq(0, 1, by = 0.05)) +
  scale_y_continuous(labels = percent)

I want the y-axis to show the percentage of B values that fall in each bin of A, rather than the number of B values per bin.

As it is now, the code gives me a y-axis that runs up to 15000, but it is supposed to be in percentages (0-100).

Upvotes: 0

Views: 2355

Answers (1)

Henrik

Reputation: 67778

Is this what you want? I assume your data frame is called df:

# calculate the proportion of all observations in each combination of A and B
df2 <- as.data.frame(with(df, prop.table(table(A, B))))
df2
#       A B      Freq
# 1   0.2 x 0.1666667
# 2   0.5 x 0.3333333
# 3 0.995 x 0.1666667
# 4     1 x 0.0000000
# 5   0.2 y 0.1666667
# 6   0.5 y 0.0000000
# 7 0.995 y 0.0000000
# 8     1 y 0.1666667

ggplot(data = df2, aes(x = A, y = Freq, fill = B)) +
  geom_bar(stat = "identity", position = position_dodge())
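
The table above treats every distinct value of A as its own category. If you want the 0.05-wide bins from your question instead, one option is to bin A with cut() first and then compute the proportions in the same way. This is only a minimal sketch: df is rebuilt from your sample so the code runs on its own, and bin, df3 and Percent are names I made up for illustration.

```r
library(ggplot2)

# Sample data from the question (assumed to live in a data frame called df)
df <- data.frame(
  A = c(0.2, 1, 0.995, 0.5, 0.5, 0.2),
  B = c("x", "y", "x", "x", "x", "y")
)

# Bin A with the breaks from the question;
# include.lowest = TRUE keeps a value of exactly 0 in the first bin
df$bin <- cut(df$A, breaks = seq(0, 1, by = 0.05), include.lowest = TRUE)

# Percentage of all observations in each bin/B combination
tab <- prop.table(table(bin = df$bin, B = df$B)) * 100
df3 <- as.data.frame(tab, responseName = "Percent")

# Dodged bars, one per B value within each bin
p <- ggplot(df3, aes(x = bin, y = Percent, fill = B)) +
  geom_col(position = position_dodge()) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```

Here the percentages sum to 100 over the whole data set. If you would rather have them sum to 100 within each bin, prop.table() takes a margin argument (margin = 1 for the rows of the table), but note that empty bins then give NaN.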


Upvotes: 2
