conorfrailey

Reputation: 43

Normalize a ggplot histogram so that the first bar height is 1 (to show growth) in R

I was wondering if there is a way to normalize the heights of a histogram with multiple groups so that the first bar of each group has a height of 1. For instance:

library(ggplot2)

results <- rep(c(1,1,2,2,2,3,1,1,1,3,4,4,2,5,7),3)
category <- rep(c("a","b","c"),15)
data <- data.frame(results,category)
p <- ggplot(data, aes(x=results, fill = category, y = ..count..))
p + geom_histogram(position = "dodge")

gives a regular histogram with 3 groups. Also

results <- rep(c(1,1,2,2,2,3,1,1,1,3,4,4,2,5,7),3)
category <- rep(c("a","b","c"),15)
data <- data.frame(results,category)
p <- ggplot(data, aes(x=results, fill = category, y = ..ncount..))
p + geom_histogram(position = "dodge")

gives a histogram where each group is normalized to have a maximum height of 1. I want a histogram where each group is normalized so that its first bar has a height of 1 (so I can show growth), but I don't know whether there is an appropriate alternative to ..ncount.. or ..count.. for this. If anyone can help me understand the structure of ..count.., I could maybe figure it out from there. Thanks!

Upvotes: 4

Views: 1395

Answers (2)

Henrik

Reputation: 67778

I bet there is a nice way to do everything within ggplot. However, I tend to prefer preparing the desired data set before I plug it into ggplot. If I understood you correctly, you may try something like this:

# start from the question's data frame ('data'), copied to 'df'
df <- data

# convert 'results' to factor and set levels to get an equi-spaced 'results' x-axis
df$results <- factor(df$results, levels = 1:7)

# for each category, count frequency of 'results'
df <- as.data.frame(with(df, table(results, category)))

# normalize: for each category, divide all 'Freq' (heights) by the first 'Freq'
df$freq2 <- with(df, ave(Freq, category, FUN = function(x) x/x[1]))

ggplot(data = df, aes(x = results, y = freq2, fill = category)) +
  geom_bar(stat = "identity", position = "dodge")
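
To see what the normalization step produces without plotting, here is a minimal base-R sketch on the question's data. It uses table() and sweep() instead of ave(), which is my substitution and not part of the code above, and it assumes every category has a non-zero count in the first bin (otherwise the division gives Inf or NaN).

```r
# Same data as in the question
results <- rep(c(1,1,2,2,2,3,1,1,1,3,4,4,2,5,7), 3)
category <- rep(c("a","b","c"), 15)

# 7 x 3 table of counts: rows are result values 1..7, columns are categories
counts <- table(factor(results, levels = 1:7), category)

# Divide each column (category) by its own first-row count,
# so the first bar of every category becomes 1.
# Assumption: counts[1, ] contains no zeros.
growth <- sweep(counts, 2, counts[1, ], "/")

print(counts)
print(growth)
```

The first row of growth is 1 for every category, and the remaining rows give each bar's height relative to that first bar.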

Upvotes: 2

colcarroll

Reputation: 3682

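It looks like ..density.. might do what you want, but I can't for the life of me find documentation on it. As far as I can tell it rescales each group so that its bars have a total area of 1, so it is worth checking that the first bar actually comes out at 1 for your data.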

results <- rep(c(1,1,2,2,2,3,1,1,1,3,4,4,2,5,7),3)
category <- rep(c("a","b","c"),15)
data <- data.frame(results,category)
p <- ggplot(data, aes(x=results, fill = category, y = ..density..))
p + geom_histogram(position = "dodge")
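
For reference, here is a rough base-R sketch of what a density scaling computes, as far as I understand it: each count is divided by the group size times the bin width. It uses hist() with unit-width bins centred on the integers, which is my assumption and not the binning geom_histogram picks by default, so the numbers will differ from the plot.

```r
# Same data as in the question
results <- rep(c(1,1,2,2,2,3,1,1,1,3,4,4,2,5,7), 3)
category <- rep(c("a","b","c"), 15)

# Assumption: unit-width bins centred on the integers 1..7
breaks <- seq(0.5, 7.5, by = 1)

# Bin category "a" only, without drawing anything
h <- hist(results[category == "a"], breaks = breaks, plot = FALSE)

# density = count / (n * binwidth), so the bar areas sum to 1
manual_density <- h$counts / (sum(h$counts) * diff(breaks))

print(h$counts)
print(manual_density)
print(sum(manual_density * diff(breaks)))
```

Under this scaling the height of the first bar depends on the group size and bin width, so it is 1 only by coincidence.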

Upvotes: 0
