Reputation: 3424
I have some R code that looks like this:
roundup <- function(x) {
  return(as.integer(ceiling(x / 10.0)) * 10)
}
uni <- read.csv(filepath_1, header = FALSE)
end_nodes <- read.csv(filepath_2, header = FALSE)
min_end_nodes <- if (min(end_nodes$V1) == 0) 1 else min(end_nodes$V1)
max_end_nodes <- roundup(max(end_nodes$V1))
hist(uni$V1, freq = FALSE)
X11()
hist(end_nodes$V1, freq = FALSE)
h <- hist(end_nodes$V1, breaks = seq(min_end_nodes - 1, max_end_nodes, by = 1), plot = FALSE)
h$counts = h$counts / sum(h$counts)
plot(h)
X11()
min_uni <- if (min(uni$V1) == 0) 1 else min(uni$V1)
max_uni <- roundup(max(uni$V1))
h <- hist(uni$V1, breaks = seq(min_uni - 1, max_uni, by = 1), plot = FALSE)
h$counts = h$counts / sum(h$counts)
plot(h)
This works and creates two histograms for me, and they look like this:
Both histograms have very similar distributions and are nearly identical, so I want to stack the two in a single plot and see where and by how much they differ. Additionally, I do not want to use the plotting functions that come with R; I want to use ggplot2 instead. I have already found some similar questions on SO, such as this, but I couldn't manage to create anything meaningful for my case. Any ideas on how to use ggplot2 to stack two histograms like the ones above?
EDIT:
Both of my datasets consist of integers between 1 and 6, but they are not exactly the same size: one has slightly fewer values than the other. I can add some dummy 0 values to make them the same size if that is a problem. Anyway, the data look like uni = [2,2,1,2,2,1,1,5,3...], end_nodes = [1,6,6,4,3,3,2,2,2...].
Upvotes: 0
Views: 398
Reputation: 1438
I cannot replicate your code because I do not have your data set, but would something like this work?
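library(tidyverse)
# Simulated stand-in data: two columns of random normal values
dat <- data.frame(x = rnorm(10000, 4, 3),
                  y = rnorm(10000, 2, 2)) %>%
  gather(var, value)  # reshape to long format: one row per (var, value) pair
# Draw both histograms on the same axes; alpha keeps the overlap visible
ggplot(dat, aes(value, fill = var)) +
  geom_histogram(alpha = 0.75, position = "identity", bins = 75)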
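Since your values are integers from 1 to 6 and the two datasets differ in length, a variation along these lines might suit your case better. This is only a sketch: `uni_values` and `end_values` are made-up stand-ins for `uni$V1` and `end_nodes$V1`, and I am assuming you want per-dataset proportions (as in your `h$counts / sum(h$counts)` step) so that the unequal sizes do not matter and no dummy values are needed.

```r
library(ggplot2)

# Hypothetical stand-ins for uni$V1 and end_nodes$V1 (assumption, not your data):
# integers from 1 to 6, deliberately of different lengths.
set.seed(1)
uni_values <- sample(1:6, 500, replace = TRUE)
end_values <- sample(1:6, 480, replace = TRUE)

# Long format: one row per observation, labelled by the dataset it came from.
dat <- rbind(
  data.frame(var = "uni", value = uni_values),
  data.frame(var = "end_nodes", value = end_values)
)

# Proportion of each value within each dataset (each row of the table sums to 1).
tab <- prop.table(table(var = dat$var, value = dat$value), margin = 1)
props <- as.data.frame(tab, responseName = "prop")

# Side-by-side bars make small differences between the two datasets easy to see.
p <- ggplot(props, aes(x = value, y = prop, fill = var)) +
  geom_col(position = "dodge") +
  labs(x = "value", y = "proportion")
print(p)
```

If you would rather have the bars drawn on top of each other, as in the histogram above, swapping `position = "dodge"` for `position = "identity"` and adding an `alpha` value should do it.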
Upvotes: 4