pisistrato

Reputation: 395

Multiple histogram plots using facet_wrap

I have a data frame that looks like this

x <- data.frame("raw_A" = runif(20, 2, 10), "raw_B" = runif(20, 2, 10),
                "mod_A" = runif(20, 2, 10), "mod_B" = runif(20, 2, 10),
                "modmod_A" = runif(20, 2, 10), "modmod_B" = runif(20, 2, 10),
                "raw_C" = runif(20, 2, 10), "raw_D" = runif(20, 2, 10),
                "mod_C" = runif(20, 2, 10), "mod_D" = runif(20, 2, 10),
                "modmod_C" = runif(20, 2, 10), "modmod_D" = runif(20, 2, 10),
                "raw_E" = runif(20, 2, 10), "raw_F" = runif(20, 2, 10),
                "mod_E" = runif(20, 2, 10), "mod_F" = runif(20, 2, 10),
                "modmod_E" = runif(20, 2, 10), "modmod_F" = runif(20, 2, 10))

What I would like to do is use ggplot to plot a series of histograms

geom_histogram(position = "identity", alpha = 0.8, bins = 100)

(A vs B, C vs D, and E vs F).

Using facet_wrap, I want A vs B in the first column, C vs D in the second column, and E vs F in the third.

At the same time, I want raw_ in the first row, mod_ in the second row, and modmod_ in the last row,

such as

raw_A vs raw_B       |       raw_C vs raw_D       |       raw_E vs raw_F

mod_A vs mod_B       |       mod_C vs mod_D       |       mod_E vs mod_F

modmod_A vs modmod_B |    modmod_C vs modmod_D    |    modmod_E vs modmod_F

How can I do this?

Upvotes: 1

Views: 3576

Answers (1)

Prradep

Reputation: 5716

Since you need only specific combinations of the variables, it is better to build them selectively. One option is to construct the data set explicitly:

df <- rbind(data.frame(x=x$raw_A, y=x$raw_B, comb='raw_A vs raw_B'),
            data.frame(x=x$raw_C, y=x$raw_D, comb='raw_C vs raw_D'),
            data.frame(x=x$raw_E, y=x$raw_F, comb='raw_E vs raw_F'),

            data.frame(x=x$mod_A, y=x$mod_B, comb='mod_A vs mod_B'),
            data.frame(x=x$mod_C, y=x$mod_D, comb='mod_C vs mod_D'),
            data.frame(x=x$mod_E, y=x$mod_F, comb='mod_E vs mod_F'),

            data.frame(x=x$modmod_A, y=x$modmod_B, comb='modmod_A vs modmod_B'),
            data.frame(x=x$modmod_C, y=x$modmod_D, comb='modmod_C vs modmod_D'),
            data.frame(x=x$modmod_E, y=x$modmod_F, comb='modmod_E vs modmod_F')
            )
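If the list of combinations grows, the same data frame could be assembled in a loop rather than written out by hand. The following is only a sketch in base R: the helper names `prefixes`, `pairs`, `combos` and `pieces` are made up for illustration, and the example data are regenerated with an arbitrary seed so that the block runs on its own.

```r
# Recreate example data with the same 18 column names as in the question
set.seed(1)  # arbitrary seed, only for reproducibility
prefixes <- c("raw", "mod", "modmod")
vars <- as.vector(outer(prefixes, LETTERS[1:6], paste, sep = "_"))
x <- as.data.frame(setNames(replicate(length(vars), runif(20, 2, 10),
                                      simplify = FALSE), vars))

# Pairs to compare: the first letter goes on x, the second on y
pairs <- list(c("A", "B"), c("C", "D"), c("E", "F"))

# All prefix/pair combinations; `pair` varies fastest, so the rows follow
# the desired panel order (raw first, then mod, then modmod)
combos <- expand.grid(pair = seq_along(pairs), prefix = prefixes,
                      stringsAsFactors = FALSE)

# One small data frame per combination, then stack them
pieces <- lapply(seq_len(nrow(combos)), function(k) {
  p      <- combos$prefix[k]
  ab     <- pairs[[combos$pair[k]]]
  first  <- paste(p, ab[1], sep = "_")
  second <- paste(p, ab[2], sep = "_")
  data.frame(x = x[[first]], y = x[[second]],
             comb = paste(first, "vs", second))
})
df <- do.call(rbind, pieces)
```

The result has the same three columns (x, y, comb) as the hand-written version, so the plotting code is unchanged.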

Then plot using the facet variable comb, which holds the required combinations:

ggplot(df, aes(x, y)) + geom_point() + facet_wrap(~comb)

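One caveat about panel order: facet_wrap arranges panels by the factor levels of the facet variable. Since R 4.0, data.frame() keeps strings as character vectors, and these end up in alphabetical order, so the mod_ panels would come before the raw_ ones. A possible way to pin the requested layout is to turn comb into a factor with explicit levels and set ncol = 3. The sketch below uses a stand-in `df` with random values and the same nine labels; `prefixes`, `pairs` and `labels` are illustrative names, not part of the original answer.

```r
library(ggplot2)

# Stand-in for `df`: 20 random points for each of the nine combinations
set.seed(1)  # arbitrary seed, only for reproducibility
prefixes <- c("raw", "mod", "modmod")
pairs <- list(c("A", "B"), c("C", "D"), c("E", "F"))
labels <- unlist(lapply(prefixes, function(p) {
  sapply(pairs, function(ab) paste0(p, "_", ab[1], " vs ", p, "_", ab[2]))
}))
df <- data.frame(x = runif(180, 2, 10), y = runif(180, 2, 10),
                 comb = rep(labels, each = 20))

# Fix the panel order to the order in which the combinations first appear
df$comb <- factor(df$comb, levels = unique(as.character(df$comb)))

# ncol = 3 gives one row per prefix: raw, mod, modmod
p <- ggplot(df, aes(x, y)) + geom_point() + facet_wrap(~comb, ncol = 3)
print(p)
```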


The values in your example all lie strictly within the range 2-10, because they are generated with runif(20, 2, 10). In other scenarios, where the variables cover different ranges, you can use the scales argument of facet_wrap.

Assume the variables below were generated with a different range, and the rest are the same as in the original data set.

        "modmod_A" = runif(20, 2, 6), "modmod_B" = runif(20, 2, 6), 
        "modmod_C" = runif(20, 2, 6), "modmod_D" = runif(20, 2, 6), 
        "modmod_E" = runif(20, 2, 6), "modmod_F" = runif(20, 2, 6)

The two plots below show the difference: the first uses the default fixed scales, the second frees the scales for each panel.

ggplot(df, aes(x, y)) + geom_point() + facet_wrap(~comb)


ggplot(df, aes(x, y)) + geom_point() + facet_wrap(~comb, scales="free")

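The plots above are scatter plots. If overlaid histograms are wanted, as in the question, one possible approach is to reshape the data to long format and map the variable letter to fill. This is a sketch under assumptions rather than a definitive solution: it regenerates the example data with an arbitrary seed, the names `long` and `pair_of` are invented for illustration, and it uses facet_grid for the rows-by-columns layout.

```r
library(ggplot2)

# Recreate example data with the same 18 column names as in the question
set.seed(1)  # arbitrary seed, only for reproducibility
prefixes <- c("raw", "mod", "modmod")
vars <- as.vector(outer(prefixes, LETTERS[1:6], paste, sep = "_"))
x <- as.data.frame(setNames(replicate(length(vars), runif(20, 2, 10),
                                      simplify = FALSE), vars))

# Long format: one row per value; `ind` holds the original column name
long <- stack(x)
long$prefix <- sub("_.*$", "", long$ind)  # "raw", "mod" or "modmod"
long$letter <- sub("^.*_", "", long$ind)  # "A" ... "F"

# Map each letter to its comparison pair (the facet columns)
pair_of <- c(A = "A vs B", B = "A vs B", C = "C vs D",
             D = "C vs D", E = "E vs F", F = "E vs F")
long$pair <- unname(pair_of[long$letter])

# Row order: raw, mod, modmod
long$prefix <- factor(long$prefix, levels = prefixes)

# Two overlaid histograms per panel, using the geom call from the question
p <- ggplot(long, aes(values, fill = letter)) +
  geom_histogram(position = "identity", alpha = 0.8, bins = 100) +
  facet_grid(prefix ~ pair)
print(p)
```

Replacing facet_grid(prefix ~ pair) with facet_wrap(~ prefix + pair, ncol = 3) should give the same arrangement with one strip per panel, and scales = "free" can be added in the same way as above.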

Upvotes: 1
