ECII

Reputation: 10619

Plot frequency of a value of 2 factors in the same plot in R

I'd like to plot the frequency of a variable, color coded by the 2 levels of a factor. For example, blue bars should be the histogram of level A and green bars the histogram of level B, both in the same graph. Is this possible with the hist command? The help page for hist does not mention a factor argument. Is there another way around this?

I managed to do this manually with barplots, but I want to ask whether there is a more automatic method.


Many thanks, EC

PS. I don't need density plots.

Upvotes: 2

Views: 2594

Answers (5)

Atticus29

Reputation: 4412

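In case the other answers don't cover this, here is an approach that worked for me. I had to deal with stacking histograms recently, and here's what I did:

# Keep only the samples that have V1 equal to "Yes"
data_sub <- subset(data, V1 == "Yes")

# Histogram of all samples; keep the result so the breaks can be reused
h <- hist(data$HL)
# Draw the subset on top in red, using the same breaks so the bars line up
hist(data_sub$HL, breaks = h$breaks, col = "red", add = TRUE)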

Hopefully, this is what you meant?
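For readers without the `data` object above, here is a minimal self-contained sketch of the same overlay idea. The data, the names `value` and `group`, and the colors are assumptions made for illustration, not part of the original answer. Both groups share one set of breaks, and semi-transparent fills keep overlapping bars visible.

```r
# Simulated data (hypothetical): a numeric variable and a two-level factor
set.seed(1)
value <- c(rnorm(200, mean = 0), rnorm(200, mean = 1.5))
group <- factor(rep(c("A", "B"), each = 200))

# Shared breaks so the bars of both groups line up
brks <- pretty(range(value), n = 20)

# Compute the per-group counts without plotting
h_a <- hist(value[group == "A"], breaks = brks, plot = FALSE)
h_b <- hist(value[group == "B"], breaks = brks, plot = FALSE)

# Semi-transparent blue and green fills
col_a <- rgb(0, 0, 1, 0.5)
col_b <- rgb(0, 1, 0, 0.5)

# Draw A first, then add B on top
plot(h_a, col = col_a, ylim = c(0, max(h_a$counts, h_b$counts)),
     main = "Frequency by group", xlab = "value")
plot(h_b, col = col_b, add = TRUE)
legend("topright", legend = levels(group), fill = c(col_a, col_b))
```

The y axis shows counts, not densities, because `plot()` on a histogram object with equally spaced breaks defaults to frequencies.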

Upvotes: 1

Tyler Rinker

Reputation: 109864

I agree with the others that a density plot is more useful than merging the colored bars of a histogram, particularly if the groups' values are intermixed. Merging the bars would be very difficult and wouldn't really tell you much. You've got some great suggestions from others on density plots; here's my 2 cents, a density plot that I sometimes use:

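y <- rnorm(1000, 0, 1)
x <- rnorm(1000, 0.5, 2)
# Group must be a factor so that levels(DF$Group) works in the legend below
DF <- data.frame(Group = factor(rep(c("y", "x"), each = 1000)), Value = c(y, x))

library(sm)

# One density curve per level of Group, drawn in the same plot
with(DF, sm.density.compare(Value, Group, xlab = "Grouping"))
title(main = "Comparative Density Graph")
legend(-9, .4, levels(DF$Group), fill = c("red", "darkgreen"))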

Upvotes: 1

IRTFM

Reputation: 263332

It's rather unclear what your data layout is. A histogram requires a variable that is ordinal or continuous so that breaks can be created. If you also have a separate grouping factor, you can plot histograms conditional on that factor. A nice worked example of such grouping, with an overlaid density curve, is the second example on the help page for the histogram function in the lattice package (the second lattice::histogram example).

A nice resource for learning the relative merits of lattice and ggplot2 plotting is the Learning R blog. This is from the first of a multipart series on side-by-side comparison of the two plotting systems:

library(lattice)
library(ggplot2)
data(Chem97, package = "mlmRev")

# The lattice method: one panel per level of score
pl <- histogram(~gcsescore | factor(score), data = Chem97)
print(pl)

Lattice histogram

# The ggplot method: one facet per level of score
pg <- ggplot(Chem97, aes(gcsescore)) +
  geom_histogram(binwidth = 0.5) +
  facet_wrap(~score)
print(pg)
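The faceted plots above put each factor level in its own panel. If both levels should share a single panel, as the question asks, one option in ggplot2 is to map the factor to `fill` and overlay the bars. The following is only a sketch on simulated data; the data frame `df`, its columns, and the colors are assumptions for illustration.

```r
library(ggplot2)

# Simulated data (hypothetical): two groups with different means
set.seed(4)
df <- data.frame(
  value = c(rnorm(500, 0, 1), rnorm(500, 2, 1)),
  group = factor(rep(c("A", "B"), each = 500))
)

# position = "identity" overlays the two groups instead of stacking them;
# alpha makes the overlapping bars visible
p <- ggplot(df, aes(x = value, fill = group)) +
  geom_histogram(binwidth = 0.5, position = "identity", alpha = 0.5) +
  scale_fill_manual(values = c(A = "blue", B = "green")) +
  labs(y = "Frequency")
print(p)
```

Replacing `position = "identity"` with `position = "dodge"` places the two bars of each bin side by side instead of overlaying them.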


Upvotes: 3

Tyler Rinker

Reputation: 109864

It's very possible.

I didn't have data to work with, but here's an example of a histogram with different colored bars. From here you'd need to take my code and figure out how to make it work for factors instead of tails.

# BASIC SETUP (template: fill in the placeholders)
# histogram <- hist(scale(vector), breaks = <breaks>, plot = FALSE)
# plot(histogram, col = ifelse(abs(histogram$mids) < <number of SD>, <color 1>, <color 2>))

# EXAMPLE
x <- rnorm(1000)
histogram <- hist(scale(x), breaks = 20, plot = FALSE)
# One color per bar, chosen by the bar's midpoint
# (histogram$breaks has one more element than there are bars)
plot(histogram, col = ifelse(abs(histogram$mids) < 2, "red", "green"))
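One possible way to adapt this to a factor, sketched here under assumptions and not taken from the original answer, is to compute the counts per factor level on common breaks and hand the resulting matrix to `barplot()`, which stacks the levels within each bin. The simulated data and the names `value`, `group`, and `counts` are hypothetical.

```r
# Simulated data (hypothetical): a numeric variable and a two-level factor
set.seed(2)
value <- c(rnorm(300, 0, 1), rnorm(300, 0.5, 2))
group <- factor(rep(c("A", "B"), each = 300))

# Common breaks taken from a histogram of the pooled data
brks <- hist(value, breaks = 20, plot = FALSE)$breaks

# One row of counts per factor level, one column per bin
counts <- t(sapply(split(value, group),
                   function(v) hist(v, breaks = brks, plot = FALSE)$counts))

# Stacked bars: each bin's bar is split by color into the two levels
barplot(counts, space = 0, col = c("blue", "green"),
        names.arg = head(brks, -1), las = 2,
        xlab = "lower bin edge", ylab = "Frequency",
        legend.text = levels(group))
```

With `space = 0` the bars touch, so the result looks like a histogram whose bars are divided by group.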

Upvotes: 2

Tomas

Reputation: 59465

I don't think you can do that easily with a bar histogram, as you would have to "interlace" the bars from both factor levels. That would need some kind of "discretization" of the now continuous x axis, i.e. it would have to be split into "categories", and in each category you would have 2 bars, one for each factor level.

But it is quite easy and unproblematic if you are fine with plotting the density curves instead:

y <- rnorm(1000, 0, 1)
x <- rnorm(1000, 0.5, 2)
dx <- density(x)
dy <- density(y)
plot(dx, xlim = range(dx$x, dy$x), ylim = range(dx$y, dy$y), 
     type = "l", col = "red")
lines(dy, col = "blue")
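For completeness, the "interlaced" bars described at the start of this answer can be sketched with `cut()` and `barplot(beside = TRUE)`. This is only an illustration on the same kind of simulated data; the bin width of 1 and the names `value`, `group`, `bins`, and `tab` are assumptions.

```r
# Simulated data (hypothetical), with the same distributions as above
set.seed(3)
value <- c(rnorm(1000, 0, 1), rnorm(1000, 0.5, 2))
group <- factor(rep(c("y", "x"), each = 1000))

# Discretize the continuous axis into bins of width 1
brks <- seq(floor(min(value)), ceiling(max(value)), by = 1)
bins <- cut(value, breaks = brks, include.lowest = TRUE)

# Rows = factor levels (x, y), columns = bins
tab <- table(group, bins)

# beside = TRUE draws the two levels next to each other within each bin
barplot(tab, beside = TRUE, col = c("red", "blue"), las = 2,
        ylab = "Frequency", legend.text = TRUE)
```

This shows counts, not densities, at the cost of a coarser x axis.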


Upvotes: 1
