ammar khalil

Reputation: 41

Box-plot with respect to 2 factor variables

I have one numeric variable and two factor variables, one with two levels and the other with three. I want to draw box plots so that, for each level of the second factor, the numeric variable is split by the two levels of the first factor.

set.seed(100)
x <- rnorm(n = 500, mean = 25, sd = 5)
status <- sample(c(rep(x = "paid", 218), rep("non-paid", 282)))
category <- sample(c(rep("action", 193), rep("product", 129), rep("inspiration", 178)))
df <- data.frame(x, status, category)

boxplot(df$x ~ df$status[df$category == "action"])

However, it gives an error saying that the variable lengths differ.

Upvotes: 0

Views: 655

Answers (1)

IRTFM

Reputation: 263451

You either need to use a data argument (possibly accompanied by a subset argument), or have identical selection rules on both sides of the formula:

boxplot(df$x[df$category == "action"] ~ df$status[df$category == "action"])

Or:

boxplot(x ~ status, data = df[df$category == "action", ])

Or:

boxplot(x ~ status, data = df, subset = (category == "action"))
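
To see why the original call fails and that the three forms above agree, here is a small check that rebuilds the question's data. The names keep, b1, b2 and b3 are introduced only for this illustration, and plot = FALSE is used so that boxplot() returns the summary statistics without drawing anything.

```r
# Rebuild the data from the question
set.seed(100)
x <- rnorm(n = 500, mean = 25, sd = 5)
status <- sample(c(rep("paid", 218), rep("non-paid", 282)))
category <- sample(c(rep("action", 193), rep("product", 129), rep("inspiration", 178)))
df <- data.frame(x, status, category)

# Logical selector for one level of category (illustrative name)
keep <- df$category == "action"

# The two sides of the original formula have different lengths
length(df$x)             # 500
length(df$status[keep])  # 193

# The three corrected forms, computed without plotting
b1 <- boxplot(df$x[keep] ~ df$status[keep], plot = FALSE)
b2 <- boxplot(x ~ status, data = df[keep, ], plot = FALSE)
b3 <- boxplot(x ~ status, data = df, subset = (category == "action"), plot = FALSE)

# All three summarise the same rows, so the five-number summaries match
identical(b1$stats, b2$stats)
identical(b2$stats, b3$stats)
```

The data/subset forms are usually easier to read because the selection rule is written only once.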

A nice way to show all three levels of category at once would be to use an interaction term on the RHS, which gives one box for each status/category combination:

boxplot(x ~ interaction(status, category), data = df)
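
If separate panels per category are closer to what the question describes, one possible sketch is to loop over the levels and reuse the data-subsetting form. The names groups, op and lev are illustrative, and the side-by-side layout via par(mfrow = ...) is just one assumption about how the panels might be arranged.

```r
# Rebuild the data from the question
set.seed(100)
x <- rnorm(n = 500, mean = 25, sd = 5)
status <- sample(c(rep("paid", 218), rep("non-paid", 282)))
category <- sample(c(rep("action", 193), rep("product", 129), rep("inspiration", 178)))
df <- data.frame(x, status, category)

# The interaction term yields one group per status/category combination
groups <- boxplot(x ~ interaction(status, category), data = df, plot = FALSE)
length(groups$names)  # 6 (2 status levels x 3 category levels)

# One panel per category, each split by status
op <- par(mfrow = c(1, 3))            # three panels side by side
for (lev in sort(unique(as.character(df$category)))) {
  boxplot(x ~ status, data = df[df$category == lev, ], main = lev)
}
par(op)                               # restore the previous layout
```

The single interaction plot keeps all boxes on a common axis, while the panel layout makes the paid versus non-paid comparison within each category easier to read.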

Upvotes: 1
