user2468043

Reputation: 35

Combine two boxplots from different dataframes in R

I would like to add a boxplot from a different data frame to an existing boxplot graph. My first data frame has a continuous variable grouped by a factor variable with 5 levels. I create the boxplot with this code:

boxplot(mydata$Height ~ mydata$group, xlab="", ylab="", ylim=c(120, 260), col=c("chartreuse1", "gray87", "gray87", "gray87", "gray87"), las=1)

Now I need to add to the same graph another boxplot, from another data.frame, that represents the range of normality. My first try was to add this code:

boxplot (control$height, add=TRUE)

The data for the variables in the first data frame are:

      Height  group
1   160.5401     IC
2   152.1736     IC
6   135.2394     IC
7   138.8716     IC
8   150.3041     IC
9   163.8295     IC
10  141.1793     IC
11  152.1263     IC
12  175.3540     IC
13  133.9237     IP
14  131.2115     IP
15  134.8984     IP
16  134.2888     IP
17  132.0721     IP
18  131.6538     IP
19  134.0276     IP
20  140.5256     IP
21  135.6092     IP
24  141.6863     IP
25  165.4456     TC
26  238.7608     TC
27  162.2336     TC
28  197.7274     TC
29  163.0832     TC

However, the control boxplot is drawn on top of the first group's box, and the result is a mess...

Any help will be appreciated.

Thanks

Upvotes: 2

Views: 1837

Answers (1)

r2evans

Reputation: 160447

The simplest approach is likely to use the at argument, although it does have at least one minor "glitch":

## using your "mydata"
boxplot(Height ~ group, data=mydata,
        xlim=c(0.5, 1.5 + length(unique(mydata$group))))
boxplot(mydata$Height, at=1 + length(unique(mydata$group)), add=TRUE)

(I generalized the setting of xlim so that it also works when more than 3 levels are found.)

The problem is that the fourth boxplot is not the same width as the others. You can play with the use of boxwex in either call to compensate for this (unfortunately, using the same value for both boxwex will not give the same width).
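As a rough sketch of the boxwex adjustment: the code below assumes that a lone box is drawn at half of its boxwex, which is how the widths appear to work out in base R's bxp, so doubling boxwex in the add=TRUE call is a reasonable starting point. The control data frame is simulated purely for illustration, and the Height values are a rounded subset of the question's data. Check the result by eye and tune the factor if the widths still differ.

```r
# Rounded subset of the question's data (three groups).
mydata <- data.frame(
  Height = c(160.5, 152.2, 135.2, 133.9, 131.2, 134.9, 165.4, 238.8, 162.2),
  group  = factor(rep(c("IC", "IP", "TC"), each = 3))
)

# Hypothetical control data, simulated for illustration only.
set.seed(1)
control <- data.frame(Height = rnorm(20, mean = 150, sd = 10))

n_groups <- nlevels(mydata$group)
wex <- 0.8  # boxplot()'s default boxwex

# Main plot, with room left on the right for one extra box.
bp_main <- boxplot(Height ~ group, data = mydata, boxwex = wex,
                   xlim = c(0.5, n_groups + 1.5), ylim = c(120, 260))

# Assumption: a single box gets half of boxwex, so double it here.
# axes = FALSE stops the second call from redrawing the y-axis.
bp_ctrl <- boxplot(control$Height, at = n_groups + 1, boxwex = 2 * wex,
                   add = TRUE, axes = FALSE)

# Label the added box by hand.
axis(1, at = n_groups + 1, labels = "Control")
```

Passing axes = FALSE in the second call is a design choice, not something the original code does; it avoids drawing a second set of tick labels over the first.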

Edit:

boxplot sets the x-axis range from 1 to the number of boxplots, defined by your ~ group in the formula syntax. It takes this range and extends it by 0.5 in each direction. In your example of 3 groups, it normally spans from 0.5 to 3.5, demonstrated by:

par('usr')
## [1]  0.38  3.62  9.46 34.84

showing that x ranges from 0.38 to 3.62 (and y from 9.46 to 34.84). By default, R expands both axes by 4% in either direction, giving 0.5 - 0.04*(3.5-0.5) and 3.5 + 0.04*(3.5-0.5), or more succinctly c(0.5, 3.5) + c(-1, 1)*0.04*(3.5-0.5). (See ?par and look for xaxs for a reference.)
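The 4% padding can be checked directly. This is a small sketch using made-up data for three boxes; only the x-range matters here, and the variable names are illustrative.

```r
# x-range boxplot() requests for three groups, before any padding.
xlim <- c(0.5, 3.5)

# The default axis style (xaxs = "r") adds 4% of the range on each side.
padded <- xlim + c(-1, 1) * 0.04 * diff(xlim)
print(padded)

# Compare with what the device reports after drawing three boxes.
boxplot(list(a = 1:5, b = 2:6, c = 3:7))
usr_x <- par("usr")[1:2]
print(usr_x)
```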

Since you want to add one more boxplot to this, we must force the plot region to be wider than boxplot would normally make it. I could have hard-coded xlim=c(0.5, 4.5), but I prefer not putting in "magic constants", so I generalized it based on the data.

  1. length(unique(mydata$group)) provides the number of groups;
  2. Add 1 to this to account for the additional boxplot; and
  3. Add another 0.5 to this to bump it out a little bit more (accounting for the boxplot width).

We don't need to set this in the second call because boxplot only sets the axes when add=FALSE (the default) and it is creating a new frame.
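The three steps for building xlim can be wrapped in a small helper. The function name xlim_with_extra and its extra argument are hypothetical, introduced here only to illustrate the arithmetic.

```r
# Hypothetical helper: x-limits for a grouped boxplot plus `extra` added boxes.
xlim_with_extra <- function(groups, extra = 1) {
  n_groups <- length(unique(groups))  # step 1: number of groups
  upper <- n_groups + extra           # step 2: room for the added box(es)
  c(0.5, upper + 0.5)                 # step 3: half a unit of padding
}

groups <- c("IC", "IC", "IP", "IP", "TC")
lims_one <- xlim_with_extra(groups)             # one extra box
lims_two <- xlim_with_extra(groups, extra = 2)  # two extra boxes
print(lims_one)
print(lims_two)
```

The result can be passed as xlim= in the first boxplot call, with each added box placed at an integer position past the last group using at=.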

Upvotes: 1
