user1007742

Reputation: 571

boxplot of vectors with different length

I have a matrix with 2 columns. I would like to draw a boxplot of each column, but the columns have different numbers of entries.

For example, the first column has 10 entries and the second column has 7. The remaining 3 entries of the second column are filled with zeros.

I would like to plot these side by side for comparison.

Is there a way to tell R to boxplot the whole of column 1 and only the first 7 entries of column 2?

Upvotes: 8

Views: 18775

Answers (2)

Gavin Simpson

Reputation: 174778

You could simply index the values you want, for example:

## dummy version of your data
mat <- matrix(c(1:17, rep(0, 3)), ncol = 2)

## create object suitable for plotting with boxplot
## I.e. convert to melted or long format
df <- data.frame(values = mat[1:17],
                 vars = rep(c("Col1","Col2"), times = c(10,7)))

## draw the boxplot
boxplot(values ~ vars, data = df)

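In the above I'm taking you at your word that you have a matrix, which is why mat[1:17] works: a matrix is stored column by column, so the first 17 elements are all of column 1 followed by the first 7 entries of column 2. If you actually have a data frame then you would need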

df <- data.frame(values = c(mat[,1], mat[1:7, 2]),
                 vars = rep(c("Col1","Col2"), times = c(10,7)))

and I assume that the data in the two columns are comparable: the fact that the values sit in two columns suggests a categorical variable that lets us split the values (like the heights of men and women, with sex as the categorical variable).

The resulting boxplot is shown below

[Image: side-by-side boxplots of Col1 and Col2]
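As a possible alternative to building a long-format data frame, boxplot() also accepts a list of numeric vectors, and the list elements may have different lengths. The following is only a minimal sketch using the same dummy matrix as above; the names groups and bp are illustrative and not part of the original answer.

```r
## Dummy data as above: 10 real values in column 1,
## 7 real values plus 3 padding zeros in column 2
mat <- matrix(c(1:17, rep(0, 3)), ncol = 2)

## A named list can hold vectors of different lengths;
## only the first 7 entries of column 2 are kept
groups <- list(Col1 = mat[, 1], Col2 = mat[1:7, 2])

## One box per list element; the summary statistics are
## returned invisibly and stored here for inspection
bp <- boxplot(groups)

## Number of observations that went into each box
bp$n
```

The list names become the axis labels, so no separate grouping variable is needed in this form.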

Upvotes: 11

Ivan Z

Reputation: 1592

For any number of columns and any number of empty entries, you can do it like this.

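## Load data from CSV; first row contains column headers
dat <- read.csv('your-filename.csv', header = TRUE)

## Set up an empty plot region. read.csv() has already consumed the
## header row, so every row of 'dat' is data and none is skipped here.
## type = 'n' avoids drawing a stray point at (1, 1); the x-limits leave
## half a unit on each side so the outer boxes are not clipped.
plot(
  1, 1, type = 'n',
  xlim = c(0.5, ncol(dat) + 0.5), ylim = range(dat, na.rm = TRUE),
  xaxt = 'n', xlab = '', ylab = ''
)
axis(1, labels = colnames(dat), at = 1:ncol(dat))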

for(i in 1:ncol(dat)) {
  ## Get i-th column
  p <- dat[,i]

  ## Remove 0 values from column
  p <- p[! p %in% 0]
  ## Instead of 0 you can use any values
  ## For example, you can remove 1, 2, 3
  ##   p <- p[! p %in% c(1,2,3)]

  ## Draw boxplot
  boxplot(p, add = TRUE, at = i)
}

This code loads a table from a CSV file, removes the 0 values from each column (or any other values you choose), and draws the boxplots for all columns in one graphic.

Hope this helps.
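A shorter variant of the same idea, offered only as a sketch: if the zeros are purely padding and never real measurements (an assumption that has to hold for your data), they can be recoded as NA, after which boxplot() can be called on the data frame directly because it ignores missing values in each column. The data frame below is a hypothetical stand-in for the CSV contents.

```r
## Hypothetical stand-in for the CSV data: the second column
## is padded with zeros (assumed here to be padding only)
dat <- data.frame(Col1 = 1:10, Col2 = c(11:17, rep(0, 3)))

## Recode the padding value as missing
dat[dat == 0] <- NA

## boxplot() draws one box per column and drops NAs,
## so each box uses only the real entries of its column
res <- boxplot(dat)

## Number of non-missing observations behind each box
res$n
```

This avoids the explicit loop and the manual axis setup, at the cost of treating every zero in the table as missing.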

Upvotes: 3
