Reputation: 29
I'm a beginner in R and am currently working on generating graphs with it. Most example datasets look like diamonds in ggplot2:
carat  cut        color  clarity  depth  table  price  x     y     z
0.2    Ideal      E      SI2      61.5   55.0   326    3.95  3.98  2.43
0.2    Premium    E      SI1      59.8   61.0   326    3.89  3.84  2.31
0.2    Good       E      VS1      56.9   65.0   327    4.05  4.07  2.31
0.3    Premium    I      VS2      62.4   58.0   334    4.20  4.23  2.63
0.3    Good       J      SI2      63.3   58.0   335    4.34  4.35  2.75
0.2    Very Good  J      VVS2     62.8   57.0   336    3.94  3.96  2.48
This means that when a boxplot is plotted, R first sorts the data according to cut. In contrast, what about a dataset like this:
cut price1 price2 price3
Good 0.68 0.89 0.74
Medium 0.12 0.35 0.26
Does this mean all of the values in each category are presorted? I wonder what method could be used to handle this type of data and draw a boxplot from it.
Upvotes: 1
Views: 61
Reputation: 28339
What you probably want to do is to "melt" your data (transform it from "wide" to "long" format). For example:
# Melt your dataset (reshape it from wide to long)
library(reshape2)
# Keep "cut" as the identifier column; the remaining columns
# (price1, price2, price3) are stacked into variable/value pairs
dataset_melt <- melt(dataset, id.vars = "cut")
# What the melted dataset looks like
#       cut variable value
# 1:   Good   price1  0.68
# 2: Medium   price1  0.12
# 3:   Good   price2  0.89
# 4: Medium   price2  0.35
# 5:   Good   price3  0.74
# 6: Medium   price3  0.26
# Plot the melted dataset: one box per value of cut
library(ggplot2)
ggplot(dataset_melt, aes(cut, value)) +
  geom_boxplot()
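If you would rather not load extra packages, the same wide-to-long idea can be sketched in base R. This is only an illustration, not the reshape2 approach above: the `dataset` data frame is rebuilt by hand from the two rows in the question, and `dataset_long` and `price_cols` are names made up for this example.

```r
# Assumption: "dataset" is reconstructed from the two rows shown in the question
dataset <- data.frame(
  cut    = c("Good", "Medium"),
  price1 = c(0.68, 0.12),
  price2 = c(0.89, 0.35),
  price3 = c(0.74, 0.26)
)

# Columns that hold the measurements (hypothetical helper name)
price_cols <- c("price1", "price2", "price3")

# Stack the price columns on top of each other, repeating "cut" for each column
dataset_long <- data.frame(
  cut      = rep(dataset$cut, times = length(price_cols)),
  variable = rep(price_cols, each = nrow(dataset)),
  value    = unlist(dataset[, price_cols], use.names = FALSE)
)

# One box per value of cut, drawn with base graphics
boxplot(value ~ cut, data = dataset_long)
```

The resulting `dataset_long` has one row per (cut, price column) pair, which is the layout both `boxplot()` and `ggplot()` expect. The rows do not need to be sorted in any way; the plotting function groups them by `cut` on its own.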
Upvotes: 2