Reputation: 105
Suppose that we have a data frame in which one of the columns holds a comma-separated list of numbers, stored as a string.
"ID","Costs"
"tim","1, 2, 3, 4, 5, 6, 7, 8"
"ryan","8, 7, 6, 5, 4, 3, 2, 1"
"bob","1, 3, 5, 7, 9, 11, 13, 15"
If I wanted to construct a box plot of costs with respect to ID, how would I approach doing so?
Upvotes: 0
Views: 67
Reputation: 93813
A base R solution is pretty much a one-liner, since boxplot() will accept a list as input:
boxplot(lapply(strsplit(dat$Costs, ",\\s+"), as.numeric), names=dat$ID)
dat in this case being:
dat <- structure(list(ID = c("tim", "ryan", "bob"), Costs = c("1, 2, 3, 4, 5, 6, 7, 8",
"8, 7, 6, 5, 4, 3, 2, 1", "1, 3, 5, 7, 9, 11, 13, 15")), .Names = c("ID",
"Costs"), class = "data.frame", row.names = c(NA, -3L))
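To see why the one-liner works, it can help to look at the intermediate list before plotting. The following is only a minimal sketch, assuming dat has the structure shown above; the names costs_list and stats are introduced here for illustration and are not part of the original answer.

```r
# Same data as in the answer above, rebuilt so the example is self-contained
dat <- data.frame(
  ID = c("tim", "ryan", "bob"),
  Costs = c("1, 2, 3, 4, 5, 6, 7, 8",
            "8, 7, 6, 5, 4, 3, 2, 1",
            "1, 3, 5, 7, 9, 11, 13, 15"),
  stringsAsFactors = FALSE
)

# Split each string on a comma followed by whitespace, then convert to numeric
costs_list <- lapply(strsplit(dat$Costs, ",\\s+"), as.numeric)  # illustrative name
names(costs_list) <- dat$ID

str(costs_list)  # a named list with one numeric vector per ID

# plot = FALSE returns the box statistics without drawing anything
stats <- boxplot(costs_list, plot = FALSE)  # illustrative name
stats$names
stats$stats  # one column of five-number summaries per ID
```

Naming the list elements up front means the names= argument is no longer needed when the list is passed to boxplot().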
Upvotes: 2
Reputation: 263352
If you want a base solution, here's one possibility:
boxplot( values ~ ind,
         data = stack(                               # stack() converts wide to long
           data.frame( apply(df1, 1,
             function(r) setNames(
               list(scan(text = r[2], sep = ",")),   # numeric Costs
               r[1]) ) )) )                          # names then as 'ID'
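The nested call above can be hard to follow, so here is a sketch of the same wide-to-long idea written as separate steps. It assumes df1 has character columns ID and Costs as in the question; the names vals and long are illustrative, and strsplit() is swapped in for scan() to keep the steps short.

```r
# Example data, assumed to match the question
df1 <- data.frame(
  ID = c("tim", "ryan", "bob"),
  Costs = c("1, 2, 3, 4, 5, 6, 7, 8",
            "8, 7, 6, 5, 4, 3, 2, 1",
            "1, 3, 5, 7, 9, 11, 13, 15"),
  stringsAsFactors = FALSE
)

# Step 1: one numeric vector per row, named by ID
vals <- setNames(lapply(strsplit(df1$Costs, ",\\s*"), as.numeric), df1$ID)

# Step 2: stack() turns the named list into a long data frame
# with columns 'values' (the numbers) and 'ind' (the ID as a factor)
long <- stack(vals)
head(long)

# Step 3: the formula interface draws one box per level of 'ind'
boxplot(values ~ ind, data = long)
```

The long format is also what ggplot2 expects, so the same data frame could be reused there.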
Upvotes: 1
Reputation: 33782
Assuming that the data are as given in your example, i.e. column Costs contains quoted characters separated by comma + space:
df1 <- read.csv(text = '"ID","Costs"
"tim","1, 2, 3, 4, 5, 6, 7, 8"
"ryan","8, 7, 6, 5, 4, 3, 2, 1"
"bob","1, 3, 5, 7, 9, 11, 13, 15"',
header = TRUE,
stringsAsFactors = FALSE)
Then you can separate the values using unnest, convert to numeric and plot:
library(tidyverse)
df1 %>%
  mutate(Costs = str_split(Costs, ", ")) %>%  # list-column of character vectors
  unnest(Costs) %>%                           # one row per value (tidyr >= 1.0 interface)
  mutate(Costs = as.numeric(Costs)) %>%
  ggplot(aes(ID, Costs)) +
  geom_boxplot()
Upvotes: 2