Remi.b

Reputation: 18219

ggplot, count of the number of occurrences of a categorical variable and splitting this count according to a continuous variable

I've been trying hard to achieve something with ggplot, but I keep failing.

Here is a data.table

library(data.table)
set.seed(12)
data=data.table(categories=c('c','a','b','a','a','c','b','b','a','c','c','b'),hello=runif(12,0,15))
reclassification = c(0,4,7,15)

I'm trying to do the following plot:

y-axis : The 3 categories (a,b,c)

x-axis : A count of the number of times each category is found

colour/shape : The column "hello" reclassified according to the vector "reclassification". There should be 3 colours in my example: one for the count of "categories" for 0 < hello < 4, one for 4 < hello < 7, and one for 7 < hello < 15

Note: This plot can be made of bars, lines, volumes, several different plots, etc... (I would actually appreciate trying some different solutions)

Upvotes: 1

Views: 13924

Answers (2)

jbaums

Reputation: 27388

I think you have some redundant information in this plot, because your x-axis gives the frequency of points in each category, yet you still need to plot all the points in order to display their reclassified values for hello. But then again, I'm not sure I fully understand how you want the colours applied.

You could do something along these lines:

library(data.table)
library(ggplot2)

set.seed(12)
# I've increased the number of categories here to provide a fuller example.
data <- data.table(categories=sample(letters[1:10], 50, replace=TRUE), 
                   hello=runif(50, 0, 15)) 
reclassification = c(0, 4, 7, 15)

p <- ggplot(data, 
            aes(table(categories)[match(categories, names(table(categories)))], 
                categories, 
                col = cut(hello, reclassification)))

p + geom_jitter(position = position_jitter(width = 0.15, height=0.15),
                shape=20, size=4) + 
  labs(x='Frequency', y='Category', col='Class') +
  scale_colour_manual(values = c('#404040', '#CA0020', '#2B83BA'))
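
The x aesthetic above is fairly dense, so here is a small base-R sketch of what it computes: for every row, the number of times that row's category occurs. It uses the category vector from the question; the names freq_lookup and freq_ave are made up for this illustration, and ave() is shown only as an alternative that should give the same numbers.

```r
# Category vector from the question
categories <- c('c','a','b','a','a','c','b','b','a','c','c','b')

# Same lookup as in the aes() call: count each category, then pick the
# count belonging to each row's category
counts_by_cat <- table(categories)
freq_lookup <- as.vector(counts_by_cat[match(categories, names(counts_by_cat))])

# Alternative (an assumption, not part of the answer): group-wise length via ave()
freq_ave <- ave(seq_along(categories), categories, FUN = length)

print(freq_lookup)
print(all(freq_lookup == freq_ave))
```

Because every point of a given category shares the same frequency, the points of one category all sit at the same x position, which is why the jitter is needed to tell them apart.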


Upvotes: 2

Roland

Reputation: 132706

library(data.table)
set.seed(12) 

DT <- data.table(categories=c('c','a','b','a','a','c','b','b','a','c','c','b'),hello=runif(12,0,15))
reclassification <- c(0,4,7,15)
DT[,colour:=cut(hello,c(-Inf,reclassification,Inf))]

library(ggplot2)
p <- ggplot(DT,aes(x=categories,fill=colour)) + geom_bar()
print(p)
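
To check the numbers behind the stacked bars, the same counts can be tabulated directly. This is only a sketch in base R, using the data from the question; the name counts is introduced here for illustration, and it cuts on the original breaks rather than the -Inf/Inf-padded ones, which should give the same non-empty classes since hello lies between 0 and 15.

```r
set.seed(12)
categories <- c('c','a','b','a','a','c','b','b','a','c','c','b')
hello <- runif(12, 0, 15)
reclassification <- c(0, 4, 7, 15)

# Rows: categories; columns: the classes (0,4], (4,7], (7,15]
counts <- table(categories, class = cut(hello, reclassification))
print(counts)

# Each row total is the height of the corresponding stacked bar
print(rowSums(counts))
```

If the categories should go on the y-axis, as the question asks, adding coord_flip() to the plot (p + coord_flip()) is one way to get that layout.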


Upvotes: 4
