Daniel

Reputation: 1272

Define intervals in R with cut() and make a histogram plot

I am struggling to figure out how to use the cut() function to split my data into 12-month intervals. I read the post "R - Cut by Defined Interval", but it does not cover what I am looking for.

Say I have a vector named months whose values range from less than a year (< 12 months) up to 50 months.

set.seed(50); sample(50) -> months

I want to use the cut() function to count how many values fall in each year, including the values below 12 months.

> cut(months, breaks =  seq(12,50, by= 12))-> output
> output
 [1] (24,36] (12,24] <NA>    (36,48] (12,24] <NA>    (24,36] (24,36] <NA>    <NA>   
[11] (12,24] <NA>    (24,36] (36,48] (36,48] (36,48] (24,36] (12,24] (36,48] <NA>   
[21] (12,24] (36,48] (12,24] (12,24] <NA>    (12,24] (12,24] (24,36] <NA>    <NA>   
[31] (12,24] (36,48] (24,36] (36,48] <NA>    <NA>    (36,48] (12,24] (36,48] (24,36]
[41] (36,48] (12,24] (24,36] <NA>    <NA>    (24,36] <NA>    (24,36] (24,36] (36,48]
Levels: (12,24] (24,36] (36,48]

> table(output)
output
(12,24] (24,36] (36,48] 
     12      12      12

Questions

1- How can I get the count for < 12 months while keeping the 12-month intervals?

I tried this, but it does not work:

> cut(months, breaks =  seq(1,12,50, by= 12))-> output

2- How can I make a hist() plot from this data?

Thanks,

Upvotes: 2

Views: 9490

Answers (2)

Joe

Reputation: 3806

geom_col() will provide you with a clearer histogram since the data are already in a frequency table.

library(dplyr)
library(ggplot2)

set.seed(50)
months <- sample(50)

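# seq(0, 50, by = 12) gives breaks 0, 12, 24, 36, 48, so 49 and 50 fall outside and become NA.
# The labels name the actual right-closed bins (0,12], (12,24], (24,36], (36,48].
output <- cut(months, breaks = seq(0, 50, by = 12), labels = c("1-12", "13-24", "25-36", "37-48"))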

table(output) %>% 
  as.data.frame() %>% 
  ggplot(aes(x = output, y = Freq)) + 
  geom_col()

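A base-R sketch of the same idea, in case ggplot2 is not available. It assumes an extra open-ended bin (a final break at Inf) so that 49 and 50 are counted instead of dropped; the bin labels are illustrative and not part of the original answer.

set.seed(50)
months <- sample(50)  # a permutation of 1..50

# Breaks at 0, 12, 24, 36, 48 plus Inf, so values above 48 get their own bin
# (the open-ended last bin is an assumption made for this sketch).
breaks <- c(seq(0, 48, by = 12), Inf)
labels <- c("1-12", "13-24", "25-36", "37-48", ">48")

output <- cut(months, breaks = breaks, labels = labels)

# Frequency table as a data frame with columns `output` and `Freq`
freq <- as.data.frame(table(output))
print(freq)

# Bar chart of the counts, one bar per interval
barplot(freq$Freq, names.arg = as.character(freq$output),
        xlab = "Months", ylab = "Count")

Because the counts are already tabulated, barplot() draws them directly, much as geom_col() does above.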

Upvotes: 1

Patrick Williams

Reputation: 704

set.seed(50)
months <- sample(50)

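# seq(0, 50, by = 12) gives breaks 0, 12, 24, 36, 48, so 49 and 50 fall outside and become NA.
# The labels name the actual right-closed bins (0,12], (12,24], (24,36], (36,48].
output <- cut(months, breaks = seq(0, 50, by = 12), labels = c("1-12", "13-24", "25-36", "37-48"))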

hist(as.numeric(output))

You'll have to edit the axis labels on the histogram manually, since the axis will show the factor's integer codes 1 to 4 rather than the interval names. And as I mentioned in my comment, the histogram isn't very informative, considering all the counts are equal.
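One way to relabel the axis by hand might look like the sketch below. It assumes the same breaks as above; the labels, the title, and the choice to center one bar on each integer code are illustrative.

set.seed(50)
months <- sample(50)

output <- cut(months, breaks = seq(0, 50, by = 12),
              labels = c("1-12", "13-24", "25-36", "37-48"))

# Integer codes 1..4 of the factor; 49 and 50 are NA and hist() ignores them
codes <- as.numeric(output)

# One bar per code (bin edges at 0.5, 1.5, ..., 4.5), default x axis suppressed
h <- hist(codes, breaks = seq(0.5, 4.5, by = 1), xaxt = "n",
          main = "Months by 12-month interval", xlab = "Months")

# Put the interval names back as tick labels
axis(side = 1, at = 1:4, labels = levels(output))

Since the data are already categorical at this point, barplot(table(output)) gives the same picture with the labels in place and no manual axis work.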

Upvotes: 4
