Reputation: 1
I'm new to coding (particularly R) and wanted to know what the differences between Breaks = and Bins() are, and in what scenarios you would use one over the other.
Thanks in advance for the clarification!
Upvotes: 0
Views: 103
Reputation: 3228
If this is in relation to something like histograms in ggplot2, the bins argument automatically divides the range of your data into a set number of equal-width columns, whereas the breaks argument specifies exactly where the edges of those columns fall. As an example, we can look at these two plots:
library(ggplot2)
library(dplyr)  # provides the %>% pipe

#### Automatically Separates into Bins ####
iris %>%
  ggplot(aes(x = Sepal.Length)) +
  geom_histogram(bins = 10)

#### Manually Inserts Breaks at Designated Spots ####
iris %>%
  ggplot(aes(x = Sepal.Length)) +
  geom_histogram(breaks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
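The first plot is automatically divided into 10 equal-width bins (columns) spanning the range of the data.
Since the data are decimal values bounded between 4.3 and 7.9, the second plot, with 10 manual breaks at the numbers 1 to 10 (explicitly I'm saying "I want bin edges at Sepal Length 1 to 10"), doesn't end up looking the same: 10 breaks make 9 bins, each one unit wide, and only the four bins between 4 and 8 contain any data.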
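To see the difference as numbers rather than pictures, one option is to pull the computed bins out of each plot with ggplot2's layer_data(). This is only a minimal sketch: the object names (p_bins, p_breaks, d_bins, d_breaks) are made up for illustration, and the exact bin edges that bins = 10 produces may vary with your ggplot2 version.

```r
library(ggplot2)

# Same two histograms as above, written without the pipe so only ggplot2 is needed.
# p_bins / p_breaks are illustrative names, not anything ggplot2 requires.
p_bins   <- ggplot(iris, aes(x = Sepal.Length)) + geom_histogram(bins = 10)
p_breaks <- ggplot(iris, aes(x = Sepal.Length)) + geom_histogram(breaks = 1:10)

# layer_data() returns the data ggplot2 computed for a layer: one row per bin.
d_bins   <- layer_data(p_bins)[, c("xmin", "xmax", "count")]
d_breaks <- layer_data(p_breaks)[, c("xmin", "xmax", "count")]

d_bins    # bin edges chosen by ggplot2 from the range of the data
d_breaks  # bin edges exactly at the supplied breaks 1, 2, ..., 10
```

In d_breaks the rows below 4 and above 8 should have a count of 0, which is why the second plot looks so sparse.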
If I want to set the breaks at much more precise locations, I can do this instead with the breaks argument (note that these breaks are unevenly spaced, so the resulting bins have different widths):
iris %>%
  ggplot(aes(x = Sepal.Length)) +
  geom_histogram(breaks = c(4.0, 4.3, 5.0, 5.3, 6.0, 6.3, 7.0, 7.3, 8.0))
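If typing every break by hand gets tedious, a possible middle ground is to generate evenly spaced breaks with seq() and check how the data fall into them before plotting. The sketch below uses only base R; the names even_breaks, binned and counts, and the 0.5 step, are just assumptions for the example.

```r
x <- iris$Sepal.Length

# Evenly spaced break points from 4 to 8 in steps of 0.5 (illustrative choice).
even_breaks <- seq(4, 8, by = 0.5)

# cut() assigns each value to an interval (a, b], closed on the right by default.
binned <- cut(x, breaks = even_breaks)

# Number of observations per interval.
counts <- table(binned)
counts
```

The same vector can then be passed along as geom_histogram(breaks = even_breaks), so you get control over where the edges sit while keeping every bin the same width.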
Upvotes: 1