Reputation: 4725
I have this plot from the provided test code:
I would like the x-axis ticks to be organised sensibly (the image further down, of the plot created from the original data, highlights the problem).
Here is some code that can be used as an example:
## Create some numbers for testing
set.seed(123)
Aboard <- sample(1:50,50)
## some years to use
Years <- c(1931, 1931, 1931, 1934, 1934, 1934, 1934, 1937, 1937, 1937, 1937, 1937, 1938, 1943, 1943, 1943, 1943, 1943, 1955, 1955, 1955, 1955, 1955, 1961, 1961, 1961, 1970, 1970, 1970, 1970, 1973, 1973, 1973, 1978, 1980, 1980, 1982, 1982, 1983, 1984, 1984, 1985, 1986, 1986, 1986, 1987, 1987, 1989, 1990, 1990)
df <- data.frame(Aboard, Years)
###############################################################################
## I WANT TO FIND THE SUM OF Aboard FOR EACH YEAR
## change years to factor variable, so that I have levels to work with.
df$Years <- factor(df$Years)
## blank vector to store sum values.
aboardYearTotal= c()
## iterate over the levels of the years vector.
for (y in levels(as.factor(df$Years))) {
  ## I want to use an integer rather than a string
  y = as.numeric(y)
  ## for each level - find the sum of all Aboard values that correspond with it.
  ## I need to remove NA values as there are some.
  yy = sum(df$Aboard[df$Years == y], na.rm = TRUE)
  aboardYearTotal = c(aboardYearTotal, yy)
}
## I no longer need y, or yy
rm(y)
rm(yy)
###############################################################################
## Create plot using this variable
yearLevels <- levels(as.factor(df$Years))
aboardYears <- data.frame(yearLevels, aboardYearTotal)
## Create a plot of the data for total number aboard each year
library(ggplot2)  ## needed for ggplot() and geom_point()
p <- ggplot(aboardYears, aes(yearLevels, aboardYearTotal))
p + geom_point(aes(size = aboardYearTotal))
How can I control the ticks on the x axis here?
I've tried to play around with scale_x_continuous and scale_x_discrete, but I can't get it to work as intended.
For example, if my start value was 0 and my end value was 10, with a spacing of 2, I would have the x axis marked as:
0 2 4 6 8 10
Here is the original plot, which highlights the problem I'm having with the x axis:
I'm open to suggestions or advice on better practices in general.
Upvotes: 2
Views: 93
Reputation: 93851
Don't convert Years to a factor. Instead, leave it numeric and use stat_summary to take care of the sum.
df <- data.frame(Aboard, Years)
## In ggplot2 >= 3.3.0, fun replaces the deprecated fun.y, and after_stat(y) replaces ..y..
ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun=sum, geom="point", aes(size=after_stat(y)))
ggplot will pick sensible defaults for the x-axis labels, but you can change these as well. For example:
ggplot(df, aes(Years, Aboard)) +
  stat_summary(fun=sum, geom="point", aes(size=after_stat(y))) +
  scale_x_continuous(breaks=seq(1920, 2020, 20))
You can set the x-axis breaks to be at whatever values you want by providing a vector of those values. For example:
scale_x_continuous(breaks=seq(min(df$Years), max(df$Years)+6, 6))
or
scale_x_continuous(breaks=c(1931, 1955))
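If you would rather have the breaks land on round values such as decades, one option is to compute them from the data range. This is only a sketch: year_breaks is a hypothetical helper written for this example, not a ggplot2 function, and the two years passed to it are simply the first and last years from the question's data.

```r
## Hypothetical helper (not part of ggplot2): breaks on multiples of `by`
## that cover the full range of x.
year_breaks <- function(x, by = 10) {
  lower <- floor(min(x) / by) * by    ## round the minimum down to a multiple of by
  upper <- ceiling(max(x) / by) * by  ## round the maximum up to a multiple of by
  seq(lower, upper, by = by)
}

## First and last years from the question's data
breaks <- year_breaks(c(1931, 1990), by = 10)
print(breaks)  ## 1930 1940 1950 1960 1970 1980 1990
```

The resulting vector can then be passed to the scale, for example scale_x_continuous(breaks=year_breaks(df$Years)).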
Sometimes, you'll need or want to perform data summary operations outside of ggplot. There are a number of options. Here are a couple:
Base R
df.summary = aggregate(Aboard ~ Years, df, sum)
tidyverse
library(tidyverse)
df.summary = df %>%
  group_by(Years) %>%
  summarise(Aboard = sum(Aboard))
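To make the shape of the summary concrete, here is a minimal sketch of the base R version on a tiny made-up data frame (toy is an illustration only, not the question's data). The result has one row per year, with Years still numeric, which is what lets ggplot use a continuous x axis.

```r
## Made-up toy data standing in for df: two rows for 1931, one for 1955
toy <- data.frame(Years = c(1931, 1931, 1955),
                  Aboard = c(10, 20, 5))

## One row per year, with Aboard summed within each year
toy.summary <- aggregate(Aboard ~ Years, toy, sum)

print(toy.summary$Years)        ## 1931 1955
print(toy.summary$Aboard)       ## 30 5
print(is.numeric(toy.summary$Years))  ## TRUE
```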
You can even do this on the fly when you plot the data, without the need to create a separate summary data frame. For example:
ggplot(aggregate(Aboard ~ Years, df, sum), aes(Years, Aboard, size=Aboard)) +
  geom_point()
or
df %>%
  group_by(Years) %>%
  summarise(Aboard = sum(Aboard)) %>%
  ggplot(aes(Years, Aboard, size=Aboard)) +
  geom_point()
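If Years has already been converted to a factor, as in the question's code, it can be turned back into numbers before plotting. A minimal sketch, using a made-up vector as a stand-in for df$Years: go through as.character() first, because as.numeric() applied directly to a factor returns the internal level codes rather than the years.

```r
## Made-up stand-in for df$Years after factor() has been applied
years_factor <- factor(c(1931, 1931, 1955, 1990))

## as.numeric() on a factor gives the integer level codes, not the years
codes <- as.numeric(years_factor)

## Converting via character recovers the original values
years_numeric <- as.numeric(as.character(years_factor))

print(codes)          ## 1 1 2 3
print(years_numeric)  ## 1931 1931 1955 1990
```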
Upvotes: 3