Reputation: 1
My data looks like
Year ... Growth_Rate
2011 NA
2012 2.0
2013 ... 3.2
2014 -2.0
2015 1.3
2016 ... 1.9
ggplot(ridership, aes(Year, Bronx$Growth_Rate, group=1, na.rm=TRUE))+
geom_bar(stat= "identity", aes(fill=Year)) +
scale_y_continuous("Ridership Growth Rate",
labels = percent_format())+ geom_point(col='black', size=0.7) +
geom_line(col='black', size=0.3) +
ggtitle("Ridership Change in Bronx") +
theme(plot.title = element_text(hjust = 0.5))
This is the graph. I would like to remove the year 2011 (the NA value) from it.
Upvotes: 0
Views: 24327
Reputation: 2022
How about the following code?
Method 1
Here, I preprocess the data by removing the missing values and storing the cleaned data in a separate data frame. Of course, you can also save it back into the same data frame, like dat<- na.omit(subset(dat, select = c(Year, Growth_Rate)))
library(ggplot2)
# create some dummy data
Year<- c(2011:2016)
Growth_Rate<- c(NA,2.0,3.2,-2.0,1.3,1.9)
dat<- data.frame(Year, Growth_Rate, stringsAsFactors = FALSE)
# remove missing values
dat.clean<- na.omit(subset(dat, select = c(Year, Growth_Rate)))
# plot the cleaned data (dat.clean, not the original dat)
ggplot(data = dat.clean, aes(Year,Growth_Rate))+
geom_bar(stat = "identity", na.rm = TRUE)+
geom_line(col='black', size=0.3)+
ggtitle("Ridership Change in Bronx") +
theme(plot.title = element_text(hjust = 0.5))
In my view, Method 1 is easy and works as intended, but it adds the overhead of a temporary variable to hold the cleaned data.
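To see what the cleaning step does before anything is plotted, a minimal base R check on the same dummy data might look like the sketch below. The names dat and dat.clean follow the answer above; the row counts are just inspected, and no ggplot2 is needed for this part.

```r
# same dummy data as in Method 1
Year <- c(2011:2016)
Growth_Rate <- c(NA, 2.0, 3.2, -2.0, 1.3, 1.9)
dat <- data.frame(Year, Growth_Rate, stringsAsFactors = FALSE)

# na.omit() drops every row that contains at least one NA
dat.clean <- na.omit(subset(dat, select = c(Year, Growth_Rate)))

nrow(dat)        # number of rows before cleaning
nrow(dat.clean)  # number of rows after cleaning
dat.clean$Year   # the years that remain after cleaning
```

If the 2011 row is gone from dat.clean, passing dat.clean to ggplot() should leave that year out of the plot.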
Method 2
Use coord_cartesian(). Again, in my opinion, this method is best suited to cases where you want to limit the range of x-axis values shown.
library(ggplot2)
# create some dummy data
Year<- c(2011:2016)
Growth_Rate<- c(NA,2.0,3.2,-2.0,1.3,1.9)
dat<- data.frame(Year, Growth_Rate, stringsAsFactors = FALSE)
# plot it
ggplot(data = dat, aes(Year,Growth_Rate))+
geom_bar(stat = "identity", na.rm = TRUE)+
geom_line(col='black', size=0.3)+
coord_cartesian(xlim = c(2012, 2016))+
ggtitle("Ridership Change in Bronx") +
theme(plot.title = element_text(hjust = 0.5))
The problem with Method 2 is that the NA row is still in the data, so running it generates warnings like: Warning messages: 1: Removed 1 rows containing missing values (position_stack). 2: Removed 1 rows containing missing values (geom_path).
Method 3
My grudge with Method 1 is that it creates an additional temporary variable to store the cleaned data. So I propose Method 3, which cleans the data inline in the ggplot() call:
ggplot(data = na.omit(subset(dat, select = c(Year, Growth_Rate))),
aes(Year,Growth_Rate))+
geom_bar(stat = "identity", na.rm = TRUE)+
geom_line(col='black', size=0.3)+
ggtitle("Ridership Change in Bronx") +
theme(plot.title = element_text(hjust = 0.5))
I think Method 3 addresses both my grudge and the OP's question.
Upvotes: 2
Reputation: 3952
@Ashish's answer is good if you only want to avoid plotting the NA values.
However, you might want to clean your data and reuse it later without the NA values. Here is some filtering using is.na:
Year_No_NA <- Year[!is.na(Bronx$Growth_Rate)]
Growth_Rate_No_NA <- Bronx$Growth_Rate[!is.na(Bronx$Growth_Rate)]
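If the two columns live in one data frame, the same is.na filter can be applied to the rows, so that Year and Growth_Rate stay aligned. This is only a sketch: dat is dummy data standing in for the OP's data (whose exact structure is not shown), and dat_no_na is a hypothetical name.

```r
# dummy data standing in for the OP's data (assumption)
dat <- data.frame(Year = 2011:2016,
                  Growth_Rate = c(NA, 2.0, 3.2, -2.0, 1.3, 1.9))

# keep only the rows whose Growth_Rate is not NA
dat_no_na <- dat[!is.na(dat$Growth_Rate), ]

# inspect the filtered data frame
dat_no_na
```

The filtered data frame can then be reused for plotting or any later analysis.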
Upvotes: 0