Reputation: 491
Very basic question here, as I'm just starting to use R. I'm trying to create a bar plot of factor counts in ggplot2, but when plotting I get 14 little colored blips representing my actual levels and then a massive grey bar at the end representing the 5000-ish NAs in the sample (it's survey data from a question that only applies to about 5% of the sample). I've tried the following code to no avail:
ggplot(data = MyData,aes(x= the_variable, fill=the_variable, na.rm = TRUE)) +
geom_bar(stat="bin")
The addition of the na.rm argument here has no apparent effect.
Meanwhile,
ggplot(data = na.omit(MyData),aes(x= the_variable, fill=the_variable, na.rm = TRUE)) +
geom_bar(stat="bin")
gives me
"Error: Aesthetics must either be length one, or the same length as the data"
as does applying na.omit() to the_variable, or to both MyData and the_variable.
All I want to do is eliminate the giant NA bar from my graph, can someone please help me do this?
Upvotes: 49
Views: 257537
Reputation: 41265
Another option is to use the function complete.cases, like this:
library(ggplot2)
# With NA
ggplot(airquality, aes(x = Ozone))+
geom_bar(stat="bin")
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Removed 37 rows containing non-finite values (stat_bin).
# Remove NA using complete.cases
airquality_complete <- airquality[complete.cases(airquality), ]
ggplot(airquality_complete, aes(x = Ozone))+
geom_bar(stat="bin")
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Created on 2022-08-25 with reprex v2.0.2
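Note that complete.cases(airquality) drops every row that has an NA in any column, not only in Ozone. A minimal sketch, assuming you only want to drop rows where the plotted column itself is missing (all_complete and ozone_only are names made up for this illustration):

```r
# Rows with an NA in *any* column are dropped here
all_complete <- airquality[complete.cases(airquality), ]

# Rows are dropped only where Ozone itself is NA
ozone_only <- airquality[complete.cases(airquality$Ozone), ]

nrow(airquality)    # all rows
nrow(all_complete)  # fewer rows: NAs in other columns count too
nrow(ozone_only)    # only rows with a missing Ozone value are gone
```

Passing ozone_only to ggplot() keeps observations that are missing only in columns the plot does not use.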
Upvotes: 0
Reputation: 950
Try remove_missing instead, with vars = "the_variable" (a character vector of column names). It is very important that you set the vars argument; otherwise remove_missing will remove all rows that contain an NA in any column! Setting na.rm = TRUE will suppress the warning message.
ggplot(data = remove_missing(MyData, na.rm = TRUE, vars = "the_variable"), aes(x = the_variable, fill = the_variable)) +
geom_bar(stat="bin")
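To illustrate what the vars argument changes, here is a minimal sketch on made-up data (toy, other, by_var and by_all are hypothetical names, not from the question). It assumes ggplot2 is installed and passes vars as a character vector of column names:

```r
library(ggplot2)

# Hypothetical toy data with NAs in two different columns
toy <- data.frame(
  the_variable = factor(c("a", "b", NA, "a", "b")),
  other        = c(1, NA, 3, 4, 5)
)

# vars set: only rows with an NA in the_variable are removed
by_var <- remove_missing(toy, na.rm = TRUE, vars = "the_variable")

# vars left at its default: rows with an NA in any column are removed
by_all <- remove_missing(toy, na.rm = TRUE)

nrow(by_var)
nrow(by_all)
```

Comparing the two row counts shows how many observations would be lost for reasons unrelated to the plotted variable.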
Upvotes: 12
Reputation: 626
Additionally, adding na.rm = TRUE to your geom_bar() call will work.
ggplot(data = MyData,aes(x= the_variable, fill=the_variable, na.rm = TRUE)) +
geom_bar(stat="bin", na.rm = TRUE)
I ran into this issue with a loop over a time series and this fixed it. The missing data are removed and the results are otherwise unaffected.
Upvotes: 27
Reputation: 13807
You can use the function subset inside the ggplot() call. Try this:
library(ggplot2)
data("iris")
iris$Sepal.Length[5:10] <- NA # create some NAs for this example
ggplot(data=subset(iris, !is.na(Sepal.Length)), aes(x=Sepal.Length)) +
geom_bar(stat="bin")
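The same idea applied to a factor, as in the question, might look like the sketch below. The survey data frame is a hypothetical stand-in for MyData, and the code assumes a current ggplot2, where stat = "bin" requires a continuous x, so the default stat = "count" of geom_bar() is used for the factor instead:

```r
library(ggplot2)

# Hypothetical stand-in for the survey data: a factor that is mostly NA
survey <- data.frame(
  the_variable = factor(c("yes", "no", "yes", NA, NA, NA, NA, "no"))
)

# Keep only the rows where the question was actually answered
answered <- subset(survey, !is.na(the_variable))

# geom_bar() counts rows per factor level by default (stat = "count")
p <- ggplot(answered, aes(x = the_variable, fill = the_variable)) +
  geom_bar()
```

Printing p should then show one bar per remaining level and no NA bar.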
Upvotes: 61
Reputation: 121
Not sure if you have solved the problem yet. For this issue, you can use the filter function from the dplyr package. The idea is to keep only the observations/rows whose value for the variable of interest is not NA, and then make the graph from these filtered observations. My code is below; note that the names of the data frame and the variable are copied from your question. I also assume you are familiar with the pipe operator.
library(tidyverse)
MyData %>%
filter(!is.na(the_variable)) %>%
ggplot(aes(x= the_variable, fill=the_variable)) +
geom_bar(stat="bin")
This should remove the annoying NAs from your plot. Hope this works :)
Upvotes: 12
Reputation: 3111
Just an update to the answer of @rafa.pereira.
Since ggplot2 is part of the tidyverse, it makes sense to use the convenient tidyverse functions to get rid of NAs.
library(tidyverse)
airquality %>%
drop_na(Ozone) %>%
ggplot(aes(x = Ozone))+
geom_bar(stat="bin")
Note that you can also use drop_na() without specifying columns; then all rows with an NA in any column will be removed.
Upvotes: 33
Reputation: 1
From my point of view, the error "Error: Aesthetics must either be length one, or the same length as the data" refers to the aes(x, y) argument. I tried na.omit() and it worked just fine for me.
Upvotes: 0