Reputation: 87
I asked this question to find out how to draw several graphs in the same plot. Following the answer, which I liked and accepted, this can be done with the ggplot() function.
Now, using ggplot(), I receive the following message, which tells me that rows with missing values were dropped from the plot:
Warning message:
Removed 33 row(s) containing missing values (geom_path).
Looking at the resulting plot, I am satisfied with the data once ggplot() has removed the 33 rows.
I know how to delete rows containing NA, but I don't understand whether ggplot() removed the rows where at least one variable is NA, or only the rows where all variables are NA. I have 7 variables; some rows are NA in every variable, while many rows are NA in only some of them.
Question: The rows are already dropped for the plot, but how can I remove these 33 detected rows from the data itself?
Upvotes: 1
Views: 2662
Reputation: 56054
ggplot removes rows that have NA in the columns mapped in aes(). If x and y are mapped but the data frame also has a z column, a row is dropped only when its x or y is NA; an NA in z alone has no effect.
Here is an example:
library(ggplot2)
x <- head(mtcars)
# add NA to some column we don't use for ggplot
x$am[ 1 ] <- NA
ggplot(x, aes(cyl, mpg)) + geom_point()
# no warnings
# now add NA to column that we use for plotting
x$cyl[ 1 ] <- NA
ggplot(x, aes(cyl, mpg)) + geom_point()
# Warning message:
# Removed 1 rows containing missing values (geom_point).
# to avoid that warning, we can explicitly set it to remove NA
ggplot(x, aes(cyl, mpg)) + geom_point(na.rm = TRUE)
# no warnings
To remove rows from the data, check if the selected columns have NA:
x_clean <- x[ !(is.na(x$cyl) | is.na(x$mpg)), ]
ggplot(x_clean , aes(cyl, mpg)) + geom_point()
# no warnings
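As a small sketch of the same idea in base R, complete.cases() restricted to the plotted columns gives the "NA in at least one mapped column" rule, and it also lets you keep the dropped rows for inspection. The names plot_cols, keep, x_dropped and all_na are only illustrative, and the extra NA values are added here purely for the demonstration:

```r
x <- head(mtcars)
x$cyl[1] <- NA   # NA in a plotted column
x$mpg[2] <- NA   # NA in the other plotted column
x$am[3]  <- NA   # NA in a column that is not plotted

# columns that would be mapped in aes()
plot_cols <- c("cyl", "mpg")

# TRUE for rows where every plotted column is present
keep <- complete.cases(x[, plot_cols])

x_clean   <- x[keep, ]    # rows the plot would use
x_dropped <- x[!keep, ]   # rows the plot would drop

# for comparison: rows where every column is NA (none in this toy data)
all_na <- rowSums(is.na(x)) == ncol(x)
```

Here row 3 stays in x_clean because its only NA is in am, which is not plotted.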
Edit 1: To apply this to your data, based on the comments, try the following; note the filter() step:
library(dplyr)
library(tidyr)
library(ggplot2)
Data <- bind_rows(...)
Data %>%
  mutate(data = paste0('Data', data)) %>%
  pivot_longer(-c(data, Time)) %>%
  # drop the rows that ggplot would otherwise remove
  filter(!(is.na(Time) | is.na(value))) %>%
  ggplot(aes(x = factor(Time), y = value, group = name, color = name)) +
  geom_line() +
  facet_wrap(. ~ data, scales = 'free', ncol = 1) +
  xlab('Time')
Edit 2: To know exactly what data goes into ggplot, keep the filtered, clean data as a separate object instead of piping it straight into the plot:
Data <- bind_rows(...)
# cleaned long-format data, kept so it can be inspected or reused
cleanData <- Data %>%
  mutate(data = paste0('Data', data)) %>%
  pivot_longer(-c(data, Time)) %>%
  filter(!(is.na(Time) | is.na(value)))
ggplot(cleanData, aes(x = factor(Time), y = value, group = name, color = name)) +
  geom_line() +
  facet_wrap(. ~ data, scales = 'free', ncol = 1) +
  xlab('Time')
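If you also want to see which rows the filter() step throws away, a rough sketch is to run the opposite condition on the long data. The toy data frame below is made up and only stands in for your Data; the column names a and b are hypothetical:

```r
library(dplyr)
library(tidyr)

# hypothetical toy data standing in for the real `Data` object
toy <- data.frame(
  data = c(1, 1, 2, 2),
  Time = c(1, 2, 1, 2),
  a    = c(1, NA, 3, 4),
  b    = c(NA, NA, 5, 6)
)

long <- toy %>%
  mutate(data = paste0("Data", data)) %>%
  pivot_longer(-c(data, Time))

# rows that the filter() step would discard
removed <- long %>% filter(is.na(Time) | is.na(value))

# number of discarded rows per facet and series
removed_counts <- removed %>% count(data, name)
```

Note that this count may not match the number in the ggplot warning exactly: as far as I know, geom_line() treats an NA in the middle of a series as a break in the line and may only report the missing values at the ends.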
Upvotes: 1
Reputation: 5747
Those rows could have NA values, or they could be out of bounds of the axis limits you set. ggplot() generates the same warning in both cases. Here is an example of the latter.
This is the built-in mtcars data set. Notice that there are no missing values:
mtcars
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
If I build the following plot, I get the ggplot warning about rows with missing values.
library(ggplot2)
ggplot(mtcars, aes(x = wt, y = qsec)) +
geom_point() +
scale_x_continuous(limits = c(2, 4)) +
scale_y_continuous(limits = c(16, 22))
Warning message:
Removed 14 rows containing missing values (geom_point).
The 14 rows with "missing values" are the 14 rows with data out of bounds of the axis limits. Here they are.
library(dplyr)
mtcars %>%
filter(wt < 2 | wt > 4 | qsec < 16 | qsec > 22)
mpg cyl disp hp drat wt qsec vs am gear carb
Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
Before attempting to remove "missing values" from your data, check to see if your plotting parameters exclude some of the data.
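One way to run that check, sketched with the same limits as in the example: count the rows outside the limits yourself, and if you only want to zoom in, set the limits through coord_cartesian() rather than on the scales, since it zooms the view without discarding data. The names x_lim, y_lim, out_of_bounds and p are only illustrative:

```r
library(ggplot2)

x_lim <- c(2, 4)
y_lim <- c(16, 22)

# rows that fall outside the chosen axis limits
out_of_bounds <- mtcars$wt < x_lim[1] | mtcars$wt > x_lim[2] |
  mtcars$qsec < y_lim[1] | mtcars$qsec > y_lim[2]
sum(out_of_bounds)

# zoom with coord_cartesian() instead of scale limits:
# points outside the view are simply not visible, nothing is removed
p <- ggplot(mtcars, aes(x = wt, y = qsec)) +
  geom_point() +
  coord_cartesian(xlim = x_lim, ylim = y_lim)
p
```

For a scatter plot the picture looks the same either way, but for geoms that summarise the data (smooths, boxplots) the two approaches can give different results, because scale limits drop the out-of-range rows before anything is computed.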
Upvotes: 0