DerDressing

Reputation: 315

R: Removed n rows containing missing values (geom_path)

The following code produces a warning that I would like to get rid of (Warning: Removed 2 rows containing missing values (geom_path).):

library(shiny)
library(ggplot2)
library(scales)


ui <- navbarPage("Test",
                 tabPanel("Test_2",
                          fluidPage(
                            fluidRow(
                              column(width = 12, plotOutput("plot", width = 1200, height = 600))
                            ),
                            fluidRow(
                              column(width = 12, sliderInput("slider",
                                                            label = "Range [h]",
                                                            min = as.POSIXct("2019-11-01 00:00"),
                                                            max = as.POSIXct("2019-11-01 07:00"),
                                                            value = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 07:00"))))
                            ))))

server <- function(input, output, session) {

  df <- data.frame("x" = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 01:00"),
                           as.POSIXct("2019-11-01 02:00"),as.POSIXct("2019-11-01 03:00"),
                           as.POSIXct("2019-11-01 04:00"),as.POSIXct("2019-11-01 05:00"),
                           as.POSIXct("2019-11-01 06:00"),as.POSIXct("2019-11-01 07:00")), 
                   "y" = c(0,1,2,3,4,5,6,7))

  observe({
    len_date_list <- length(df$x)

    min_merge_datetime <- df$x[1]
    max_merge_datetime <- df$x[len_date_list]

    updateSliderInput(session, "slider",
                      min = as.POSIXct(min_merge_datetime),
                      max = as.POSIXct(max_merge_datetime),
                      timeFormat = "%Y-%m-%d %H:%M")
  })

  output$plot <- renderPlot({

    in_slider_1 <- input$slider[1]
    in_slider_2 <- input$slider[2]

        ggplot(data=df, aes(x, y, group = 1)) +
        theme_bw() +
        geom_line(color="black", stat="identity") +
        # geom_point() +
        scale_x_datetime(labels = date_format("%m-%d %H:%M"),
                         limits = c(
                         as.POSIXct(in_slider_1),
                         as.POSIXct(in_slider_2)))
    })
}

shinyApp(server = server, ui = ui)

It seems to be a general problem with the "missing values", because I have found a lot of similar questions. In this question it is explained that it must be the range of the axis. So in my case I'm sure that it is because of the limits in scale_x_datetime.

    scale_x_datetime(labels = date_format("%m-%d %H:%M"),
                     limits = c(
                     as.POSIXct(in_slider_1),
                     as.POSIXct(in_slider_2)))

But I didn't find an answered question where scale_x_datetime, as.POSIXct and a slider are used together.

BTW: If I uncomment "geom_point", I get a further, similar warning.

Upvotes: 0

Views: 4921

Answers (2)

DiegoJArg

Reputation: 183

I know this question already has an answer, but this is another possible solution for you.

If you just want to get rid of it, that implies to me that you are OK with the output. Then you can try the following:

  • Add na.rm = TRUE to geom_line, like this: geom_line(..., na.rm = TRUE)

This explicitly tells geom_line (and the geom_path it is built on) that it is OK to remove NA values silently.
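As a minimal sketch of that suggestion outside of Shiny: the fixed `lims` vector below stands in for the slider values, the explicit `tz = "UTC"` is an assumption made to keep the example reproducible, and `count_warnings` is a hypothetical helper written only for this illustration.

```r
library(ggplot2)

# Hourly data as in the question; the explicit time zone is an assumption
df <- data.frame(
  x = as.POSIXct("2019-11-01 00:00", tz = "UTC") + 3600 * 0:7,
  y = 0:7
)

# Limits narrower than the data, standing in for the slider range
lims <- as.POSIXct(c("2019-11-01 01:00", "2019-11-01 06:00"), tz = "UTC")

# Same plot twice: once with the default na.rm = FALSE, once with na.rm = TRUE
p_warn <- ggplot(df, aes(x, y)) +
  geom_line() +
  scale_x_datetime(limits = lims)

p_quiet <- ggplot(df, aes(x, y)) +
  geom_line(na.rm = TRUE) +
  scale_x_datetime(limits = lims)

# Hypothetical helper: counts the warnings raised while the plot is
# converted to a grob (the step where the geom drops incomplete rows)
count_warnings <- function(p) {
  n <- 0
  withCallingHandlers(
    ggplotGrob(p),
    warning = function(w) {
      n <<- n + 1
      invokeRestart("muffleWarning")
    }
  )
  n
}

warn_default <- count_warnings(p_warn)
warn_na_rm   <- count_warnings(p_quiet)
```

Comparing `warn_default` with `warn_na_rm` shows whether the warning is gone. The rows are still dropped in both cases; na.rm = TRUE only silences the message.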

Reasoning with the warning:

Warning of: Removed k rows containing missing values (geom_path)

This tells you mainly 3 things:

  • geom_path is being called by another geom_something, which is what fires the warning. In your case, it is geom_line.
  • It has already removed k rows. So if the output is as desired, you do want those rows removed.
  • The reason for the removal is that some values ARE missing (NA).

What the warning doesn't tell you is WHY those rows have missing (NA) values.

You know that the reason comes from scale_x_datetime, mainly from its limits argument. When limits are set on a scale, ggplot2 by default treats every data point whose X value falls outside those limits as out of bounds and replaces it with NA, so that (X, Y) pair can no longer be drawn. You may have any number of reasons for setting limits that differ from the range of your data, but whenever a point ends up outside them, ggplot makes a unilateral decision to drop it and fires a warning instead of an error.
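The effect of the limits can be sketched without drawing anything. scales::censor is the default out-of-bounds handler for continuous ggplot2 scales; the fixed `lims` vector and the explicit `tz = "UTC"` below are assumptions standing in for the slider values of the question.

```r
library(scales)

# The x values from the question; the explicit time zone is an assumption
x <- as.POSIXct("2019-11-01 00:00", tz = "UTC") + 3600 * 0:7

# Limits narrower than the data, standing in for the slider range
lims <- as.POSIXct(c("2019-11-01 01:00", "2019-11-01 06:00"), tz = "UTC")

# censor() replaces every value outside the range with NA;
# the numeric conversion keeps the comparison explicit
censored <- censor(as.numeric(x), range = as.numeric(lims))

# Number of points that would be reported as "missing" by the geom
n_removed <- sum(is.na(censored))
```

Here the first and the last point fall outside the limits, which matches the "Removed 2 rows" in the warning once the slider is moved inwards by one hour on each side.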

Hopefully, a time will come when errors and warnings include an intuitive, language-independent call trace to the emitter and a link to a well-explained page of common mistakes.

Upvotes: 1

Eli Berkow

Reputation: 2725

I think it is because you haven't filtered df, so when the limits of scale_x_datetime are applied they remove the rows in df that don't fall between the slider values. I added this:

df %>% filter(between(x, in_slider_1, in_slider_2))

which seems to remove the issue for me. Please test. Just to mention that I did have some time zone problems.
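A small sketch of what the filter does to the data, independent of Shiny. The two `in_slider_*` values are hard-coded stand-ins for input$slider, and the explicit `tz = "UTC"` on both the data and the bounds is an assumption intended to sidestep time zone mismatches.

```r
library(dplyr)

# Hourly data as in the question; the explicit time zone is an assumption
df <- data.frame(
  x = as.POSIXct("2019-11-01 00:00", tz = "UTC") + 3600 * 0:7,
  y = 0:7
)

# Hard-coded stand-ins for input$slider[1] and input$slider[2]
in_slider_1 <- as.POSIXct("2019-11-01 01:00", tz = "UTC")
in_slider_2 <- as.POSIXct("2019-11-01 06:00", tz = "UTC")

# between() is inclusive on both ends, so points exactly on the
# slider bounds are kept
kept <- df %>% filter(between(x, in_slider_1, in_slider_2))

n_kept <- nrow(kept)
kept_range <- range(kept$x)
```

Because every remaining x value lies inside the slider range, the scale limits have nothing left to turn into NA.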

Full code below:

library(shiny)
library(ggplot2)
library(scales)
library(dplyr)  # provides %>%, filter() and between() used below


ui <- navbarPage("Test",
                 tabPanel("Test_2",
                          fluidPage(
                            fluidRow(
                              column(width = 12, plotOutput("plot", width = 1200, height = 600))
                            ),
                            fluidRow(
                              column(width = 12, sliderInput("slider",
                                                            label = "Range [h]",
                                                            min = as.POSIXct("2019-11-01 00:00"),
                                                            max = as.POSIXct("2019-11-01 07:00"),
                                                            value = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 07:00"))))
                            ))))

server <- function(input, output, session) {

  df <- data.frame("x" = c(as.POSIXct("2019-11-01 00:00"),as.POSIXct("2019-11-01 01:00"),
                           as.POSIXct("2019-11-01 02:00"),as.POSIXct("2019-11-01 03:00"),
                           as.POSIXct("2019-11-01 04:00"),as.POSIXct("2019-11-01 05:00"),
                           as.POSIXct("2019-11-01 06:00"),as.POSIXct("2019-11-01 07:00")), 
                   "y" = c(0,1,2,3,4,5,6,7))

  observe({
    len_date_list <- length(df$x)

    min_merge_datetime <- df$x[1]
    max_merge_datetime <- df$x[len_date_list]

    updateSliderInput(session, "slider",
                      min = as.POSIXct(min_merge_datetime),
                      max = as.POSIXct(max_merge_datetime),
                      timeFormat = "%Y-%m-%d %H:%M")
  })

  output$plot <- renderPlot({

    in_slider_1 <- input$slider[1]
    in_slider_2 <- input$slider[2]

        ggplot(data=df %>% filter(between(x, in_slider_1, in_slider_2)), aes(x, y, group = 1)) +
        theme_bw() +
        geom_line(color="black", stat="identity") +
        # geom_point() +
        scale_x_datetime(labels = date_format("%m-%d %H:%M"),
                         limits = c(
                         as.POSIXct(in_slider_1),
                         as.POSIXct(in_slider_2)))
    })
}

shinyApp(server = server, ui = ui)

It looks like you could now actually remove the scale_x_datetime completely and just have:

        ggplot(data=df %>% filter(between(x, in_slider_1, in_slider_2)), aes(x, y, group = 1)) +
        theme_bw() +
        geom_line(color="black", stat="identity")

Upvotes: 2
