mrpargeter

Reputation: 329

R: na.locf not behaving as expected

I am using na.locf inside a mutate and getting a strange result. The data is sorted by date in descending order; where Scan is NA, the new column takes the value from na.locf, and otherwise it uses Scan itself. Most rows come back as expected, but one row returns the next non-NA value instead of the previous one. If I sort by date ascending and use na.rm = FALSE and fromLast = TRUE, it works as expected, but I want to understand why it fails when the data is sorted by date descending.

The example is as follows:

library(dplyr)   # mutate, arrange, %>%
library(tidyr)   # replace_na
library(zoo)     # na.locf

example = data.frame(Date = factor(c("1/14/15", "1/29/15", "2/3/15", 
    "2/11/15", "2/15/15", "3/4/15","3/7/15",  "3/7/15", "3/11/15", 
    "3/18/15", "3/21/15", "4/22/15", "4/22/15", "4/23/15", "5/6/15", 
    "5/13/15", "5/18/15", "5/24/15", "5/26/15", "5/28/15", "5/29/15", 
    "5/29/15", "6/25/15", "6/25/15","8/6/15",  "8/15/15", "8/20/15", 
    "8/22/15", "8/22/15", "8/29/15")),
   Scan = c(1, rep(NA, 21),2,rep(NA,7)),
   Hours = c(rep(NA,3), rep(3,3), NA, 2, rep(3,3), NA, 2, 3, 2, 
    rep(3,5), NA, 2, rep(c(NA, 3),2), 3, NA, 2, 3)
                   )
example %>% 
  mutate(
     date = as.Date(Date, "%m/%d/%y"),
     Hours = replace_na(Hours,0),
     scan_date = as.Date(ifelse(is.na(Scan), 
                            NA,
                            date),
                       origin="1970-01-01")) %>% 
  arrange(desc(date)) %>%
  mutate(
         scan_new = ifelse(is.na(Scan),
                na.locf(Scan), 
                Scan))

The problem is in row 24, where scan_new comes back as 1 rather than 2:

      Date Scan Hours       date  scan_date scan_new
23  3/7/15   NA     0 2015-03-07       <NA>        2
24  3/7/15   NA     2 2015-03-07       <NA>        1
25  3/4/15   NA     3 2015-03-04       <NA>        2

Interestingly, other rows that share a date are handled correctly, for example rows 18-19:

      Date Scan Hours       date  scan_date scan_new
18 4/22/15   NA     0 2015-04-22       <NA>        2
19 4/22/15   NA     2 2015-04-22       <NA>        2

For reference, as noted above, the following gives the expected answer:

example %>% 
  mutate(
     date = as.Date(Date, "%m/%d/%y"),
     Hours = replace_na(Hours,0),
     scan_date = as.Date(ifelse(is.na(Scan), 
                            NA,
                            date),
                       origin="1970-01-01")) %>% 
  arrange(date) %>%
  mutate(
         scan_new = ifelse(is.na(Scan),
                na.locf(Scan, na.rm = FALSE, fromLast = TRUE), 
                Scan))

      Date Scan Hours       date  scan_date scan_new
6   3/4/15   NA     3 2015-03-04       <NA>        2
7   3/7/15   NA     0 2015-03-07       <NA>        2
8   3/7/15   NA     2 2015-03-07       <NA>        2

Can someone tell me why this is behaving this way?

Upvotes: 2

Views: 1164

Answers (1)

mt1022

Reputation: 17289

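In your first attempt, na.locf(Scan) runs with the default na.rm = TRUE, so the leading NAs are dropped and the result is shorter than Scan (24 values instead of 30 once the six leading NAs are gone). ifelse then recycles that shorter vector to the full length, which shifts the filled values out of alignment with the rows. For reference, here are the results with na.rm = FALSE (or na.locf0, see comments), which keeps the original length: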

example %>% 
    mutate(
        date = as.Date(Date, "%m/%d/%y"),
        Hours = replace_na(Hours,0),
        scan_date = as.Date(ifelse(is.na(Scan), 
            NA,
            date),
            origin="1970-01-01")) %>% 
    arrange(desc(date)) %>%
    mutate(
        scan_new = ifelse(is.na(Scan),
            na.locf(Scan, na.rm = FALSE), 
            Scan))

#       Date Scan Hours       date  scan_date scan_new
# 1  8/29/15   NA     3 2015-08-29       <NA>       NA
# 2  8/22/15   NA     0 2015-08-22       <NA>       NA
# 3  8/22/15   NA     2 2015-08-22       <NA>       NA
# 4  8/20/15   NA     3 2015-08-20       <NA>       NA
# 5  8/15/15   NA     3 2015-08-15       <NA>       NA
# 6   8/6/15   NA     0 2015-08-06       <NA>       NA
# 7  6/25/15    2     0 2015-06-25 2015-06-25        2
# 8  6/25/15   NA     3 2015-06-25       <NA>        2
# 9  5/29/15   NA     0 2015-05-29       <NA>        2
# 10 5/29/15   NA     2 2015-05-29       <NA>        2
# 11 5/28/15   NA     3 2015-05-28       <NA>        2
# 12 5/26/15   NA     3 2015-05-26       <NA>        2
# 13 5/24/15   NA     3 2015-05-24       <NA>        2
# 14 5/18/15   NA     3 2015-05-18       <NA>        2
# 15 5/13/15   NA     3 2015-05-13       <NA>        2
# 16  5/6/15   NA     2 2015-05-06       <NA>        2
# 17 4/23/15   NA     3 2015-04-23       <NA>        2
# 18 4/22/15   NA     0 2015-04-22       <NA>        2
# 19 4/22/15   NA     2 2015-04-22       <NA>        2
# 20 3/21/15   NA     3 2015-03-21       <NA>        2
# 21 3/18/15   NA     3 2015-03-18       <NA>        2
# 22 3/11/15   NA     3 2015-03-11       <NA>        2
# 23  3/7/15   NA     0 2015-03-07       <NA>        2
# 24  3/7/15   NA     2 2015-03-07       <NA>        2
# 25  3/4/15   NA     3 2015-03-04       <NA>        2
# 26 2/15/15   NA     3 2015-02-15       <NA>        2
# 27 2/11/15   NA     3 2015-02-11       <NA>        2
# 28  2/3/15   NA     0 2015-02-03       <NA>        2
# 29 1/29/15   NA     0 2015-01-29       <NA>        2
# 30 1/14/15    1     0 2015-01-14 2015-01-14        1
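
To see the recycling in isolation, here is a minimal sketch on a bare vector. The names scan, filled, bad and good are just illustrative; scan is my reconstruction of the Scan column after arrange(desc(date)), based on your example data.

```r
library(zoo)

# Scan after sorting by date descending: six leading NAs, a 2, 22 NAs, then a 1
scan <- c(rep(NA, 6), 2, rep(NA, 22), 1)

# Default na.rm = TRUE drops the leading NAs, so the result is shorter
filled <- na.locf(scan)
length(filled)   # 24, not 30

# ifelse() silently recycles the 24 values to length 30, so they no longer
# line up with the rows they were computed for
bad <- ifelse(is.na(scan), filled, scan)
bad[23:25]       # 2 1 2 (the stray 1 lands in row 24)

# Keeping the length fixes the alignment; the leading NAs simply stay NA
good <- ifelse(is.na(scan), na.locf(scan, na.rm = FALSE), scan)
good[23:25]      # 2 2 2

# na.locf0() always returns a vector of the same length as its input
length(na.locf0(scan))   # 30
```

Since na.locf(Scan, na.rm = FALSE) already leaves the non-NA values untouched, the ifelse wrapper is not strictly needed; it is kept here only to mirror your code.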

Upvotes: 2
