user41509
user41509

Reputation: 1040

fill in NAs in dataframe between values

I have an example dataset

newdata<-data.frame(Tow.y=c(21,"NA","NA","NA","NA","NA",22,"NA","NA","NA","NA","NA",23,"NA","NA"),Tow=c("NA","NA","NA",21,"NA","NA","NA","NA",22,"NA","NA","NA","NA","NA",23))
newdata$Tow.y<-as.numeric(as.character(newdata$Tow.y))
newdata$Tow<-as.numeric(as.character(newdata$Tow))

newdata1<-newdata %>% 
  mutate(Station = coalesce(Tow.y, Tow))
newdata1

The resulting code produces:

Tow.y Tow Station
1     21  NA      21
2     NA  NA      NA
3     NA  NA      NA
4     NA  21      21
5     NA  NA      NA
6     NA  NA      NA
7     22  NA      22
8     NA  NA      NA
9     NA  22      22
10    NA  NA      NA
11    NA  NA      NA
12    NA  NA      NA
13    23  NA      23
14    NA  NA      NA
15    NA  23      23

I would like to fill in NAs for NAs between unique values in Station. So NAs in between the two 21 values would be 21, the NAs in between the 22s would be 22, etc. The NAs in between consecutive numbers would remain NAs.

Like this:

  Tow.y Tow Station
1     21  NA      21
2     NA  NA      21
3     NA  NA      21
4     NA  21      21
5     NA  NA      NA
6     NA  NA      NA
7     22  NA      22
8     NA  NA      22
9     NA  22      22
10    NA  NA      NA
11    NA  NA      NA
12    NA  NA      NA
13    23  NA      23
14    NA  NA      23
15    NA  23      23

I have tried the na.locf function in the zoo package, but that replaces all NA values.

newdata1$Station2<-na.locf(newdata1$Station,na.rm = F)

Other examples I have looked at show that you can use na.locf with a group variable, but I dont have a grouping variable that is complete for the data set. Does anyone have a method for filling in the NAs where I need them to be filled in.

Upvotes: 2

Views: 145

Answers (2)

akrun
akrun

Reputation: 886948

Based on the example, it seems that the 'Tow' and 'Tow.y' values match in a 'start', 'end' way. In that case, we can use base R methods.

Create a sequence index ('i1') to replicate the non-NA elements in 'Tow' (or 'Tow.y') for the 'Station' column. The 'lst' returns a list of numeric index, which is used to assign the values to 'Station'

lst <- do.call(Map, c(f = seq, unname(lapply(newdata,
          function(x) seq_along(x)[!is.na(x)]))))
i1 <- unlist(lst)    
newdata$Station[i1] <- rep(na.omit(newdata$Tow), lengths(lst))
newdata
#   Tow.y Tow Station
#1     21  NA      21
#2     NA  NA      21
#3     NA  NA      21
#4     NA  21      21
#5     NA  NA      NA
#6     NA  NA      NA
#7     22  NA      22
#8     NA  NA      22
#9     NA  22      22
#10    NA  NA      NA
#11    NA  NA      NA
#12    NA  NA      NA
#13    23  NA      23
#14    NA  NA      23
#15    NA  23      23

Or using the same logic with tidyverse

library(tidyverse)
newdata %>% 
    mutate_all(funs(row_number() * !is.na(.))) %>% 
    map( ~ .x[.x!=0]) %>% 
               transpose %>%
               map(reduce, `:`) %>% 
    set_names(na.omit(newdata$Tow)) %>%
    stack %>%
    right_join(newdata %>% mutate(values = row_number())) %>% 
    rename(Station = ind) %>%
    ungroup %>%

    select(names(newdata), everything(), -values)
#   Tow.y Tow Station
#1     21  NA      21
#2     NA  NA      21
#3     NA  NA      21
#4     NA  21      21
#5     NA  NA    <NA>
#6     NA  NA    <NA>
#7     22  NA      22
#8     NA  NA      22
#9     NA  22      22
#10    NA  NA    <NA>
#11    NA  NA    <NA>
#12    NA  NA    <NA>
#13    23  NA      23
#14    NA  NA      23
#15    NA  23      23

Upvotes: 1

Gregor Thomas
Gregor Thomas

Reputation: 145755

Here's a good way. I left the helper columns in to demonstrate how it works, but you can easily remove them with a select.

newdata1 %>%
  mutate(from_first = zoo::na.locf(Station, na.rm = FALSE),
         from_last = zoo::na.locf(Station, na.rm = FALSE, fromLast = TRUE),
         result = if_else(from_first == from_last, from_first, Station))
#    Tow.y Tow Station from_first from_last result
# 1     21  NA      21         21        21     21
# 2     NA  NA      NA         21        21     21
# 3     NA  NA      NA         21        21     21
# 4     NA  21      21         21        21     21
# 5     NA  NA      NA         21        22     NA
# 6     NA  NA      NA         21        22     NA
# 7     22  NA      22         22        22     22
# 8     NA  NA      NA         22        22     22
# 9     NA  22      22         22        22     22
# 10    NA  NA      NA         22        23     NA
# 11    NA  NA      NA         22        23     NA
# 12    NA  NA      NA         22        23     NA
# 13    23  NA      23         23        23     23
# 14    NA  NA      NA         23        23     23
# 15    NA  23      23         23        23     23

Upvotes: 4

Related Questions