hajoommah
hajoommah

Reputation: 27

Fill in text in adjacent blank variables till another text appears in R

I'm trying to paste texts over fill in empty variables till another text appears. I want to do this for a specific row

Current table:

var1 var2 var3 var3 var4
A textA textB
B 1 2 3 4
c 3 4 5 6

Desired output:

var1 var2 var3 var3 var4
A textA textA textB textB
B 1 2 3 4
c 3 4 5 6

What's an elegant way to do this? My current solution looks something like this but I'd like to use a logic instead of specifying a variable name like below:

mutate(var3=case_when(var1=="A" & is.na(var3) ~ var2))

Upvotes: 0

Views: 69

Answers (2)

akrun
akrun

Reputation: 887028

We may extract the row where 'var1' is "A", unlist and apply na.locf0 from zoo to replace the NA values with the previous non-NA value

library(zoo)
i1 <- df1$var1 == "A"
df1[i1,-1] <-  na.locf0(unlist(df1[i1,-1]))

-output

 df1
  var1  var2  var3 var3.1  var4
1    A textA textA  textB textB
2    B     1     2      3     4
3    c     3     4      5     6

Or with base R, create a numeric index based on the non-NA element (cumsum) and use the index to replicate the non-NA values from the extracted row

v1 <- unlist(df1[i1, -1])
df1[i1, -1] <- na.omit(v1)[cumsum(!is.na(v1))]

Or use tidyverse, to reshape to 'long' format (pivot_longer), apply fill to replace NA with previous non-NA and reshape back to wide with pivot_wider

library(dplyr)
library(tidyr)
df1 %>%
   pivot_longer(cols = -var1, values_transform = as.character) %>% 
   fill(value) %>% 
   pivot_wider(names_from = name, values_from = value)
# A tibble: 3 × 5
  var1  var2  var3  var3.1 var4 
  <chr> <chr> <chr> <chr>  <chr>
1 A     textA textA textB  textB
2 B     1     2     3      4    
3 c     3     4     5      6    

If there are only alternate NAs, an option is also

library(dplyover)
df1 %>% 
  mutate(across2(c(3, 5), c(2, 4), 
   ~ case_match(.x, NA ~ .y, .default = as.character(.x)),
    .names = "{xcol}"))

-output

  var1  var2  var3 var3.1  var4
1    A textA textA  textB textB
2    B     1     2      3     4
3    c     3     4      5     6

data

df1 <- structure(list(var1 = c("A", "B", "c"), var2 = c("textA", "1", 
"3"), var3 = c(NA, 2L, 4L), var3.1 = c("textB", "3", "5"), var4 = c(NA, 
4L, 6L)), class = "data.frame", row.names = c(NA, -3L))

Upvotes: 4

TarJae
TarJae

Reputation: 78917

Here is an option, but only for a few columns:

library(dplyr)
df %>% 
  mutate(var3 = ifelse(var3=="", var2, var3),
         var4 = ifelse(var4=="", var3.1, var4))
  var1  var2  var3  var3.1 var4 
  <chr> <chr> <chr> <chr>  <chr>
1 A     textA textA textB  textB
2 B     1     2     3      4    
3 c     3     4     5      6  

Upvotes: 1

Related Questions