Reputation: 217

Automatically fill in a column in R

I have a data frame (the code to create it is below) with asterisks in some cells of the "sig" column.

I want to fill in asterisks in the empty cells of the sig column in every row above the lowest row that has an asterisk. In this case that means every row from "h" upward, so rows "a" through "h" should all have an asterisk and rows "i" through "k" should stay empty.

I'm thinking a for loop that finds the lowest row with an asterisk and then fills in the empty cells above it might be the way to go, but I'm not sure how to code it.

To reproduce the problem, the data frame can be created in R with:

df <- data.frame("variable" = c("a","b","c","d","e","f","g","h","i","j","k"),
                 "value" = c(0.04,0.03,0.04,0.02,0.03,0.02,0.02,0.01,0.04,0.1,0.02),
                 "sig" = c("*","*","*","","*","*","","*","","",""))

Any help would be greatly appreciated - thanks!

Upvotes: 0

Answers (3)

Vinícius Félix

Reputation: 8836

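Another solution is to use fill() from tidyr, but you first need to change "" to NA, because fill() only fills in missing values. Note that the rows below the last asterisk then end up as NA rather than "".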

Libraries

library(tidyverse)

Data

df <-
  data.frame("variable"= c("a","b","c","d","e","f","g","h","i","j","k"),
             "value" = c(0.04,0.03,0.04,0.02,0.03,0.02,0.02,0.01,0.04,0.1,0.02), 
             "sig" = c("*","*","*","","*","*","","*","","",""))

Code

df %>%
  mutate(sig = if_else(sig == "", NA_character_, sig)) %>%
  fill(sig, .direction = "up")

Output

   variable value  sig
1         a  0.04    *
2         b  0.03    *
3         c  0.04    *
4         d  0.02    *
5         e  0.03    *
6         f  0.02    *
7         g  0.02    *
8         h  0.01    *
9         i  0.04 <NA>
10        j  0.10 <NA>
11        k  0.02 <NA>
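
If the trailing rows should stay as empty strings rather than NA, one possible follow-up is to convert the remaining NA values back afterwards. This is only a sketch: the name `filled` is made up for illustration, and `na_if()` is swapped in for the `if_else()` call above as a shorter way to do the same conversion. It also relies on "*" being the only non-empty value in the column, since `fill()` copies whatever value sits below.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  variable = c("a","b","c","d","e","f","g","h","i","j","k"),
  value = c(0.04,0.03,0.04,0.02,0.03,0.02,0.02,0.01,0.04,0.1,0.02),
  sig = c("*","*","*","","*","*","","*","","","")
)

filled <- df %>%
  mutate(sig = na_if(sig, "")) %>%     # "" -> NA so fill() treats it as missing
  fill(sig, .direction = "up") %>%     # copy the next non-missing value upward
  mutate(sig = replace_na(sig, ""))    # rows below the last "*" go back to ""

filled
```

Rows "i" through "k" have no asterisk below them, so `fill()` leaves them as NA and `replace_na()` turns them back into "".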

Upvotes: 2

cazman

Reputation: 1492

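Another way, in base R, is to find the index of the last asterisk and assign "*" to every row up to it (this assumes sig contains at least one "*"):

df[1:max(which(df$sig == "*")), "sig"] <- "*"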

Gives:

   variable value sig
1         a  0.04   *
2         b  0.03   *
3         c  0.04   *
4         d  0.02   *
5         e  0.03   *
6         f  0.02   *
7         g  0.02   *
8         h  0.01   *
9         i  0.04    
10        j  0.10    
11        k  0.02
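
If the column might contain no asterisk at all, `which()` returns an empty vector, `max()` of that is `-Inf`, and `1:-Inf` fails. A minimal sketch of a guarded version follows; the helper name `fill_sig_up` and its `mark` argument are hypothetical, not part of the answer above.

```r
df <- data.frame(
  variable = c("a","b","c","d","e","f","g","h","i","j","k"),
  value = c(0.04,0.03,0.04,0.02,0.03,0.02,0.02,0.01,0.04,0.1,0.02),
  sig = c("*","*","*","","*","*","","*","","","")
)

# Hypothetical helper: mark every element up to and including the last `mark`.
fill_sig_up <- function(sig, mark = "*") {
  hits <- which(sig == mark)            # positions that already hold the mark
  if (length(hits) > 0) {
    sig[seq_len(max(hits))] <- mark     # overwrite positions 1..last hit
  }
  sig                                   # unchanged when there are no hits
}

df$sig <- fill_sig_up(df$sig)
df
```

With no asterisk in the input, the vector comes back unchanged instead of raising an error.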

Upvotes: 3

akrun

Reputation: 887971

We could use replace(), taking the index of the last element equal to "*" and overwriting every position up to it:

library(dplyr)
df <- df %>%
    mutate(sig = replace(sig, seq(tail(which(sig == "*"), 1)), "*"))

-output

df
   variable value sig
1         a  0.04   *
2         b  0.03   *
3         c  0.04   *
4         d  0.02   *
5         e  0.03   *
6         f  0.02   *
7         g  0.02   *
8         h  0.01   *
9         i  0.04    
10        j  0.10    
11        k  0.02