Vedda
Vedda

Reputation: 7435

Fill in missing data that is same across columns

I need to fill in some missing data from a merge that is the same in all columns. After the merge, all the values are NA, but i would like a quick way to fill them in since their values are the same.

Example:

df <- structure(list(date = structure(c(-25932, -25931, -25930, -25929, 
-25928), class = "Date"), year = c(1899, 1899, 1899, 1899, 1899
), month = c(1, 1, 1, 1, 1), day = c(1, 2, 3, 4, 5), test1 = c(NA, 
NA, "VAR1", NA, NA), test2 = c(NA, NA, "VAR2", NA, NA), test3 = c(NA, 
NA, "VAR3", NA, NA)), .Names = c("date", "year", "month", "day", 
"test1", "test2", "test3"), row.names = c(NA, 5L), class = "data.frame")

# Tedious way, but works
df$test1 <- "VAR1"

# Desired output
    date     year month day test1 test2 test3
1 1899-01-01 1899     1   1  VAR1  VAR2  VAR3
2 1899-01-02 1899     1   2  VAR1  VAR2  VAR3
3 1899-01-03 1899     1   3  VAR1  VAR2  VAR3
4 1899-01-04 1899     1   4  VAR1  VAR2  VAR3
5 1899-01-05 1899     1   5  VAR1  VAR2  VAR3

Upvotes: 0

Views: 66

Answers (2)

akrun
akrun

Reputation: 887118

We can also use data.table. Convert the 'data.frame' to 'data.table' (setDT(df)). Based on the index of 'test' columns ('nm1'), we loop with for and set the NA elements by the non-NA elements in each column.

library(data.table)
nm1 <- grep('^test', names(df))
setDT(df)
for(j in nm1){
  set(df, i=which(is.na(df[[j]])), j=j, value= na.omit(df[[j]]))
 }
df
#          date year month day test1 test2 test3
#1: 1899-01-01 1899     1   1  VAR1  VAR2  VAR3
#2: 1899-01-02 1899     1   2  VAR1  VAR2  VAR3
#3: 1899-01-03 1899     1   3  VAR1  VAR2  VAR3
#4: 1899-01-04 1899     1   4  VAR1  VAR2  VAR3
#5: 1899-01-05 1899     1   5  VAR1  VAR2  VAR3

Upvotes: 2

A5C1D2H2I1M1N2O1R2T1
A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

You can try something like the following:

df
#         date year month day test1 test2 test3
# 1 1899-01-01 1899     1   1  <NA>  <NA>  <NA>
# 2 1899-01-02 1899     1   2  <NA>  <NA>  <NA>
# 3 1899-01-03 1899     1   3  VAR1  VAR2  VAR3
# 4 1899-01-04 1899     1   4  <NA>  <NA>  <NA>
# 5 1899-01-05 1899     1   5  <NA>  <NA>  <NA>

df[grep("test", names(df))] <- lapply(df[grep("test", names(df))], 
                                      function(x) x[!is.na(x)][1])
df
#         date year month day test1 test2 test3
# 1 1899-01-01 1899     1   1  VAR1  VAR2  VAR3
# 2 1899-01-02 1899     1   2  VAR1  VAR2  VAR3
# 3 1899-01-03 1899     1   3  VAR1  VAR2  VAR3
# 4 1899-01-04 1899     1   4  VAR1  VAR2  VAR3
# 5 1899-01-05 1899     1   5  VAR1  VAR2  VAR3

Upvotes: 2

Related Questions