shivani gupta
shivani gupta

Reputation: 91

Replace missing values in one table to values in another table by joining column name to row

I am trying to replace missing values in one table to values in another table by joining column name to row. Below is an example:

df1

A  B  C  D
1  3  4  *
4  *  5  9
0  *  2  *
1  2  *  4

df2

Variable  Value
A  2
B  1
C  9
D  0

Result data set:

A  B  C  D
1  3  4  0
4  1  5  9
0  1  2  0
1  2  9  4

Upvotes: 2

Views: 663

Answers (4)

zx8754
zx8754

Reputation: 56004

We can create lookup table then update on match:

# make a lookup table same size as df1
df2Lookup <-
  matrix(rep(df2$Value, nrow(df1)), nrow = nrow(df1), byrow = TRUE)

# then update on "*"
df1[ df1 == "*" ] <- df2Lookup[ df1 == "*" ]

#result
df1
#   A B C D
# 1 1 3 4 0
# 2 4 1 5 9
# 3 0 1 2 0
# 4 1 2 9 4

Upvotes: 1

Sotos
Sotos

Reputation: 51582

Another option using stack and unstack,

d1 <- stack(df)
d1$values[d1$values == '*'] <- df1$Value[match(d1$ind, df1$Variable)][d1$values == '*']
unstack(d1, values ~ ind)
#  A B C D
#1 1 3 4 0
#2 4 1 5 9
#3 0 1 2 0
#4 1 2 9 4

DATA

dput(df)
structure(list(A = c(1, 4, 0, 1), B = c("3", "*", "*", "2"), 
    C = c("4", "5", "2", "*"), D = c("*", "9", "*", "4")), .Names = c("A", 
"B", "C", "D"), row.names = c(NA, -4L), class = "data.frame")

dput(df1)
structure(list(Variable = c("A", "B", "C", "D"), Value = c(2L, 
1L, 9L, 0L)), .Names = c("Variable", "Value"), row.names = c(NA, 
-4L), class = "data.frame")

Upvotes: 4

akrun
akrun

Reputation: 886948

We can use Map

df1[as.character(df2$Variable)] <- Map(function(x, y)
    replace(x, is.na(x), y), df1[as.character(df2$Variable)], df2$Value)

If the values are not NA and just * then

df1[as.character(df2$Variable)] <- Map(function(x, y)
    replace(x, x=="*", y), df1[as.character(df2$Variable)], df2$Value)
df1
#  A B C D
#1 1 3 4 0
#2 4 1 5 9
#3 0 1 2 0
#4 1 2 9 4

If the dataset 'df1' is not character, then do

df1[] <- as.matrix(df1)

data

df1 <- structure(list(A = c(1L, 4L, 0L, 1L), B = c("3", "*", "*", "2"
 ), C = c("4", "5", "2", "*"), D = c("*", "9", "*", "4")), .Names = c("A", 
 "B", "C", "D"), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(Variable = c("A", "B", "C", "D"), Value = c(2L, 
 1L, 9L, 0L)), .Names = c("Variable", "Value"), class = "data.frame",
  row.names = c(NA, -4L))

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 388817

Finding out the column names of "*" and matching that with Variable column in df2 and extracting the corresponding Value.

x <- which(df1=="*", arr.ind = TRUE)
df1[x] <- df2$Value[match(names(df1)[x[, 2]], df2$Variable)]

#  A B C D
#1 1 3 4 0
#2 4 1 5 9
#3 0 1 2 0
#4 1 2 9 4

This is assuming you have character columns in df1, if they are not convert them by

df1[] <- lapply(df1, as.character)

Upvotes: 2

Related Questions