stakowerflol

Reputation: 1079

Map One data.frame to another

Given df1, whose columns are a subset of df2's (it has fewer columns):

df1 <- data.frame(Species = letters[1:10])
df2 <- iris

I want to map df1 so that it has the same number of columns as df2, with the same colnames. My solution:

mapDf <- function(df, dfToMap) {

  result <- data.frame(matrix(ncol = ncol(dfToMap), nrow = nrow(df)))
  colnames(result) <- colnames(dfToMap)

  for(c in colnames(dfToMap)) {
    if(c %in% colnames(df)) {
      result[, c] <- df[, c]
    }
  }

  result
}

Test:

mapDf(df1, df2)

Any ideas how to simplify it?

Upvotes: 1

Views: 25

Answers (2)

Frank

Reputation: 66819

With data.table:

library(data.table)
res = setDT(df2[NA_integer_, ])[df1, on=names(df1)]
setcolorder(res, names(df2))

    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
 1:           NA          NA           NA          NA       a
 2:           NA          NA           NA          NA       b
 3:           NA          NA           NA          NA       c
 4:           NA          NA           NA          NA       d
 5:           NA          NA           NA          NA       e
 6:           NA          NA           NA          NA       f
 7:           NA          NA           NA          NA       g
 8:           NA          NA           NA          NA       h
 9:           NA          NA           NA          NA       i
10:           NA          NA           NA          NA       j

(I'm not sure if the setcolorder is necessary or if res will already have the same col order as df2.)

The idea for a "missing slice" of an object x[NA_integer_,] or an "empty slice" x[0L,] can also be found in the vetr package, where the latter is referred to as a "template". (I'm not sure whether that'll be useful for OP's use-case or not.)
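For comparison, the "missing slice" idea can also be sketched in base R, without data.table. This is only an illustration under the question's setup (df1 and df2 as defined there); the names template, shared and empty are made up for the example.

```r
# Objects from the question
df1 <- data.frame(Species = letters[1:10])
df2 <- iris

# "Missing slice": one all-NA row per row of df1, keeping df2's
# column names and column types
template <- df2[rep(NA_integer_, nrow(df1)), ]
rownames(template) <- NULL

# Overwrite only the columns that both data frames share
shared <- intersect(names(df2), names(df1))
template[shared] <- df1[shared]

# "Empty slice": zero rows, same columns and types as df2
empty <- df2[0L, ]

template
```

Unlike a result built from matrix(), the untouched columns here keep df2's types (the Sepal and Petal columns stay numeric), which may or may not matter for OP.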

Upvotes: 3

Ronak Shah

Reputation: 388982

You could simplify it with

map_data_frames <- function(df1, df2, fill = NA) {
   cols <- colnames(df2) %in% colnames(df1)
   df1[names(df2[!cols])] <- fill
   df1[, colnames(df2)]  # from @zx8754 in comments
}

map_data_frames(df1, df2)

#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1            NA          NA           NA          NA       a
#2            NA          NA           NA          NA       b
#3            NA          NA           NA          NA       c
#4            NA          NA           NA          NA       d
#5            NA          NA           NA          NA       e
#6            NA          NA           NA          NA       f
#7            NA          NA           NA          NA       g
#8            NA          NA           NA          NA       h
#9            NA          NA           NA          NA       i
#10           NA          NA           NA          NA       j
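The fill argument is not exercised above. A quick sketch of how it could be used, repeating the function so the snippet runs on its own; the value 0 and the name res are chosen only for illustration.

```r
# Objects from the question
df1 <- data.frame(Species = letters[1:10])
df2 <- iris

# Same function as above, repeated so this snippet is self-contained
map_data_frames <- function(df1, df2, fill = NA) {
   cols <- colnames(df2) %in% colnames(df1)
   df1[names(df2[!cols])] <- fill
   df1[, colnames(df2)]
}

# Fill the columns missing from df1 with 0 instead of NA
res <- map_data_frames(df1, df2, fill = 0)
head(res, 3)
```

Every column that df1 lacks gets the same fill value, so a single scalar fill only makes sense when those columns are meant to share a type.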

Upvotes: 3
