Reputation: 5169
I have the following data frame:
library(dplyr)
library(tibble)
df <- tibble(
source = c("a", "b", "b"),
target = c("a", "b", "c"),
score = c(10, 5, 3) ) %>%
pivot_wider(names_from = target, values_from = score) %>%
as.data.frame() %>%
column_to_rownames(var = "source")
df
It looks like this
a b c
a 10 NA NA
b NA 5 3
Now the column and row names are incomplete. The complete names is given in this vector:
master_source_target <- c("a", "b","c","d")
I want to complete in the column and row of the data frame based on that vector so that the end it will look like this:
a b c d
a 10 NA NA NA
b NA 5 3 NA
c NA NA NA NA
d NA NA NA NA
How can I achieve that?
Upvotes: 1
Views: 269
Reputation: 886938
It is easier with a matrix
of NA
s and then do the assignment based on the dimnames of 'df'
new <- matrix(NA, length(master_source_target),
length(master_source_target),
dimnames = list(master_source_target, master_source_target))
new[row.names(df), colnames(df)] <- unlist(df)
new
# a b c d
#a 10 NA NA NA
#b NA 5 3 NA
#c NA NA NA NA
#d NA NA NA NA
As the OP used tibble
, we can also make use of tidyverse approachess
library(tibble)
library(dplyr)
library(tidyr)
df %>%
rownames_to_column('rn') %>%
pivot_longer(cols = -rn, values_drop_na = TRUE) %>%
complete(rn = master_source_target, name = master_source_target) %>%
pivot_wider(names_from = name, values_from = value) %>%
column_to_rownames('rn')
Or with crossing
and a join
crossing(Var1 = master_source_target, Var2 = master_source_target) %>%
left_join(as.data.frame.table(as.matrix(df))) %>%
pivot_wider(names_from = Var2, values_from = Freq) %>%
column_to_rownames('Var1')
Upvotes: 1
Reputation: 34291
You can do it in a single step:
df[setdiff(master_source_target, rownames(df)), master_source_target] <- NA
df
a b c d
a 10 NA NA NA
b NA 5 3 NA
c NA NA NA NA
d NA NA NA NA
Upvotes: 1
Reputation: 388807
You could add missing column names first and then rbind
missing rows with NA
.
df[setdiff(master_source_target, names(df))] <- NA
rbind(df,matrix(NA, nrow = length(master_source_target) - nrow(df), ncol = ncol(df),
dimnames = list(setdiff(master_source_target, rownames(df)), names(df))))
# a b c d
#a 10 NA NA NA
#b NA 5 3 NA
#c NA NA NA NA
#d NA NA NA NA
Upvotes: 1