Ville Lehtonen

Reputation: 93

Append row names from one dataframe to another with different dimensions in R

I need help adding row names from one dataframe to another.

For the sake of simplicity, say I have two dataframes (df1 and df2) with different dimensions (df1 is 3x3 and df2 is 5x5). In reality my dataframes are a lot bigger (thousands of rows and columns).

df1 <- data.frame("rownames" = c("A", "B", "C"), "a1" = c(0,1,2), "a2" = c(2,0,1), "a3" = c(0,0,1), row.names = "rownames")

df2 <- data.frame("rownames" = c("A", "B", "D", "E", "F"), "a1" = c(1,1,2,2,0), a2 = c(2,0,0,1,0), a3 = c(1,0,2,3,0), a4 = c(1,1,0,0,1), a5 = c(0,0,0,0,0), row.names = "rownames")

What I'd like to do is extend df1 with the row names "D", "E", and "F", which are in df2 but not in df1, with the values of the columns ("a1", "a2", "a3") set to zero for those new rows.

So the input would be the two dataframes:

df1
  a1 a2 a3
A  0  2  0
B  1  0  0
C  2  1  1

df2
  a1 a2 a3 a4 a5
A  1  2  1  1  0
B  1  0  0  1  0
D  2  0  2  0  0
E  2  1  3  0  0
F  0  0  0  1  0

and the desired output would be:

  a1 a2 a3
A  0  2  0
B  1  0  0
C  2  1  1
D  0  0  0
E  0  0  0
F  0  0  0

Thank you!

Upvotes: 0

Views: 775

Answers (2)

akrun

Reputation: 887991

We can use %in% with negation (!) to pick the row names of df2 that are not in df1, and assign 0 to those rows:

df1[row.names(df2)[!row.names(df2) %in% row.names(df1)], ] <- 0
df1
#  a1 a2 a3
#A  0  2  0
#B  1  0  0
#C  2  1  1
#D  0  0  0
#E  0  0  0
#F  0  0  0
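To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps, using the df1 and df2 from the question. The intermediate names in_df1 and missing_rows are only illustrative and are not part of the original answer. Note that %in% binds tighter than !, so !x %in% y is read as !(x %in% y).

```r
# Data from the question, with row names passed directly
df1 <- data.frame(a1 = c(0, 1, 2), a2 = c(2, 0, 1), a3 = c(0, 0, 1),
                  row.names = c("A", "B", "C"))
df2 <- data.frame(a1 = c(1, 1, 2, 2, 0), a2 = c(2, 0, 0, 1, 0),
                  a3 = c(1, 0, 2, 3, 0), a4 = c(1, 1, 0, 0, 1),
                  a5 = c(0, 0, 0, 0, 0),
                  row.names = c("A", "B", "D", "E", "F"))

# Which row names of df2 already exist in df1?
in_df1 <- row.names(df2) %in% row.names(df1)
# TRUE TRUE FALSE FALSE FALSE

# Keep only the names that are missing from df1
missing_rows <- row.names(df2)[!in_df1]
# "D" "E" "F"

# Assigning to row names that do not exist yet creates those rows
df1[missing_rows, ] <- 0
```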

Upvotes: 1

Ronak Shah

Reputation: 389355

If you know that df1 is going to be the smaller dataframe and df2 the bigger one, you can do:

df1[setdiff(rownames(df2), rownames(df1)), ] <- 0
df1

#  a1 a2 a3
#A  0  2  0
#B  1  0  0
#C  2  1  1
#D  0  0  0
#E  0  0  0
#F  0  0  0

If you have to programmatically determine which dataframe is bigger and which is smaller, you can test it with an if condition (comparing the number of rows):

if(nrow(df1) > nrow(df2)) {
  small_df <- df2
  big_df <- df1
} else {
  small_df <- df1
  big_df <- df2
}

small_df[setdiff(rownames(big_df), rownames(small_df)), ] <- 0
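If this has to be done for several pairs of dataframes, the same idea could be wrapped in a small helper. This is only a sketch: pad_rows and its fill argument are hypothetical names introduced here, not something from base R or from the answers above. It returns a modified copy and leaves its input untouched, and it skips the assignment when there is nothing to add.

```r
# Hypothetical helper: add rows to `target` for every row name that
# appears in `reference` but not in `target`, filled with `fill`.
pad_rows <- function(target, reference, fill = 0) {
  # setdiff keeps the order in which the names appear in `reference`
  missing_rows <- setdiff(rownames(reference), rownames(target))
  if (length(missing_rows) > 0) {
    target[missing_rows, ] <- fill
  }
  target
}

# Usage with the data from the question
df1 <- data.frame(a1 = c(0, 1, 2), a2 = c(2, 0, 1), a3 = c(0, 0, 1),
                  row.names = c("A", "B", "C"))
df2 <- data.frame(a1 = c(1, 1, 2, 2, 0), a2 = c(2, 0, 0, 1, 0),
                  a3 = c(1, 0, 2, 3, 0), a4 = c(1, 1, 0, 0, 1),
                  a5 = c(0, 0, 0, 0, 0),
                  row.names = c("A", "B", "D", "E", "F"))

padded <- pad_rows(df1, df2)
```

Because the function works on a copy, df1 still has its original three rows afterwards; assign the result back to df1 if the in-place behaviour of the answers above is wanted.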

Upvotes: 2
