KLM117

Reputation: 467

Match every row in a data frame to each row in another data frame in R

This might be a simple question, but I couldn't seem to find an obvious solution.

I have two data frames, df1 with 64 rows and df2 with 662,343 rows. I want to join df1 to df2 so that every row in df1 is paired with each row in df2, giving 64 × 662,343 = 42,389,952 rows. df1 and df2 might look like this, respectively:

df1:

| Cancer   | ID   |
|----------|------|
| Sarcoma  | 3435 |
| Leukemia | 4465 |

df2:

| Gene |
|------|
| TP53 |

New data frame:

| Cancer   | ID   | Gene |
|----------|------|------|
| Sarcoma  | 3435 | TP53 |
| Leukemia | 4465 | TP53 |

Thanks in advance for any help!

Upvotes: 1

Views: 2314

Answers (3)

akrun

Reputation: 887108

We may use merge(). When the two data frames share no column names, it returns their Cartesian product:

merge(df2, df, all = TRUE)

-output

A B X
1 a X 1
2 b Y 1
3 c Z 1
4 a X 2
5 b Y 2
6 c Z 2

data

df <- data.frame(X = c(1, 2))

df2 <- data.frame(A = letters[1:3],
                  B = LETTERS[24:26])
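
As a minimal sketch of the same idea applied to the question's data, passing by = NULL makes the intent explicit: merge() documents that an empty by yields the Cartesian product. The data frames below are toy stand-ins built from the values shown in the question, not the real 64-row and 662,343-row inputs.

```r
# Toy stand-ins for the question's data frames (values copied from the question)
df1 <- data.frame(Cancer = c("Sarcoma", "Leukemia"), ID = c(3435, 4465))
df2 <- data.frame(Gene = "TP53")

# by = NULL means "no key columns", so merge() pairs every row of df1
# with every row of df2
crossed <- merge(df1, df2, by = NULL)

# The result has one row per (df1 row, df2 row) pair
nrow(crossed) == nrow(df1) * nrow(df2)
```

This does not depend on the two data frames having disjoint column names, which the plain merge(df2, df) call relies on.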

Upvotes: 2

AnilGoyal

Reputation: 26218

You can use full_join() without any matching column by passing by = character() as the by argument. Demo:

df <- data.frame(X = c(1, 2))

df2 <- data.frame(A = letters[1:3],
                  B = LETTERS[24:26])
df
#>   X
#> 1 1
#> 2 2
df2
#>   A B
#> 1 a X
#> 2 b Y
#> 3 c Z

dplyr::full_join(df2, df, by = character())
#>   A B X
#> 1 a X 1
#> 2 a X 2
#> 3 b Y 1
#> 4 b Y 2
#> 5 c Z 1
#> 6 c Z 2

Created on 2021-06-26 by the reprex package (v2.0.0)
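
For readers on a newer dplyr, here is a hedged sketch of the same demo using cross_join(). It assumes dplyr 1.1.0 or later is installed, where cross_join() was added and by = character() was deprecated in its favour. On the dplyr version current when this answer was written, the call above is the one to use.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides cross_join()

df  <- data.frame(X = c(1, 2))
df2 <- data.frame(A = letters[1:3],
                  B = LETTERS[24:26])

# Pair every row of df2 with every row of df; no key columns are involved
res <- cross_join(df2, df)

# 3 rows in df2 times 2 rows in df
nrow(res)
```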

Upvotes: 6

Karthik S

Reputation: 11584

I think you are looking for a Cartesian product (cross join), not a left join:

library(tidyr)
expand_grid(df1,df2)
# A tibble: 2 x 3
  Cancer      ID Gene 
  <chr>    <dbl> <chr>
1 Sarcoma   3435 TP53 
2 Leukemia  4465 TP53 
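
To illustrate what a Cartesian product does under the hood, here is a rough base R sketch that builds the same pairing from row indices. The names i, j and crossed are illustrative only, and the data frames are toy stand-ins using the question's values. The last line reproduces the row count stated in the question for the real inputs.

```r
# Toy stand-ins for the question's data frames
df1 <- data.frame(Cancer = c("Sarcoma", "Leukemia"), ID = c(3435, 4465))
df2 <- data.frame(Gene = "TP53")

# Each df1 row index is repeated once per df2 row; df2 row indices cycle
i <- rep(seq_len(nrow(df1)), each = nrow(df2))
j <- rep(seq_len(nrow(df2)), times = nrow(df1))

# Bind the repeated rows side by side and reset the row names
crossed <- cbind(df1[i, , drop = FALSE], df2[j, , drop = FALSE])
rownames(crossed) <- NULL

# Expected size for the real data described in the question
64 * 662343
```

A result of roughly 42 million rows is held entirely in memory, so it may be worth checking the expected size this way before running the join on the full data.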

Upvotes: 2
