Nancy
Nancy

Reputation: 101

Adding columns and insert info from a second dataframe R

hello everyone I have two dataframes and I'd like to join information from one df to another one in a specific way. I'm gonna explain better. Here is my first df where i'd like to add 6 columns (general col named col1, col2 and so on..):

           res1 res4 aa1234 
1 AAAAAA    1    4   IVGG       
2 AAAAAA    8   11   RPRQ        
3 AAAAAA   10   13   RQFP        
4 AAAAAA   12   15   FPFL  
5 AAAAAA   20   23   NQGR      
6 AAAAAA   32   35   HARF    

here is the 2nd df:

   res1     dist
    1     3.711846
    1     3.698985
    2     4.180874
    2     3.112819
    3     3.559737
    3     3.722107
    4     3.842375
    4     3.914970
    5     3.361647
    5     2.982788
    6     3.245118
    6     3.224230
    7     3.538315
    7     3.602273
    8     3.185184
    8     2.771583
    9     4.276871
    9     3.157737
    10    3.933783
    10    2.956738


     

Considering "res1" I'd like to add to the 1st df in my new 6 columns the first 6th values contained in "dist" of second df corresponding to res1 = 1. After, in the 1st df I have res1 = 8, so I'd like to add in the new 6 columns the 6 values from res1 = 8 contained in "dist" of 2nd df.

I'd like to obtain something like this

res1 res4   aa1234  col1 col2 col3 col4 col5 col6
      1    4   IVGG    3.71 3.79 4.18 3.11 3.55 3.72
      8   11   RPRQ    3.18 2.77 4.27 3.15 3.93 2.95
     10   13   RQFP    
     12   15   FPFL        
     20   23   NQGR   
     32   35   HARF        

Please consider that I have to do it on a large dataset and for 1000 and more files... thanks!

Upvotes: 2

Views: 95

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388862

You could create a sequence from res1 to res4 and then join the data with pdb.

library(tidyverse)

turn %>%
  mutate(res = map2(res1, res4, seq)) %>%
  unnest(res) %>%
  left_join(pdb, by = c('res' = 'res1')) %>%
  group_by(res1 = as.character(res1)) %>%
  mutate(col = paste0('col', row_number())) %>%
  select(-res4, -res, -eleno) %>%
  pivot_wider(names_from = col, values_from = dist)

Upvotes: 1

akrun
akrun

Reputation: 887008

We can use rowid from data.table

library(dplyr)
library(tidyr)
library(data.table)
library(stringr)
df2 %>%
    mutate(col = str_c("col", rowid(res1))) %>%
    pivot_wider(names_from = col, values_from = dist) %>%
    right_join(df1, by = 'res1')

-output

# A tibble: 6 x 4
#   res1  col1  col2  res4
#  <int> <dbl> <dbl> <int>
#1     1  3.71  3.70     4
#2     8  3.19  2.77    11
#3    10  3.93  2.96    13
#4    12 NA    NA       15
#5    20 NA    NA       23
#6    32 NA    NA       35

data

df1 <- structure(list(res1 = c(1L, 8L, 10L, 12L, 20L, 32L), res4 = c(4L, 
11L, 13L, 15L, 23L, 35L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

df2 <- structure(list(res1 = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 
6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L), dist = c(3.711846, 
3.698985, 4.180874, 3.112819, 3.559737, 3.722107, 3.842375, 3.91497, 
3.361647, 2.982788, 3.245118, 3.22423, 3.538315, 3.602273, 3.185184, 
2.771583, 4.276871, 3.157737, 3.933783, 2.956738)), class = "data.frame",
row.names = c(NA, 
-20L))

Upvotes: 1

Related Questions