adajam
adajam

Reputation: 221

How can I match the colnames of one df with the values found in another - and if found, rename those cols with further values?

I have a "main_df" along the lines of this, where each colname consists of a unqiue url code:

          main_df <- tibble("eW1BRoUDV4BKQMl" = c("word", "word", "word", "word", "word"),
                            "5zKTGwHlwlzpssB" = c("word", "word", "word", "word", "word"),
                            "55SVfoQudZJNCFT" = c("word", "word", "word", "word", "word"),
                            "bOORR1zuKYSnAe9" = c("word", "word", "word", "word", "word"),
                            "6RrOQfDZim81pHv" = c("word", "word", "word", "word", "word"))

# A tibble: 5 x 5
  eW1BRoUDV4BKQMl `5zKTGwHlwlzpssB` `55SVfoQudZJNCFT` bOORR1zuKYSnAe9 `6RrOQfDZim81pHv`
  <chr>           <chr>             <chr>             <chr>           <chr>            
1 word            word              word              word            word             
2 word            word              word              word            word             
3 word            word              word              word            word             
4 word            word              word              word            word             
5 word            word              word              word            word  

I also have a second "reference_df" that also contains the same unique urls, along with their corresponding sensible "item" name:

          reference_df <- tibble(item = c("ashtray", "bell", "blouse", "boot", "bottle"), 
                           url = c("eW1BRoUDV4BKQMl", "5zKTGwHlwlzpssB", "55SVfoQudZJNCFT", "bOORR1zuKYSnAe9", "6RrOQfDZim81pHv"))

# A tibble: 5 x 2
  item    url            
  <chr>   <chr>          
1 ashtray eW1BRoUDV4BKQMl
2 bell    5zKTGwHlwlzpssB
3 blouse  55SVfoQudZJNCFT
4 boot    bOORR1zuKYSnAe9
5 bottle  6RrOQfDZim81pHv

I need to match the colnames of "main_df" with the values in "reference_df$url", and if found, replace the colname of "main_df" with "reference_df$item". The desired output of my "main_df" would therefore be:

    # A tibble: 5 x 5
  ashtray bell  blouse boot  bottle
  <chr>   <chr> <chr>  <chr> <chr> 
1 word    word  word   word  word  
2 word    word  word   word  word  
3 word    word  word   word  word  
4 word    word  word   word  word  
5 word    word  word   word  word  

I could probably do this with a for loop, but I was wondering if there's a way around that. A tidyverse solution would be preferable - I'm pretty sure I should be using apply or map_df(), but I just can't figure it out.

Any help would be much appreciated!

Upvotes: 0

Views: 31

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389135

We could match the column name of main_df with url of reference_df and change the names of main_df.

names(main_df) <- reference_df$item[match(names(main_df), reference_df$url)]

# ashtray bell  blouse boot  bottle
#  <chr>   <chr> <chr>  <chr> <chr> 
#1 word    word  word   word  word  
#2 word    word  word   word  word  
#3 word    word  word   word  word  
#4 word    word  word   word  word  
#5 word    word  word   word  word  

A dplyr way would be to arrange columns based on url in reference_df and rename.

library(dplyr)

main_df %>% select(reference_df$url) %>% rename_all(~reference_df$item)

Upvotes: 2

Related Questions