Al M
Al M

Reputation: 51

Paste two columns with the same name in the same dataframe in R

I am new here. I am struggling with R. I have a dataframe with different columns under the same name and I would like to paste them together using a "-".

Reproducible sample data:

df <- tribble(
  ~Region,  ~Length_House1,   ~Length_House1,   ~Length_House2,   ~Length_House2,   ~Length_House3,   ~Length_House3,
  "Montana", 20,  30,  20,  20,  40,  50,
  "Montana", 52,  64,  60,  60,  70,  76,
  "Montana", 52,  68,  60,  60,  70,  70,
  "Montana", 44,  52,  60,  60,  76,  76,
  "Montana", 44,  76,  60,  60,  70,  76,
  "Idaho",   48,  56,  60,  60,  76,  76,
  "Idaho",   48,  72,  60,  60,  70,  76
)

Desired Output

   Region Length_House1 Length_House2 Length_House3
 Montana         20-30         20-20         40-50
 Montana         52-64         60-60         70-76
 Montana         52-68         60-60         70-70
 Montana         44-52         60-60         76-76
 Montana         44-76         60-60         70-76
   Idaho         48-56         60-60         76-76
   Idaho         48-72         60-60         70-76

Upvotes: 2

Views: 281

Answers (3)

TarJae
TarJae

Reputation: 78917

Here is another solution with tidyr:

  1. Make unique colnames
  2. unite the columns
library(tidyr)
library(dplyr)
uniquecolnames <- c(sprintf("f%02d", seq(1,7)),"label")
colnames(df) <- uniquecolnames

df %>% 
  unite("Length_House1", 2:3, sep="-") %>% 
  unite("Length_House2", 3:4, sep="-") %>% 
  unite("Length_House3", 4:5, sep="-") 

output:

  f01     Length_House1 Length_House2 Length_House3
  <chr>   <chr>         <chr>         <chr>        
1 Montana 20-30         20-20         40-50        
2 Montana 52-64         60-60         70-76        
3 Montana 52-68         60-60         70-70        
4 Montana 44-52         60-60         76-76        
5 Montana 44-76         60-60         70-76        
6 Idaho   48-56         60-60         76-76        
7 Idaho   48-72         60-60         70-76   

Upvotes: 1

Anoushiravan R
Anoushiravan R

Reputation: 21908

Special Thanks to my teachers and friends @Ronak Shah and @AnilGoyal who taught me a very fabulous solution with regard to Using get with glue function.

Here is a tidyverse approach that might be of interest to you. For this solution I first had to change the name of the second of twin columns in order to facilitate the process of data manipulation.

library(dplyr)
library(stringr)
library(purrr)
library(glue)

df %>%
  select(c(2, 4, 6)) %>%
  rename_with(., ~ str_remove(.x, fixed("Length_")), .cols = everything()) %>%
  bind_cols(df %>%
              select(-c(2, 4, 6))) %>% 
  relocate(Region) %>%
  mutate(map_dfc(list(Length_House_1 = 1,
                      Length_House_2 = 2, 
                      Length_House_3 = 3), ~ paste(get(glue("House{.x}")), get(glue("Length_House{.x}")), 
                          sep = "_"))) %>%
  select(-c(2:7))


  Region  Length_House_1 Length_House_2 Length_House_3
  <chr>   <chr>          <chr>          <chr>         
1 Montana 20_30          20_20          40_50         
2 Montana 52_64          60_60          70_76         
3 Montana 52_68          60_60          70_70         
4 Montana 44_52          60_60          76_76         
5 Montana 44_76          60_60          70_76         
6 Idaho   48_56          60_60          76_76         
7 Idaho   48_72          60_60          70_76 

Data:

df <- tribble(
  ~Region,  ~Length_House1,   ~Length_House1,   ~Length_House2,   ~Length_House2,   ~Length_House3,   ~Length_House3,
  "Montana", 20,  30,  20,  20,  40,  50,
  "Montana", 52,  64,  60,  60,  70,  76,
  "Montana", 52,  68,  60,  60,  70,  70,
  "Montana", 44,  52,  60,  60,  76,  76,
  "Montana", 44,  76,  60,  60,  70,  76,
  "Idaho",   48,  56,  60,  60,  76,  76,
  "Idaho",   48,  72,  60,  60,  70,  76
)

Upvotes: 1

Ben
Ben

Reputation: 30474

If you specify which columns to combine explicitly (e.g., 2nd, 4th, and 6th with 3rd, 5th, and 7th), you can try:

as.data.frame(cbind(Region = df$Region, mapply(paste, 
                                               df[, c(2,4,6)], 
                                               df[, c(3,5,7)], sep = '-')))

Here, you can use mapply to paste columns together row by row, separated by -.

If the column names include a common characters in the same pair to be combined (e.g., Length_HouseA1 with Length_HouseA2, Length_HouseB1 with Length_HouseB2 or alternative), then you could try:

data.frame(lapply(split.default(df[-1],
           sub("\\d+$", "", names(df)[-1])), function(x) do.call(paste, c(x, sep="-"))))

Output

   Region Length_House1 Length_House2 Length_House3
1 Montana         20-30         20-20         40-50
2 Montana         52-64         60-60         70-76
3 Montana         52-68         60-60         70-70
4 Montana         44-52         60-60         76-76
5 Montana         44-76         60-60         70-76
6   Idaho         48-56         60-60         76-76
7   Idaho         48-72         60-60         70-76

Upvotes: 2

Related Questions