Reputation: 51
I am new here. I am struggling with R. I have a dataframe with different columns under the same name and I would like to paste them together using a "-".
Reproducible sample data:
df <- tribble(
~Region, ~Length_House1, ~Length_House1, ~Length_House2, ~Length_House2, ~Length_House3, ~Length_House3,
"Montana", 20, 30, 20, 20, 40, 50,
"Montana", 52, 64, 60, 60, 70, 76,
"Montana", 52, 68, 60, 60, 70, 70,
"Montana", 44, 52, 60, 60, 76, 76,
"Montana", 44, 76, 60, 60, 70, 76,
"Idaho", 48, 56, 60, 60, 76, 76,
"Idaho", 48, 72, 60, 60, 70, 76
)
Desired Output
Region Length_House1 Length_House2 Length_House3
Montana 20-30 20-20 40-50
Montana 52-64 60-60 70-76
Montana 52-68 60-60 70-70
Montana 44-52 60-60 76-76
Montana 44-76 60-60 70-76
Idaho 48-56 60-60 76-76
Idaho 48-72 60-60 70-76
Upvotes: 2
Views: 281
Reputation: 78917
Here is another solution with tidyr
:
library(tidyr)
library(dplyr)
uniquecolnames <- c(sprintf("f%02d", seq(1,7)),"label")
colnames(df) <- uniquecolnames
df %>%
unite("Length_House1", 2:3, sep="-") %>%
unite("Length_House2", 3:4, sep="-") %>%
unite("Length_House3", 4:5, sep="-")
output:
f01 Length_House1 Length_House2 Length_House3
<chr> <chr> <chr> <chr>
1 Montana 20-30 20-20 40-50
2 Montana 52-64 60-60 70-76
3 Montana 52-68 60-60 70-70
4 Montana 44-52 60-60 76-76
5 Montana 44-76 60-60 70-76
6 Idaho 48-56 60-60 76-76
7 Idaho 48-72 60-60 70-76
Upvotes: 1
Reputation: 21908
Special Thanks to my teachers and friends @Ronak Shah and @AnilGoyal who taught me a very fabulous solution with regard to Using get
with glue
function.
Here is a tidyverse approach that might be of interest to you. For this solution I first had to change the name of the second of twin columns in order to facilitate the process of data manipulation.
library(dplyr)
library(stringr)
library(purrr)
library(glue)
df %>%
select(c(2, 4, 6)) %>%
rename_with(., ~ str_remove(.x, fixed("Length_")), .cols = everything()) %>%
bind_cols(df %>%
select(-c(2, 4, 6))) %>%
relocate(Region) %>%
mutate(map_dfc(list(Length_House_1 = 1,
Length_House_2 = 2,
Length_House_3 = 3), ~ paste(get(glue("House{.x}")), get(glue("Length_House{.x}")),
sep = "_"))) %>%
select(-c(2:7))
Region Length_House_1 Length_House_2 Length_House_3
<chr> <chr> <chr> <chr>
1 Montana 20_30 20_20 40_50
2 Montana 52_64 60_60 70_76
3 Montana 52_68 60_60 70_70
4 Montana 44_52 60_60 76_76
5 Montana 44_76 60_60 70_76
6 Idaho 48_56 60_60 76_76
7 Idaho 48_72 60_60 70_76
Data:
df <- tribble(
~Region, ~Length_House1, ~Length_House1, ~Length_House2, ~Length_House2, ~Length_House3, ~Length_House3,
"Montana", 20, 30, 20, 20, 40, 50,
"Montana", 52, 64, 60, 60, 70, 76,
"Montana", 52, 68, 60, 60, 70, 70,
"Montana", 44, 52, 60, 60, 76, 76,
"Montana", 44, 76, 60, 60, 70, 76,
"Idaho", 48, 56, 60, 60, 76, 76,
"Idaho", 48, 72, 60, 60, 70, 76
)
Upvotes: 1
Reputation: 30474
If you specify which columns to combine explicitly (e.g., 2nd, 4th, and 6th with 3rd, 5th, and 7th), you can try:
as.data.frame(cbind(Region = df$Region, mapply(paste,
df[, c(2,4,6)],
df[, c(3,5,7)], sep = '-')))
Here, you can use mapply
to paste
columns together row by row, separated by -
.
If the column names include a common characters in the same pair to be combined (e.g., Length_HouseA1 with Length_HouseA2, Length_HouseB1 with Length_HouseB2 or alternative), then you could try:
data.frame(lapply(split.default(df[-1],
sub("\\d+$", "", names(df)[-1])), function(x) do.call(paste, c(x, sep="-"))))
Output
Region Length_House1 Length_House2 Length_House3
1 Montana 20-30 20-20 40-50
2 Montana 52-64 60-60 70-76
3 Montana 52-68 60-60 70-70
4 Montana 44-52 60-60 76-76
5 Montana 44-76 60-60 70-76
6 Idaho 48-56 60-60 76-76
7 Idaho 48-72 60-60 70-76
Upvotes: 2