Reputation: 69
Given a dataframe:
df <- data.frame(Col1 = LETTERS[1:4], Col2 = LETTERS[23:26], Col3 = c(1:4), col4 = c(100:103))
I want to combine column with their column names. I know I can use unite
from tidyr and get the following output.
df %>% unite(NewCol, c(Col1, Col4), remove = F)
Col1 Col2 Col3 Col4 NewCol
1 A W 1 100 A_100
2 B X 2 101 B_101
3 C Y 3 102 C_102
4 D Z 4 103 D_103
But I want to have the column name next to the value of the column as follows (the separator _
is really not that important):
Col1 Col2 Col3 Col4 NewCol
1 A W 1 100 Col1_A_Col4_100
2 B X 2 101 Col1_B_Col4_101
3 C Y 3 102 Col1_C_Col4_102
4 D Z 4 103 Col1_D_Col4_103
I tried the solution posted here which does give the desired output but it creates a separate output.
imap_dfr(df %>% select(Col1, Col4), ~ paste(.y, .x, sep = "_")) %>%
unite(NewCol, sep = "_")
NewCol
<chr>
1 Col1_A_Col4_100
2 Col1_B_Col4_101
3 Col1_C_Col4_102
4 Col1_D_Col4_103
Would I simply use bind_cols()
to combine both? How do I know the sequence of the rows is preserved between the two? Is there another way that I can create NewCol
within the same dataframe similar to unite in the first case?
Upvotes: 2
Views: 516
Reputation: 26695
One option is to create temporary 'colname + value' columns, then unite them in a second step, e.g.
## Load libraries
library(tidyverse)
## Load example data
df <- data.frame(Col1 = LETTERS[1:4], Col2 = LETTERS[23:26], Col3 = c(1:4), Col4 = c(100:103))
## Expected outcome
df %>% bind_cols(imap_dfr(df %>% select(Col1, Col4),
~ paste(.y, .x, sep = "_")) %>%
unite(newcol, sep = "_"))
#> Col1 Col2 Col3 Col4 newcol
#> 1 A W 1 100 Col1_A_Col4_100
#> 2 B X 2 101 Col1_B_Col4_101
#> 3 C Y 3 102 Col1_C_Col4_102
#> 4 D Z 4 103 Col1_D_Col4_103
## With a small number of columns
df %>%
mutate(tmp_Col1 = paste0("Col1", "_", Col1),
tmp_Col4 = paste0("Col4", "_", Col4)) %>%
unite(newcol, c(tmp_Col1, tmp_Col4), sep = "_")
#> Col1 Col2 Col3 Col4 newcol
#> 1 A W 1 100 Col1_A_Col4_100
#> 2 B X 2 101 Col1_B_Col4_101
#> 3 C Y 3 102 Col1_C_Col4_102
#> 4 D Z 4 103 Col1_D_Col4_103
## With a large number of columns
df %>%
mutate(across(c(Col1, Col4),
~paste0(cur_column(), "_", .x))) %>%
unite(newcol, c(Col1, Col4), sep = "_") %>%
left_join(df)
#> Joining with `by = join_by(Col2, Col3)`
#> newcol Col2 Col3 Col1 Col4
#> 1 Col1_A_Col4_100 W 1 A 100
#> 2 Col1_B_Col4_101 X 2 B 101
#> 3 Col1_C_Col4_102 Y 3 C 102
#> 4 Col1_D_Col4_103 Z 4 D 103
Created on 2023-06-22 with reprex v2.0.2
If you have a large number of columns you want to transform, using across()
allows you to employ tidyselect functions, such as starts_with()
, to select columns of interest without having to specify each column by name.
Upvotes: 2