ds_worthington
ds_worthington

Reputation: 121

tidyr::pivot_longer to multiple columns

I need to convert a dataframe to longer form. for example, my dataframe will be like

df <- data.frame(
  group = c("group1","group2"),
  x1 = c(3,4),
  x2 = c(5,6),
  y1 = c(7,8),
  y2 = c(9,10)
)

I need to convert it to

df2 <- data.frame(
  group = c("group1","group1","group2","group2"),
  num = c(1,2,1,2),
  x = c(3,5,4,6),
  y = c(7,9,8,10)
) 

I want to use tidyr::pivot_longer but couldn't figure out names_pattern here. Am I on the right track?

df_2<-df%>%tidyr::pivot_longer(c("x1","x2","y1","y2"),
                               names_to = c("x","y"),
                               names_pattern = "")

Any help? Thanks.

Upvotes: 3

Views: 1583

Answers (4)

ThomasIsCoding
ThomasIsCoding

Reputation: 102625

Here is a base R option using reshape

reshape(
  setNames(df, gsub("(\\d+)$", ".\\1", names(df))),
  direction = "long",
  idvar = "group",
  varying = -1,
  timevar = "n"
)

which gives

          group n x  y
group1.1 group1 1 3  7
group2.1 group2 1 4  8
group1.2 group1 2 5  9
group2.2 group2 2 6 10

Upvotes: 0

Sravan
Sravan

Reputation: 11

This is also a way to solve this problem.using gather ,separate and spread functions

      df%>%
       gather(num,"x1","x2","y1","y2",-(group),value = 
       values)%>%
       separate(num,into=c("var",'val'),sep=1)%>%
       spread(key='var',value=values)

Upvotes: 0

Ronak Shah
Ronak Shah

Reputation: 389235

You can pass regex in names_pattern -

tidyr::pivot_longer(df,cols = -group, 
                    names_to = c('.value', 'num'), 
                    names_pattern = '([a-zA-Z]+)(\\d+)')

#  group  num       x     y
#  <chr>  <chr> <dbl> <dbl>
#1 group1 1         3     7
#2 group1 2         5     9
#3 group2 1         4     8
#4 group2 2         6    10

.value has a special meaning in pivot_longer which suggests that part of the column name from original dataframe would be output dataframe's column name. The part is decided using names_pattern argument which provides a regex pattern in the form of capture group.

The first capture group ([a-zA-Z]+) captures all the alphabets from the column names i.e x from x1, x from x2, y from y1 and so on. As I have used + in the regex it will also work if you have column names which is more than 1 character (eg - col1, col2 etc.). The second capture group is used to capture the numbers from the column name i.e 1 from x1, 2 from x2 and so on.

Upvotes: 5

akrun
akrun

Reputation: 887831

We can use names_sep with a regex lookaround

library(tidyr)
pivot_longer(df, cols = -group, names_to = c(".value", "num"),
     names_sep = "(?<=[a-z])(?=\\d)")
# A tibble: 4 x 4
  group  num       x     y
  <chr>  <chr> <dbl> <dbl>
1 group1 1         3     7
2 group1 2         5     9
3 group2 1         4     8
4 group2 2         6    10

Upvotes: 1

Related Questions