Reputation: 121
I need to convert a dataframe to longer form. for example, my dataframe will be like
df <- data.frame(
group = c("group1","group2"),
x1 = c(3,4),
x2 = c(5,6),
y1 = c(7,8),
y2 = c(9,10)
)
I need to convert it to
df2 <- data.frame(
group = c("group1","group1","group2","group2"),
num = c(1,2,1,2),
x = c(3,5,4,6),
y = c(7,9,8,10)
)
I want to use tidyr::pivot_longer but couldn't figure out names_pattern here. Am I on the right track?
df_2<-df%>%tidyr::pivot_longer(c("x1","x2","y1","y2"),
names_to = c("x","y"),
names_pattern = "")
Any help? Thanks.
Upvotes: 3
Views: 1583
Reputation: 102625
Here is a base R option using reshape
reshape(
setNames(df, gsub("(\\d+)$", ".\\1", names(df))),
direction = "long",
idvar = "group",
varying = -1,
timevar = "n"
)
which gives
group n x y
group1.1 group1 1 3 7
group2.1 group2 1 4 8
group1.2 group1 2 5 9
group2.2 group2 2 6 10
Upvotes: 0
Reputation: 11
This is also a way to solve this problem.using gather ,separate and spread functions
df%>%
gather(num,"x1","x2","y1","y2",-(group),value =
values)%>%
separate(num,into=c("var",'val'),sep=1)%>%
spread(key='var',value=values)
Upvotes: 0
Reputation: 389235
You can pass regex in names_pattern
-
tidyr::pivot_longer(df,cols = -group,
names_to = c('.value', 'num'),
names_pattern = '([a-zA-Z]+)(\\d+)')
# group num x y
# <chr> <chr> <dbl> <dbl>
#1 group1 1 3 7
#2 group1 2 5 9
#3 group2 1 4 8
#4 group2 2 6 10
.value
has a special meaning in pivot_longer
which suggests that part of the column name from original dataframe would be output dataframe's column name. The part is decided using names_pattern
argument which provides a regex pattern in the form of capture group.
The first capture group ([a-zA-Z]+)
captures all the alphabets from the column names i.e x
from x1
, x
from x2
, y
from y1
and so on. As I have used +
in the regex it will also work if you have column names which is more than 1 character (eg - col1
, col2
etc.). The second capture group is used to capture the numbers from the column name i.e 1
from x1
, 2
from x2
and so on.
Upvotes: 5
Reputation: 887831
We can use names_sep
with a regex lookaround
library(tidyr)
pivot_longer(df, cols = -group, names_to = c(".value", "num"),
names_sep = "(?<=[a-z])(?=\\d)")
# A tibble: 4 x 4
group num x y
<chr> <chr> <dbl> <dbl>
1 group1 1 3 7
2 group1 2 5 9
3 group2 1 4 8
4 group2 2 6 10
Upvotes: 1