sbac
sbac

Reputation: 2081

How to transform a tibble from one column to two columns with repeated observations

I tried to transform df into df2. I have done it through a very patchy way using df3, Is there a simpler and more elegant way of doing it?

library(tidyverse)

# I want to transform df
df <- tibble(id = c(1, 2, 1, 2, 1, 2),
             time = c('t1', 't1', 't2', 't2', 't3', 't3'),
             value = c(2, 3, 6, 4, 5, 7))
df
#> # A tibble: 6 x 3
#>      id time  value
#>   <dbl> <chr> <dbl>
#> 1     1 t1        2
#> 2     2 t1        3
#> 3     1 t2        6
#> 4     2 t2        4
#> 5     1 t3        5
#> 6     2 t3        7

# into df2
df2 <- tibble(id = c(1, 2, 1, 2),
              t = c(2, 3, 6, 4),
              r = c(6, 4, 5, 7))
df2
#> # A tibble: 4 x 3
#>      id     t     r
#>   <dbl> <dbl> <dbl>
#> 1     1     2     6
#> 2     2     3     4
#> 3     1     6     5
#> 4     2     4     7


# This is how I did it, but I think it should be a better way
df3 <- df %>% pivot_wider(names_from = time, values_from = value)

b <- tibble(id = numeric(), t = numeric(), r = numeric())
for (i in 2:3){
  a <- df3[,c(1,i,i+1)]
  colnames(a) <- c('id', 't', 'r')
  b <- bind_rows(a, b)
}
b
#> # A tibble: 4 x 3
#>      id     t     r
#>   <dbl> <dbl> <dbl>
#> 1     1     6     5
#> 2     2     4     7
#> 3     1     2     6
#> 4     2     3     4
Created on 2020-11-25 by the reprex package (v0.3.0)

Upvotes: 1

Views: 137

Answers (2)

akrun
akrun

Reputation: 887148

We can use summarise from dplyr version >= 1.0. Previously, it had the constraint of returning only single observation per group. From version >= 1.0, it is no longer the case. Can return any number of rows i.e. it can be shorter or longer than the original number of rows

library(dplyr)
df %>% 
   group_by(id) %>%
   summarise(t = value[-n()], r = value[-1], .groups = 'drop')

-output

# A tibble: 4 x 3
#     id     t     r
#  <dbl> <dbl> <dbl>
#1     1     2     6
#2     1     6     5
#3     2     3     4
#4     2     4     7

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 388982

For each id you can use lead to select next value and create r column and drop NA rows.

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(t = value, 
         r = lead(value)) %>%
  na.omit() %>%
  select(id, t, r)

#     id     t     r
#  <dbl> <dbl> <dbl>
#1     1     2     6
#2     2     3     4
#3     1     6     5
#4     2     4     7

Upvotes: 1

Related Questions