Reputation: 13

Reshaping of data

I am stuck with reshaping data in R and I hope someone could help me out. The data looks like this:

ID	measurement	biomarker_x	biomarker_y
1	1	10	100
1	2	11	110
1	3	12	120
2	1	20	200
2	2	19	190
2	3	21	210

And needs to be reshaped to looking like this:

ID	biomarker	measurement1	measurement2	measurement3
1	x	10	11	12
1	y	100	110	120
2	x	20	19	21
2	y	200	190	210

I tried to work with tidyr::gather and spread and with pivot_wider and pivot_longer but failed. If someone would have a solution for applying this on multiple biomarkers I would be very thankful.

Upvotes: 1

Answers (5)

ThomasIsCoding

Reputation: 102890

A pure base R option using nested ´reshape`

reshape(
    reshape(
        df,
        direction = "long",
        idvar = c("ID", "measurement"),
        varying = -(1:2),
        sep = "_"
    ),
    direction = "wide",
    idvar = c("ID", "time"),
    timevar = "measurement"
)

gives

      ID time biomarker.1 biomarker.2 biomarker.3
1.1.x  1    x          10          11          12
2.1.x  2    x          20          19          21
1.1.y  1    y         100         110         120
2.1.y  2    y         200         190         210

Upvotes: 0

akrun

Reputation: 887961

Using recast from reshape2

library(reshape2)
names(df1)[-(1:2)] <- sub("biomarker_", "", names(df1)[-(1:2)])
reshape2::recast(df1, id.var = c("ID", "measurement"), 
     ID + variable ~ paste0('measurement', measurement), value.var = 'value')

-output

  ID variable measurement1 measurement2 measurement3
1  1        x           10           11           12
2  1        y          100          110          120
3  2        x           20           19           21
4  2        y          200          190          210

data

df1 <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), measurement = c(1L, 
2L, 3L, 1L, 2L, 3L), biomarker_x = c(10L, 11L, 12L, 20L, 19L, 
21L), biomarker_y = c(100L, 110L, 120L, 200L, 190L, 210L)), 
class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 1

AnilGoyal

Reputation: 26238

can be done in tidyr only

library(tidyr)

df <- read.table(header = T, text = 'ID measurement biomarker_x biomarker_y
1   1   10  100
1   2   11  110
1   3   12  120
2   1   20  200
2   2   19  190
2   3   21  210')

df %>% pivot_longer(starts_with('biomarker'), names_to = 'biomarker', names_prefix = 'biomarker_') %>%
  pivot_wider(names_from = measurement, values_from = value, names_prefix = 'measurement_')

#> # A tibble: 4 x 5
#>      ID biomarker measurement_1 measurement_2 measurement_3
#>   <int> <chr>             <int>         <int>         <int>
#> 1     1 x                    10            11            12
#> 2     1 y                   100           110           120
#> 3     2 x                    20            19            21
#> 4     2 y                   200           190           210

^{Created on 2021-07-06 by the reprex package (v2.0.0)}

Upvotes: 3

Zaw

Reputation: 1474

Here is one approach.

library(tidyverse)

dat |> 
  pivot_longer(
    cols = starts_with("bio"),
    names_to = "biomarker"
  ) |> 
  mutate(biomarker = str_remove(biomarker, "biomarker_")) |> 
  pivot_wider(
    names_from = measurement,
    values_from = value,
    names_prefix = "measurement"
  )

# # A tibble: 4 x 5
#      ID biomarker measurement1 measurement2 measurement3
#   <int> <chr>            <int>        <int>        <int>
# 1     1 x                   10           11           12
# 2     1 y                  100          110          120
# 3     2 x                   20           19           21
# 4     2 y                  200          190          210

Upvotes: 1

Karthik S

Reputation: 11548

Does this work:

library(dplyr)
library(tidyr)
library(stringr)


df %>% pivot_longer(-c(ID, measurement), names_to = 'biomarker') %>% mutate(biomarker = str_extract(biomarker, '[xy]$')) %>% 
  pivot_wider(c(ID, biomarker), names_from = measurement, names_prefix = 'measurement', values_from = value)
# A tibble: 4 x 5
     ID biomarker measurement1 measurement2 measurement3
  <int> <chr>            <int>        <int>        <int>
1     1 x                   10           11           12
2     1 y                  100          110          120
3     2 x                   20           19           21
4     2 y                  200          190          210

Upvotes: 1

Reshaping of data

Answers (5)

data

Related Questions