Reputation: 69
I want to transform a data.frame
in R from looking like this:
W Well Name Ct (dR)
A1 F_0.2,R_0.2 23.72
A2 F_0.2,R_0.2 23.82
A3 F_0.2,R_0.4 22.87
A4 F_0.2,R_0.4 23.1
B1 F_0.4,R_0.2 23.74
B2 F_0.4,R_0.2 24.01
B3 F_0.4,R_0.4 22.86
B4 F_0.4,R_0.4 22.79
C1 F_0.6,R_0.2 23.5
C2 F_0.6,R_0.2 No Ct
C3 F_0.6,R_0.4 23.03
C4 F_0.6,R_0.4 22.61
D1 F_0.8,R_0.2 23.83
D2 F_0.8,R_0.2 23.67
D3 F_0.8,R_0.4 22.48
D4 F_0.8,R_0.4 22.41
into this:
1 2 3 4
A 23.72 23.82 22.87 23.1
B 23.74 24.01 22.86 22.79
C 23.5 No Ct 23.03 22.61
D 23.83 23.67 22.48 22.41
Even better, like this:
R_0.2 R_0.4 R_0.6 R_0.8
F_0.2 23.72 23.82 22.87 23.1
F_0.4 23.74 24.01 22.86 22.79
F_0.6 23.5 No Ct 23.03 22.61
F_0.8 23.83 23.67 22.48 22.41
There are actually 12 columns and 8 rows, but you get the point :) Also - what is this kind of data sorting called?
Upvotes: 1
Views: 51
Reputation: 887811
We separate
the 'W' column into two, then drop the 'Well Name' column with select
and spread
to 'wide' format
library(tidyverse)
df1 %>%
separate(W, into = c("W1", "W2"), sep="(?=\\d+)") %>%
select(-`Well Name`) %>%
spread(W2, `Ct (dR)`)
# W1 1 2 3 4
#1 A 23.72 23.82 22.87 23.1
#2 B 23.74 24.01 22.86 22.79
#3 C 23.5 No Ct 23.03 22.61
#4 D 23.83 23.67 22.48 22.41
It seems that the 'Well Name' column values in the input dataset is not matching with the expected output. Assuming that the values are as in the data below
df1 %>%
separate(`Well Name`, into = c("WN1", "WN2"), sep=",") %>%
select(-W) %>%
spread(WN2, `Ct (dR)`)
# WN1 R_0.2 R_0.4 R_0.6 R_0.8
#1 F_0.2 23.72 23.82 22.87 23.1
#2 F_0.4 23.74 24.01 22.86 22.79
#3 F_0.6 23.5 No Ct 23.03 22.61
#4 F_0.8 23.83 23.67 22.48 22.41
df1 <- structure(list(W = c("A1", "A2", "A3", "A4", "B1", "B2", "B3",
"B4", "C1", "C2", "C3", "C4", "D1", "D2", "D3", "D4"), `Well Name` = c("F_0.2,R_0.2",
"F_0.2,R_0.4", "F_0.2,R_0.6", "F_0.2,R_0.8", "F_0.4,R_0.2", "F_0.4,R_0.4",
"F_0.4,R_0.6", "F_0.4,R_0.8", "F_0.6,R_0.2", "F_0.6,R_0.4", "F_0.6,R_0.6",
"F_0.6,R_0.8", "F_0.8,R_0.2", "F_0.8,R_0.4", "F_0.8,R_0.6", "F_0.8,R_0.8"
), `Ct (dR)` = c("23.72", "23.82", "22.87", "23.1", "23.74",
"24.01", "22.86", "22.79", "23.5", "No Ct", "23.03", "22.61",
"23.83", "23.67", "22.48", "22.41")), .Names = c("W", "Well Name",
"Ct (dR)"), row.names = c(NA, -16L), class = "data.frame")
Upvotes: 1