Reputation: 11480
I want to "partially" make the data.frame "wider" Not really succeeding in doing so.
ID X1 X2 X3 X4
1l 1 5 9 13
1r 2 6 10 14
2r 3 7 11 15
2l 4 8 12 16
ID X1_l X2_l X3_l X4_l X1_r X2_r X3_r X4_r
1 1 5 9 13 2 6 10 14
2 4 8 12 16 3 7 11 15
df <- data.frame(ID=c("1l","1r","2r","2l"),matrix(1:16,ncol=4))
l
and r
are not always the same. As shown in the example. So no cheating possible here. We could do a natural sort to solve that but let's don't do it. Possibly there is a single r
or l
number. So for e.g., there exists 18l
but NOT 18r
.My try is to first split the data up by same numbers: Like so
split(x = df,f = sub("[lr]","",df$ID))
Any fast solutions for a problem like that?
@Arkun
Error Message:
Error: Duplicate identifiers for rows (7561, 7562), (7581, 7582), (7599, 7600), (7619, 7620), (7563, 7564), (7583, 7584), (7601, 7602), (7621, 7622), (7565, 7566), (7585, 7586), (7603, 7604), (7623, 7624), (7567, 7568), (7587, 7588), (7605, 7606), (7625, 7626), (7569, 7570), (7607, 7608), (7571, 7572), (7589, 7590), (7609, 7610), (7573, 7574), (7591, 7592), (7611, 7612), (7575, 7576), (7593, 7594), (7613, 7614), (7577, 7578), (7595, 7596), (7615, 7616), (7579, 7580), (7597, 7598), (7617, 7618), (6577, 6578), (6597, 6598), (6615, 6616), (6635, 6636), (6579, 6580), (6599, 6600), (6617, 6618), (6637, 6638), (6581, 6582), (6601, 6602), (6619, 6620), (6639, 6640), (6583, 6584), (6603, 6604), (6621, 6622), (6641, 6642), (6585, 6586), (6623, 6624), (6587, 6588), (6605, 6606), (6625, 6626), (6589, 6590), (6607, 6608), (6627, 6628), (6591, 6592), (6609, 6610), (6629, 6630), (6593, 6594), (6611, 6612), (6631, 6632), (6595, 6596), (6613, 6614), (6633, 6634), (7643, 7644), (7663, 7664), (7681, 76
In addition: Warning message:
Too many values at 66 locations: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, ...
About the real data:
43 patients (1r,1l, ... 43r,43l)
About 112 Variables. First after ID is called v2_1_prot
and last v2_80_CK
So i replaced the part of x1:x4
with v2_1_prot:v2_80_CK
. Am I doing something wrong here?
I checked my ID variable there are not duplicates so far.
all_pat_rbind2$ID %>% count %>% extract2("freq") %>% max # is equal to one
Upvotes: 1
Views: 89
Reputation: 887213
We can use methods from tidyverse
library(dplyr)
library(tidyr)
df %>%
separate(ID, into = c("ID", "v1"), sep="(?<=\\d)(?![0-9])") %>%
gather(key, val, -one_of("ID", "v1")) %>%
unite(key_v1, key, v1) %>%
spread(key_v1, val)
# ID X1_l X1_r X2_l X2_r X3_l X3_r X4_l X4_r
#1 1 1 2 5 6 9 10 13 14
#2 2 4 3 8 7 12 11 16 15
Upvotes: 3
Reputation: 32548
temp = gsub("\\D+", "", df$ID)
data.frame(ID = unique(temp), do.call(cbind, lapply(X = c("l", "r"),
FUN = function(f) setNames(object = data.frame(sapply(X = names(df)[-1],
FUN = function(cx) sapply(X = unique(temp),
FUN = function(rx) df[df$ID == paste0(rx,f),cx]))),
nm = paste(names(df)[-1], f, sep = "_")))))
# ID X1_l X2_l X3_l X4_l X1_r X2_r X3_r X4_r
#1 1 1 5 9 13 2 6 10 14
#2 2 4 8 12 16 3 7 11 15
Upvotes: 3