Reputation: 2256
I need to sort values for some columns per row, but only if some other reference columns for same row have identical values (skipping NA and NULL). The columns are of different types (to sort are characters, reference are lists).
MWE below. Only row 2 should NOT be sorted. DT.is is the data.table to use, and DT.want is what is needed.
library(data.table)
DT.is <- data.table(Col_A=c("V2", "V2", "V2", "V2"),
Col_B=c("V3", "V3", "V1", "V1"),
Col_C=c("V1", "V1", NA_character_, NA_character_),
Ref_A=list(c("RA", "RB"), "RB", "RA", "RA"),
Ref_B=list(c("RA", "RB"), c("RA", "RB"), "RA", "RA"),
Ref_C=list(c("RA", "RB"), "RA", NULL, NA_character_))
DT.want <- data.table(Col_A=c("V1", "V2", "V1", "V1"),
Col_B=c("V2", "V3", "V2", "V2"),
Col_C=c("V3", "V1", NA_character_, NA_character_),
Ref_A=list(c("RA", "RB"), "RB", "RA", "RA"),
Ref_B=list(c("RA", "RB"), c("RA", "RB"), "RA", "RA"),
Ref_C=list(c("RA", "RB"), "RA", NULL, NA_character_))
The original data.table has many rows.
Upvotes: 1
Views: 75
Reputation: 47300
It's been quite a ride :) :
I have no idea how to do it with data.table
so here's a tidyverse
solution :
library(tidyverse)
DT.is %>%
rownames_to_column %>%
gather(key,value,-1) %>%
separate(key,c("type","V")) %>%
spread(type,value) %>%
mutate(Ref2=as.character(Ref),
Col = unlist(Col)) %>%
group_by(rowname,Ref2) %>%
mutate(Col = sort(Col,na.last=TRUE)) %>%
ungroup %>%
select(-Ref2) %>%
gather(type,value,Col,Ref) %>%
unite(key,type,V) %>%
spread(key,value) %>%
as.data.table
# rowname Col_A Col_B Col_C Ref_A Ref_B Ref_C
# 1: 1 V1 V2 V3 RA,RB RA,RB RA,RB
# 2: 2 V2 V3 V1 RB RA,RB RA
# 3: 3 V1 V2 NA RA RA NULL
# 4: 4 V1 V2 NA RA RA NA
I'm first converting to long format because this way i'll be able to group by your Ref
values. Then I sort by initial row id and Ref
value (converted to character temporarily as one can't group by list) and then I climb all the way abck to initial format.
Upvotes: 1