Reputation: 11
I have 54399 cases and 2 channels (HOM and HOS), and I want to run a multichannel sequence analysis. A sample of the data looks like this:
ID | HOM1 | HOM2 | HOM3 | HOM4 | HOS1 | HOS2 | HOS3 | HOS4 |
---|---|---|---|---|---|---|---|---|
1 | A | A | B | C | NO | YES | NO | NO |
2 | A | B | A | A | YES | UNCERTAIN | YES | YES |
I used the following code:
HOM.seq<-seqdef(df[, 2:5])
HOS.seq<-seqdef(df[, 6:9])
channels<-list(HOM.seq, HOS.seq)
MDdist<-seqMD(channels, method="OM", sm=list("TRATE", "TRATE"), what="diss")
However, I get a warning that "52322 unique sequences exceeds max allowed of 46340".
My question is: how can I use wcAggregateCases to reduce the number of unique sequences, given that 52322 already seems to be the aggregated count from the 54399 sequences? Or can I apply wcAggregateCases to HOM and HOS separately before putting them in the channel list? Thanks.
I have already run wcAggregateCases separately on HOM and HOS; it leaves around 10000 unique cases for HOM and 7000 for HOS.
Upvotes: 1
Views: 31
Reputation: 1732
You can compute the weights and the unique sequences from the combined sequence object. The combined sequence joins, at each time position, the states from the different channels. Here is an example of how to do so:
library(TraMineR)
data(biofam)
## Building one channel per type of event: left home, married, and child
bf <- as.matrix(biofam[, 10:25])
left <- bf==1 | bf==3 | bf==5 | bf==6
married <- bf == 2 | bf== 3 | bf==6
children <- bf==4 | bf==5 | bf==6
## Building sequence objects
left.seq <- seqdef(left)
marr.seq <- seqdef(married)
child.seq <- seqdef(children)
channels <- list(LeftHome=left.seq, Marr=marr.seq, Child=child.seq)
## Retrieving the MD sequences or combined sequence
MDseq <- seqMD(channels)
## Now you have one sequence made by combining the different channels.
alphabet(MDseq)
## Use wcAggregateCases() on the combined sequence
library(WeightedCluster)
ac <- wcAggregateCases(MDseq)
print(ac)
## Retrieving unique cases in the original data set
uniqueChannels <- list(LeftHome=left.seq[ac$aggIndex, ], Marr=marr.seq[ac$aggIndex, ], Child=child.seq[ac$aggIndex, ])
## Distance on unique data
MDdist <- seqMD(uniqueChannels, method="OM", sm=list("TRATE", "TRATE", "TRATE"), what="diss")
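The same steps can be applied to the two-channel layout from the question. The sketch below is only an illustration: the six-row `df` is made-up toy data in the question's column layout, and the clustering step with `wcKMedoids()` (k = 2) is an assumed follow-up that neither the question nor the answer specifies. It shows how the aggregation weights (`ac$aggWeights`) might be passed to a weighted clustering and how `ac$disaggIndex` maps the result back to the original cases.

```r
library(TraMineR)
library(WeightedCluster)

## Hypothetical toy data in the question's layout
## (rows 1/3 and rows 2/5 are identical on both channels)
df <- data.frame(
  ID   = 1:6,
  HOM1 = c("A", "A", "A", "A", "A", "B"),
  HOM2 = c("A", "B", "A", "A", "B", "B"),
  HOM3 = c("B", "A", "B", "B", "A", "C"),
  HOM4 = c("C", "A", "C", "C", "A", "C"),
  HOS1 = c("NO", "YES", "NO", "NO", "YES", "YES"),
  HOS2 = c("YES", "UNCERTAIN", "YES", "NO", "UNCERTAIN", "YES"),
  HOS3 = c("NO", "YES", "NO", "NO", "YES", "NO"),
  HOS4 = c("NO", "YES", "NO", "NO", "YES", "NO"),
  stringsAsFactors = FALSE
)

## One sequence object per channel
HOM.seq <- seqdef(df[, 2:5])
HOS.seq <- seqdef(df[, 6:9])
channels <- list(HOM = HOM.seq, HOS = HOS.seq)

## Combined (multidomain) sequences, then aggregation of identical cases
MDseq <- seqMD(channels)
ac <- wcAggregateCases(MDseq)

## Keep one representative per unique combined sequence in each channel
uniqueChannels <- list(HOM = HOM.seq[ac$aggIndex, ],
                       HOS = HOS.seq[ac$aggIndex, ])

## Distances on the unique cases only (one cost specification per channel)
MDdist <- seqMD(uniqueChannels, method = "OM",
                sm = list("TRATE", "TRATE"), what = "diss")

## Assumed follow-up: weighted PAM clustering on the unique cases,
## with each unique case weighted by the number of cases it represents
clust <- wcKMedoids(MDdist, k = 2, weights = ac$aggWeights)

## Map the cluster membership back to every original case
df$cluster <- clust$clustering[ac$disaggIndex]
```

Because the distance matrix is computed on the unique combined sequences only, its size is the number of unique cases rather than the number of original cases, and the weights keep the clustering equivalent to one run on the full data.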
Upvotes: 0