Reputation: 31
I have data representing the paths people take across a fixed set of points (discrete, e.g., nodes and edges). So far I have been using `igraph`.
I haven't found a good way yet (in `igraph` or another package) to create canonical paths summarizing what significant sub-groups of respondents are doing.
A canonical path can be operationalized in any reasonable way and is just meant to represent a typical path or sub-path for a significant portion of the population.
Does there already exist a function to create these within `igraph` or another package?
Upvotes: 1
Views: 132
Reputation: 43440
One option: represent each person's movement as a directed edge. Create an aggregate graph such that each edge has a weight corresponding to the number of times that edge occurred. Those edges with large weights will be "typical" 1-paths.
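To make the aggregation concrete, here is a minimal sketch in plain Python (standard library only, so no graph package is assumed). The `paths` data and the `edge_weights` helper are made up for illustration; each path is assumed to be the ordered list of nodes one person visited.

```python
from collections import Counter

# Hypothetical data: one ordered list of visited nodes per person.
paths = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c"],
]

def edge_weights(paths):
    """Count how often each directed edge (u, v) is traversed across all paths."""
    weights = Counter()
    for path in paths:
        # zip(path, path[1:]) yields the consecutive (from, to) pairs of one path.
        weights.update(zip(path, path[1:]))
    return weights

weights = edge_weights(paths)

# The heaviest edges are the "typical" 1-paths.
for (u, v), w in weights.most_common(3):
    print(u, "->", v, w)
```

The resulting `(u, v) -> weight` mapping could then be loaded into whatever graph package you prefer as a weighted edge list.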
Of course, it gets more interesting to find common k-paths or explore how paths vary among individuals. The naive approach for 2-paths would be to create N additional nodes that correspond to nodes when visited in the middle of the 2-path. For example, if you have nodes a_1, ..., a_N you would create nodes b_1, ..., b_N. The aggregate network might have an edge (a_3, b_5, 10) and an edge (b_5, a_7, 10); this would represent the two-path (a_3, b_5, a_7) occurring 10 times. The task you're interested in corresponds to finding those two-paths with large weights.
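As a rough alternative to building the duplicated-node graph, the k-paths can also be counted directly with a sliding window over each person's path. This is a swapped-in technique, not the construction described above, and the `kpath_counts` helper and sample data are hypothetical. Nodes are written as plain integers here, so `(3, 5, 7)` plays the role of the two-path from node 3 through node 5 to node 7.

```python
from collections import Counter

def kpath_counts(paths, k):
    """Count every run of k consecutive edges (k + 1 nodes) across all paths."""
    counts = Counter()
    for path in paths:
        # A path with n nodes contains n - k windows of k + 1 nodes;
        # range() is empty when the path is too short.
        for i in range(len(path) - k):
            counts[tuple(path[i:i + k + 1])] += 1
    return counts

# Hypothetical data: node ids visited by three people.
sample_paths = [
    [3, 5, 7],
    [3, 5, 7, 2],
    [1, 5, 7],
]

two_paths = kpath_counts(sample_paths, 2)
for nodes, n in two_paths.most_common():
    print(nodes, n)
```

Counting windows directly keeps each first edge paired with the second edge that actually followed it, which makes the large-weight two-paths easy to read off.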
Both the `igraph` and `network` packages would suffice for this sort of analysis.
If you have some bound on k (e.g., nothing longer than a 6-path occurs in your dataset), I might also suggest enumerating all the paths that are taken and computing a histogram over the unique paths. I don't know of any functions that do this automagically for you.
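The histogram idea is short enough to sketch by hand. The following is only an illustration in plain Python; the `frequent_paths` helper, the `min_share` threshold, and the `observed` data are all assumptions, and "canonical" is taken here to mean "followed by at least a given share of people".

```python
from collections import Counter

def frequent_paths(paths, min_share=0.25):
    """Return the complete paths followed by at least min_share of all people."""
    # Lists are not hashable, so each path is converted to a tuple first.
    histogram = Counter(tuple(p) for p in paths)
    total = sum(histogram.values())
    return {path: n for path, n in histogram.items() if n / total >= min_share}

# Hypothetical data: five people, three of whom took the same route.
observed = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "d"],
    ["b", "c"],
]

common = frequent_paths(observed, min_share=0.5)
print(common)
```

This only groups identical complete paths; near-duplicates (say, one extra detour) would need some path-similarity measure on top.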
Upvotes: 1