Reputation: 5088
Let's assume I have a data frame that looks like this:
row  id  event    actor   time
1    1   push     dude    1
2    1   comment  guy     2
3    1   comment  guy     3
4    2   request  person  1
5    2   comment  person  2
6    2   merge    dude    2
7    3   comment  guy     3
8    3   comment  dude    4
9    3   reject   person  5
Now assume that I want to turn this into a graph (edge list) using the following rule: a directed edge is created from the actor on row n to the actor on row n+1 if the two rows share the same id. E.g.
dude -> guy
(for id 1). But not
guy -> person
(while guy appears on row 3 and person on row 4, they have different ids). Hence, I would end up with a graph that looks like this:
from    to      time
dude    guy     1-2
guy     guy     2-3
person  person  1-2
person  dude    2
guy     dude    3-4
dude    person  4-5
How would I approach this problem in R? I'm not even sure where to start. This would be useful because it would help construct social networks from event workflow data.
In terms of pseudocode I think it would be something like this:
for each rows n and n+1
  if row n "id" = row n+1 "id"
    store "actor" from row n in column "from"
    store "actor" from row n+1 in column "to"
    store "time" from row n in column "time"
    unless "time" row n = "time" row n+1
      append "time" from row n+1 in column "time"
  else
    move to next row
end
Upvotes: 1
Views: 1546
Reputation: 66819
Here's a data.table way:
# make an edge list (pairs of nodes) with attributes
library(data.table)
DT <- data.table(DF)  # DF is the data frame from the question
gdt <- DT[, {
  nodes <- actor # not unique(actor), strangely
  list(
    n1 = head(nodes, -1),
    n2 = tail(nodes, -1),
    t1 = head(time, -1),
    t2 = tail(time, -1)
  )
}, by = id]
# do annoying string processing: "1-2" for distinct times, "2" when both are equal
gdt[, time := paste(unique(c(t1, t2)), collapse = "-"), by = 1:nrow(gdt)][,
  c("id", "t1", "t2") := NULL
]
which gives
       n1     n2 time
1:   dude    guy  1-2
2:    guy    guy  2-3
3: person person  1-2
4: person   dude    2
5:    guy   dude  3-4
6:   dude person  4-5
And then make a graph:
library(igraph)
# the first two columns are used as edge endpoints, the rest as edge attributes
g <- graph_from_data_frame(gdt)
Upvotes: 1
Reputation: 55685
Here is a quick way to do this. I am not sure how robust it would be.
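library(plyr)
# dat is the data frame from the question
dat2 <- ddply(dat, .(id), function(d) {
  n <- NROW(d)
  t1 <- d$time[-n]  # time on row n
  t2 <- d$time[-1]  # time on row n+1
  data.frame(
    event = d$event[-1],
    from = d$actor[-n],
    to = d$actor[-1],
    # a single value when both times match (e.g. "2"), otherwise "t1-t2"
    time = ifelse(t1 == t2, as.character(t1), paste(t1, t2, sep = "-"))
  )
})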
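If you would rather not depend on plyr, the same idea can probably be expressed in base R by comparing each row with the one that follows it. This is only a minimal sketch: it rebuilds the sample data from the question under the name dat, assumes rows with the same id are already adjacent and in order (as in the question), and the names first, second, keep and edges are just illustrative.

```r
# Sample data copied from the question (the name `dat` is an assumption)
dat <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  event = c("push", "comment", "comment", "request", "comment",
            "merge", "comment", "comment", "reject"),
  actor = c("dude", "guy", "guy", "person", "person",
            "dude", "guy", "dude", "person"),
  time  = c(1, 2, 3, 1, 2, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

n <- nrow(dat)
first  <- dat[-n, ]  # rows 1 .. n-1 (the "row n" side of each pair)
second <- dat[-1, ]  # rows 2 .. n   (the "row n+1" side of each pair)

# Keep only consecutive row pairs that share an id
keep <- first$id == second$id

edges <- data.frame(
  from = first$actor[keep],
  to   = second$actor[keep],
  # A single value when both times match, otherwise "t1-t2"
  time = ifelse(first$time[keep] == second$time[keep],
                as.character(first$time[keep]),
                paste(first$time[keep], second$time[keep], sep = "-")),
  stringsAsFactors = FALSE
)

print(edges)
```

On the sample data this should give the six edges listed in the question. If the real data is not sorted by id and time, it would need to be ordered first, since the rule only looks at neighbouring rows.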
Upvotes: 2