histelheim

Reputation: 5088

Turning a dataframe into a graph in R

Let's assume I have a data frame that looks like this:

 row    id    event    actor     time
 1      1     push     dude      1
 2      1     comment  guy       2   
 3      1     comment  guy       3
 4      2     request  person    1
 5      2     comment  person    2
 6      2     merge    dude      2
 7      3     comment  guy       3
 8      3     comment  dude      4
 9      3     reject   person    5
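
To make the example reproducible, the table above can be rebuilt roughly like this. The object name DF is only an assumption (one answer below calls it DF, the other dat), and the row column is left out because it just repeats the row index:

```r
# Rebuild the example table as a data frame.
# The name DF is an assumption; the "row" column is omitted (it is just the row index).
DF <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  event = c("push", "comment", "comment", "request", "comment",
            "merge", "comment", "comment", "reject"),
  actor = c("dude", "guy", "guy", "person", "person",
            "dude", "guy", "dude", "person"),
  time  = c(1, 2, 3, 1, 2, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)
```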

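Now, assume that I want to turn this into a graph (edge list), using the following rule: a directed edge is created from the actor on row n to the actor on row n+1 if both rows share the same id. For example, rows 1 and 2 share id 1, so there is an edge from dude to guy.

Applying this rule to every pair of consecutive rows, I would end up with an edge list that looks like this: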

from     to        time
dude     guy       1-2
guy      guy       2-3   
person   person    1-2
person   dude      2
guy      dude      3-4
dude     person    4-5

How would I approach this problem in R? I'm not even sure where to start. This would be useful because it would help to construct social networks from event workflow data.

In terms of pseudocode I think it would be something like this:

for each pair of consecutive rows n and n+1
   if row n "id" = row n+1 "id"
     store "actor" from row n in column "from"
     store "actor" from row n+1 in column "to"
     store "time" from row n in column "time"
     unless "time" row n = "time" row n+1 
       append "time" from row n+1 in column "time"
   else
     move to next row
end
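
A minimal base R sketch of that pseudocode, vectorized instead of looping. It assumes a data frame with columns id, actor and time that is already ordered so that rows with the same id are adjacent; the function name edges_from_events is hypothetical:

```r
# Sketch of the pseudocode: pair each row n with row n+1 and keep pairs with equal id.
# Assumes columns id, actor, time, with rows of the same id adjacent and in order.
edges_from_events <- function(df) {  # hypothetical helper name
  cur <- seq_len(max(nrow(df) - 1, 0))  # indices of row n
  nxt <- cur + 1                        # indices of row n+1
  same_id <- df$id[cur] == df$id[nxt]
  cur <- cur[same_id]
  nxt <- nxt[same_id]
  t1 <- df$time[cur]
  t2 <- df$time[nxt]
  data.frame(
    from = as.character(df$actor[cur]),
    to   = as.character(df$actor[nxt]),
    # one value when both times are equal, otherwise "t1-t2"
    time = ifelse(t1 == t2, as.character(t1), paste(t1, t2, sep = "-")),
    stringsAsFactors = FALSE
  )
}

# The id, actor and time columns from the example table
events <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  actor = c("dude", "guy", "guy", "person", "person",
            "dude", "guy", "dude", "person"),
  time  = c(1, 2, 3, 1, 2, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)
edges <- edges_from_events(events)
print(edges)
```

Comparing neighbouring rows with shifted index vectors avoids an explicit loop, but it only works if the rows are sorted as assumed.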

Upvotes: 1

Views: 1546

Answers (2)

Frank

Reputation: 66819

Here's a data.table way:

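# make an edge list (pairs of consecutive actors within each id) with time attributes
# DF is the data frame from the question
library(data.table)
DT <- as.data.table(DF)
gdt <- DT[, {
  nodes <- actor  # not unique(actor): consecutive repeats must stay, they become self-loops
  list(
    n1 = head(nodes, -1),  # actor on row n
    n2 = tail(nodes, -1),  # actor on row n+1
    t1 = head(time, -1),
    t2 = tail(time, -1)
  )
}, by = id]
# build the time label row by row: "1-2" when the times differ, "2" when they match
gdt[, time := paste(unique(c(t1, t2)), collapse = "-"), by = 1:nrow(gdt)]
# drop the helper columns so only n1, n2 and time remain
gdt[, c("id", "t1", "t2") := NULL]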

which gives

       n1     n2 time
1:   dude    guy  1-2
2:    guy    guy  2-3
3: person person  1-2
4: person   dude    2
5:    guy   dude  3-4
6:   dude person  4-5

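And then make a graph. The first two columns are taken as the edge endpoints and the remaining columns become edge attributes:

library(igraph)
# graph.data.frame() is the older name of this function
g <- graph_from_data_frame(gdt)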

Upvotes: 1

Ramnath

Reputation: 55685

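Here is a quick way to do this with plyr. I am not sure how robust it would be.

library(plyr)
# dat is the data frame from the question; each id is processed separately
dat2 <- ddply(dat, .(id), function(d){
  t1 <- d$time[-NROW(d)]  # time on row n
  t2 <- d$time[-1]        # time on row n+1
  data.frame(
    event = d$event[-1],  # event on row n+1
    from = d$actor[-NROW(d)],
    to = d$actor[-1],
    # single value when both times match ("2" rather than "2-2")
    time = ifelse(t1 == t2, as.character(t1), paste(t1, t2, sep = "-"))
  )
})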

Upvotes: 2
