Reputation: 45
The data (Excel) that I have looks like this:
I have 2600 movies in the first column, and the names of directors and staff in the other columns. Some names appear several times.
I'm having trouble converting this Excel data to a format that lets me perform two-mode network analysis (events: movies; ties from those movies) in R. Is there any documentation or code that would help me convert this data to a proper format?
Upvotes: 0
Views: 257
Reputation: 926
You can do this using igraph (which calls this type of network bipartite).
Assume you have loaded your Excel data into a data frame called dt.
dt
Movie director codirector staff1
1 StarWars JJAbrams <NA> Anne
2 Abarter JamesCameron <NA> <NA>
3 Loiter Kenn Klark Kage
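To try the steps without the original spreadsheet, the sample dt can be rebuilt by hand. This is only a sketch that mirrors the three rows printed above; the column names are taken from that printout, and the real data would more likely be loaded from the Excel file (for example with readxl::read_excel(), which is an assumption and not part of the original answer).

```r
# Hypothetical reconstruction of the sample data frame printed above.
dt <- data.frame(
  Movie      = c("StarWars", "Abarter", "Loiter"),
  director   = c("JJAbrams", "JamesCameron", "Kenn"),
  codirector = c(NA, NA, "Klark"),
  staff1     = c("Anne", NA, "Kage"),
  stringsAsFactors = FALSE  # keep names as character vectors, not factors
)

print(dt)
```

Keeping the names as character (rather than factor) avoids warnings about mismatched factor levels when the columns are later stacked into one.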
Then you can create a bipartite graph, g, as follows:
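library(reshape2)
library(igraph)

# Reshape to long format: one (Movie, person) row per cell; drop the role column
edgelist <- melt(dt, id.vars = 'Movie')[, -2]
# Drop rows where the person is missing
edgelist <- edgelist[complete.cases(edgelist), ]

# graph_from_data_frame() is the current name for graph.data.frame()
g <- graph_from_data_frame(edgelist)
# Mark movie vertices TRUE and person vertices FALSE
V(g)$type <- V(g)$name %in% edgelist[, 1]
g
plot(g)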
IGRAPH DN-B 9 6 --
+ attr: name (v/c), type (v/l)
+ edges (vertex names):
[1] StarWars->JJAbrams Abarter ->JamesCameron Loiter ->Kenn Loiter ->Klark
[5] StarWars->Anne Loiter ->Kage
In igraph, a bipartite graph is a regular graph with each vertex having a type attribute set to TRUE/FALSE. It doesn't matter which type of vertex is which; in this case, movies are set to TRUE.
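Once the type attribute is set, the two-mode graph can be collapsed into one-mode networks, which is often the next step in two-mode analysis. The following is a minimal sketch, not part of the original answer: it types the sample edge list by hand (so it does not need reshape2), builds the graph as undirected (an assumption; the graph above is directed), and uses igraph's bipartite_projection(). The names people_net and movie_net are made up for illustration.

```r
library(igraph)

# Edge list equivalent to the sample one (movie, person), typed by hand.
edgelist <- data.frame(
  Movie  = c("StarWars", "Abarter", "Loiter", "Loiter", "StarWars", "Loiter"),
  person = c("JJAbrams", "JamesCameron", "Kenn", "Klark", "Anne", "Kage"),
  stringsAsFactors = FALSE
)

# Undirected here (assumption); movie-person ties have no natural direction.
g <- graph_from_data_frame(edgelist, directed = FALSE)
V(g)$type <- V(g)$name %in% edgelist$Movie  # TRUE = movie, FALSE = person

# proj1 holds the FALSE vertices (people), proj2 the TRUE vertices (movies).
proj <- bipartite_projection(g)
people_net <- proj$proj1  # people tied by having worked on the same movie
movie_net  <- proj$proj2  # movies tied by sharing at least one person
```

In the sample data no person appears in more than one movie, so movie_net has no edges; with the full 2600-movie data set, repeated names would create ties between movies.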
Upvotes: 1