Reputation: 204
I have a large data set that I want to represent with a network graph using igraph. I just don't understand how to get the colors right. My data is in this format:
df <- data.frame(name = c("john", "john", "john", "linda", "linda", "daniel"), answer = c("linda", "sam", "anna", "john", "sam", "anna"), location = c("#000000", "#000000", "#343434", "#000000", "#000000", "#343434"), group = c("#00FF00", "#00FF00", "#00FF00", "#FF0000", "#FF0000", "#FF0000"))
+--------+--------+----------+---------+
| name   | answer | location | group   |
+--------+--------+----------+---------+
| john   | linda  | #000000  | #00FF00 |
| john   | sam    | #000000  | #00FF00 |
| john   | anna   | #343434  | #00FF00 |
| linda  | john   | #000000  | #FF0000 |
| linda  | sam    | #000000  | #FF0000 |
| daniel | anna   | #343434  | #FF0000 |
+--------+--------+----------+---------+
This represents the results of an interview. Everyone got the same question, and then had to give the answer to that question in the form of a name (or multiple names). So John answered "linda, sam, and anna", linda answered "john and sam" and so on.
Now I would like to represent these results color coded in a network graph. The color in the column "group" is the color of the vertex of each person (so john is green and linda and daniel are both red). The color in the column "location" is the color of the arrow that goes from the vertex of "name" to the vertex of "answer". For example:
Here the arrows are right, but the colors are wrong. The two arrows between john and linda are supposed to be the same color. The vertex of john is supposed to be green, and the vertices of linda and daniel are supposed to be red. For sam and anna I have not set a color (how would I do that?).
My code so far is:
g <- graph.data.frame(df)
V(g)[df$answer]$color <- df$location
V(g)[df$name]$color <- df$group
plot(g, vertex.color = V(g)[df$name]$color, edge.color = V(g)[df$answer]$color)
Upvotes: 2
Views: 3211
Reputation: 538
Here is a working solution:
# Load the igraph library
library(igraph)
# Create a simple network
df <- data.frame(name = c("john", "john", "john", "linda", "linda", "daniel"),
                 answer = c("linda", "sam", "anna", "john", "sam", "anna"),
                 location = c("#000000", "#000000", "#343434", "#000000", "#000000", "#343434"),
                 group = c("#00FF00", "#00FF00", "#00FF00", "#FF0000", "#FF0000", "#FF0000"),
                 stringsAsFactors = FALSE)
# Build a network graph
graph <- graph.data.frame(df)
# Assign colours to vertices
V(graph)$colour <- sapply(V(graph)$name,
                          function(x, df) {
                            return(df[which(df$name == x)[1], "group"])
                          }, df)
# Assign colours to the edges
E(graph)$colour <- df$location
# Plot the graph
plot(graph, vertex.color=V(graph)$colour, edge.color=E(graph)$colour)
The important things to note above are stringsAsFactors=FALSE and how the colours of the vertices and edges are assigned.
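To illustrate why stringsAsFactors=FALSE matters: before R 4.0.0, data.frame() converted character columns to factors by default, and a factor stores integer codes rather than the colour strings themselves. A minimal base-R sketch (the variable names are only illustrative, not part of the solution above):

```r
# The same colour column, once stored as a factor and once as plain character
as_factor <- data.frame(location = c("#000000", "#343434", "#000000"),
                        stringsAsFactors = TRUE)
as_text   <- data.frame(location = c("#000000", "#343434", "#000000"),
                        stringsAsFactors = FALSE)

# A factor is held as integer codes, so integer coercion loses the hex strings
factor_codes <- as.integer(as_factor$location)

# as.character() recovers the original colour strings from a factor
recovered <- as.character(as_factor$location)

is.factor(as_factor$location)           # TRUE
is.character(as_text$location)          # TRUE
identical(recovered, as_text$location)  # TRUE
```

If the data frame already contains factors, wrapping the column in as.character() before assigning it as a colour should have the same effect as setting stringsAsFactors=FALSE up front.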
Upvotes: 3
Reputation: 809
Maybe I am overcomplicating it, but this code seems to be what you're looking for:
df <- data.frame(name = c("john", "john", "john", "linda", "linda", "daniel"), answer = c("linda", "sam", "anna", "john", "sam", "anna"), location = c("pink", "pink", "red", "pink", "pink", "red"), group = c("yellow", "yellow", "yellow", "blue", "blue", "blue"))
g <- graph.data.frame(df)
#assign to each edge its colour. this works since all the rows in your
#dataframe represent an edge in the resulting graph
E(g)$color <- as.character(df$location)
#then loop through the number of nodes in the graph
for (vrt in seq_along(V(g))) {
  #since the names in the first column are only a part of all the nodes check if it belongs to that sublist
  if (V(g)$name[vrt] %in% df$name) {
    #then find the first occurrence of that name in the list and get its related color
    #assign it to that node
    V(g)$color[vrt] <- as.character(df$group[which(df$name == V(g)$name[vrt])[1]])
  }
  #otherwise the node will be white (e.g. for anna and sam)
  else {
    V(g)$color[vrt] <- "white"
  }
}
#eventually plot it
plot(g, vertex.color = V(g)$color, edge.color = E(g)$color)
EDIT: I did not use your exact colour-coding!
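The per-vertex lookup can also be written without a loop, using match(). This is only a sketch under a few assumptions: it uses graph_from_data_frame(), the newer name for graph.data.frame() in recent igraph versions, it reuses the hex colours from the question, and "white" is an arbitrary fallback for people who appear only as answers.

```r
library(igraph)

df <- data.frame(
  name     = c("john", "john", "john", "linda", "linda", "daniel"),
  answer   = c("linda", "sam", "anna", "john", "sam", "anna"),
  location = c("#000000", "#000000", "#343434", "#000000", "#000000", "#343434"),
  group    = c("#00FF00", "#00FF00", "#00FF00", "#FF0000", "#FF0000", "#FF0000"),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(df, directed = TRUE)

# One edge per row, in row order, so the location column maps directly onto edges
E(g)$color <- df$location

# match() gives the first row where each vertex name occurs in df$name (NA if none)
vertex_cols <- df$group[match(V(g)$name, df$name)]
vertex_cols[is.na(vertex_cols)] <- "white"  # sam and anna never appear in df$name
V(g)$color <- vertex_cols

# plot() picks up the "color" vertex and edge attributes on its own
plot(g)
```

Because the attributes are named color, there is no need to pass vertex.color or edge.color to plot() explicitly.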
Upvotes: 4