I'm looking for help with the join I need in order to build a forceNetwork() graph with networkD3. I can't figure out what's wrong with the code below, which produces the following warning messages.
I used this code before and it worked then, so I'm not sure what's different this time; as far as I can tell, the input file is the same.
Warning messages:
1: Column `src`/`name` joining factors with different levels, coercing to character vector
2: Column `target`/`name` joining factors with different levels, coercing to character vector
# Load package
library(networkD3)
library(dplyr)
# Create data
src <- c(all_artists$from)
target <- c(all_artists$to)
networkData <- data.frame(src, target, stringsAsFactors = TRUE)
networkData
nodes <- data.frame(name = unique(c(src, target)), size = all_artists$related_artist_followers, stringsAsFactors = TRUE)
nodes$id <- 0:(nrow(nodes) - 1)
nodes
width <- c(all_artists$related_artist_followers)
width
# create a data frame of the edges that uses id 0:9 instead of their names
edges <- networkData %>%
  left_join(nodes, by = c("src" = "name")) %>%
  select(-src) %>%
  rename(source = id) %>%
  left_join(nodes, by = c("target" = "name")) %>%
  select(-target) %>%
  rename(target = id)
The dataset shows which artists are related to each other: each row is one link, where from is the source artist and to is the related artist.
from        to          artist_popularity
Jay-Z       Kanye West  80
Jay-Z       P. Diddy    60
Kanye West  Kid Cudi    40
Upvotes: 0
The line where you build the nodes data frame seems unlikely to work as expected, because there's no connection between the length of unique(c(src, target)) and the length of all_artists$related_artist_followers. Instead, you could count the number of times a node name appears in the networkData$src (or all_artists$from) column with...
nodes$size <- sapply(nodes$name, function(name) sum(networkData$src %in% name))
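To see what that count produces, here is a minimal sketch on the three sample rows from the question. Note that it only counts appearances as a source, so artists that appear only as a target get a size of 0. The degree column is an assumed alternative (not part of the original suggestion) that counts appearances in either column.

```r
# Toy edge list mirroring the sample rows from the question
networkData <- data.frame(
  src    = c("Jay-Z", "Jay-Z", "Kanye West"),
  target = c("Kanye West", "P. Diddy", "Kid Cudi"),
  stringsAsFactors = FALSE
)
nodes <- data.frame(
  name = unique(c(networkData$src, networkData$target)),
  stringsAsFactors = FALSE
)

# Count appearances as a source only (the approach described above)
nodes$size <- sapply(nodes$name, function(name) sum(networkData$src %in% name))

# Assumed alternative: count appearances in either column (node degree)
degree <- table(c(networkData$src, networkData$target))
nodes$degree <- as.integer(degree[nodes$name])

print(nodes)
```

Which of the two is the better measure for Nodesize depends on what you want the node size to convey.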
Once you have the nodes data frame created, it's easy to convert the names in the networkData data frame to zero-indexed indices with...
networkData$src <- match(networkData$src, nodes$name) - 1
networkData$target <- match(networkData$target, nodes$name) - 1
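As a small illustration of what match() returns here, the sketch below uses the artist names from the question. The name "Drake" is a hypothetical value added only to show what happens when a name is missing from the lookup vector.

```r
# Node names in the order they would appear in nodes$name
node_names <- c("Jay-Z", "Kanye West", "P. Diddy", "Kid Cudi")
src <- c("Jay-Z", "Jay-Z", "Kanye West")

# match() returns 1-based positions; subtracting 1 gives the
# zero-based indices used for the Source and Target columns
src_id <- match(src, node_names) - 1

# A name that is not in the lookup vector yields NA
# ("Drake" is a hypothetical name, not part of the question's data)
missing_id <- match("Drake", node_names) - 1

# Worth checking before plotting: every link should resolve to a node
stopifnot(!anyNA(src_id))
```

Checking for NA after the conversion is a cheap way to catch links that refer to a name absent from the nodes data frame.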
Note that it is also mandatory to provide a Value parameter for the Links data frame and a Group parameter for the Nodes data frame. Any parameter that does not have a default value in the help file is mandatory; otherwise you might get an error or unexpected behavior (that goes for all R functions, not just networkD3). You can create columns in your data frames for them like this...
networkData$value <- 1
nodes$group <- 1
So all together in a reproducible example, you might have...
# Recreate the sample data from the question
from <- c("Jay-Z", "Jay-Z", "Kanye West")
to <- c("Kanye West", "P. Diddy", "Kid Cudi")
artist_popularity <- c(80, 60, 40)
all_artists <- data.frame(from, to, artist_popularity, stringsAsFactors = FALSE)

# Links data frame: one row per edge, names kept as character strings
networkData <- data.frame(src = all_artists$from, target = all_artists$to,
                          stringsAsFactors = FALSE)

# Nodes data frame: one row per unique artist name
nodes <- data.frame(name = unique(c(networkData$src, networkData$target)),
                    stringsAsFactors = FALSE)
# Node size: number of times the artist appears as a source
nodes$size <- sapply(nodes$name, function(name) sum(networkData$src %in% name))

# Replace names with zero-indexed positions in the nodes data frame
networkData$src <- match(networkData$src, nodes$name) - 1
networkData$target <- match(networkData$target, nodes$name) - 1

# Constant link value and node group
networkData$value <- 1
nodes$group <- 1

library(networkD3)
forceNetwork(Links = networkData, Nodes = nodes, Source = "src",
             Target = "target", Value = "value", NodeID = "name",
             Nodesize = "size", Group = "group", opacityNoHover = 1)
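As for the warnings themselves, a plausible reading of the message text is that the join keys were factors whose level sets differ, which follows from building both data frames with stringsAsFactors = TRUE. The sketch below only illustrates that mismatch in base R using the sample names; it assumes nothing about your real data, and whether dplyr warns at all may depend on the installed version.

```r
# Factor join keys as they would arise with stringsAsFactors = TRUE
src  <- factor(c("Jay-Z", "Jay-Z", "Kanye West"))
name <- factor(c("Jay-Z", "Kanye West", "P. Diddy", "Kid Cudi"))

# The two factors do not share the same set of levels
same_levels <- identical(levels(src), levels(name))

# Converting both keys to character removes the mismatch entirely
src_chr  <- as.character(src)
name_chr <- as.character(name)
ids <- match(src_chr, name_chr) - 1
```

Using stringsAsFactors = FALSE, as in the example above, avoids the factor columns in the first place.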
Upvotes: 1