Ant

Reputation: 343

R: networkD3 sankey plot - colours not displaying

I am using the networkD3 package in R to produce sankey plots. I have used the following code to produce a working plot:

sankeyNetwork(Links = df_links, Nodes = df_nodes, Source = "source", 
              Target = "target", Value = "value", NodeID = "name", 
              units = "Cases", fontSize = 12, nodeWidth = 20)

However, all the nodes are coloured blue, whereas I thought the default output of the package was to colour each node separately.

When I try to add a NodeGroup argument, which the documentation says should allow colour to be displayed, all the node labels disappear and all the nodes are coloured black.

sankeyNetwork(Links = df_links, Nodes = df_nodes, Source = "source", 
              Target = "target", Value = "value", NodeID = "name", 
              NodeGroup = "ID", units = "Cases", fontSize = 12, 
              nodeWidth = 20)

If anyone could let me know where I am going wrong that would be much appreciated. I suspect I am misunderstanding the usage of the NodeGroup variable - I am currently assigning each of the seven NodeIDs a unique group c(0,1,2,3,4,5,6,7) with the initial aim of having each node coloured differently. Is this the correct interpretation of the variable?

Upvotes: 2

Views: 1116

Answers (1)

CJ Yetman

Reputation: 8848

The problem you're having is most likely with the data you are using, but I can't tell you precisely what because you haven't shared it. If I use properly structured data as the inputs, your first sankeyNetwork() command, run exactly as written, works as expected (with colors)...

(see further below for discussion of the NodeGroup parameter)

library(networkD3)

URL <- paste0('https://cdn.rawgit.com/christophergandrud/networkD3/',
              'master/JSONdata/energy.json')
energy <- jsonlite::fromJSON(URL)

df_links <- energy$links
df_nodes <- energy$nodes

sankeyNetwork(Links = df_links, Nodes = df_nodes, Source = "source", 
              Target = "target", Value = "value", NodeID = "name", 
              units = "Cases", fontSize = 12, nodeWidth = 20)

[Image: Sankey diagram produced by the code above]

If you check the help file, the NodeGroup parameter is described as "character string specifying the node groups in the Nodes. Used to color the nodes in the network." If you're specifying NodeGroup as c(0,1,2,3,4,5,6,7), that's not a character string. That is likely why all the nodes are black using your second sankeyNetwork() command. For example, see this question about coloring groups with sankeyNetwork.

Additionally, at the top of the help file in the "Usage" section, you can see that the default value for NodeGroup is whatever is passed into NodeID. So if you do not assign anything to NodeGroup, as in your first example, then the NodeID will be used as the group... effectively creating a unique group for each node, which will be assigned a color according to the colourScale parameter.
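
As a rough illustration of passing an explicit group column, here is a minimal sketch. The toy df_nodes and df_links below are hypothetical stand-ins (not your data), and the column name "group" and its "g0", "g1", ... values are assumptions made for the example. The only point is that the column named by NodeGroup holds character values rather than numbers.

```r
library(networkD3)

# Hypothetical toy data, not the asker's: three nodes and two links.
# Source/target are zero-based indices into the rows of df_nodes.
df_nodes <- data.frame(name = c("A", "B", "C"), stringsAsFactors = FALSE)
df_links <- data.frame(source = c(0, 0), target = c(1, 2), value = c(5, 3))

# One group per node, stored as character (and without spaces)
# instead of as a numeric vector.
df_nodes$group <- paste0("g", seq_len(nrow(df_nodes)) - 1)

# NodeGroup takes the *name* of the column in Nodes that holds the groups.
p <- sankeyNetwork(Links = df_links, Nodes = df_nodes, Source = "source",
                   Target = "target", Value = "value", NodeID = "name",
                   NodeGroup = "group", units = "Cases", fontSize = 12,
                   nodeWidth = 20)
p
```

If several nodes should share a colour, give them the same value in that column.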

One thing that is not explicit in the help file is that it appears that only the first word of the group name is used, so in the image above, for example, the "Oil imports", "Oil reserves", and "Oil" nodes are all considered part of the same group and therefore have the same color.
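
If that first-word behaviour is caused by the whitespace in the group names (an assumption on my part, not something the help file states), one possible workaround is to build a separate group column with the spaces replaced. A small sketch with a hypothetical df_nodes and a made-up column name "group":

```r
# Hypothetical node table using three of the names from the energy data.
df_nodes <- data.frame(name = c("Oil imports", "Oil reserves", "Oil"),
                       stringsAsFactors = FALSE)

# Replace spaces with underscores so each group name is a single token.
df_nodes$group <- gsub(" ", "_", df_nodes$name, fixed = TRUE)

print(df_nodes$group)
```

The node labels still come from the name column, so you would pass NodeGroup = "group" to sankeyNetwork() and leave NodeID = "name" unchanged.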

Upvotes: 0
