Reputation: 63
I am trying to use the D3 Bubble Chart in R to make my own bubble chart with grouped bubble colours.
I have uploaded the index.html and flare.json files from the D3 example into R, and they produce the bubble chart when run. However, I don't want to edit this JSON by hand to create my own bubbles and groups (the excerpt below shows three bubble groups with their group names).
{
"name": "flare",
"children": [
{
"name": "analytics",
"children": [
{
"name": "cluster",
"children": [
{"name": "AgglomerativeCluster", "size": 3938},
{"name": "CommunityStructure", "size": 3812},
{"name": "HierarchicalCluster", "size": 6714},
{"name": "MergeEdge", "size": 743}
]
},
{
"name": "graph",
"children": [
{"name": "BetweennessCentrality", "size": 3534},
{"name": "LinkDistance", "size": 5731},
{"name": "MaxFlowMinCut", "size": 7840},
{"name": "ShortestPaths", "size": 5914},
{"name": "SpanningTree", "size": 3416}
]
},
{
"name": "optimization",
"children": [
{"name": "AspectRatioBanker", "size": 7074}
]
}
]
Using the jsonlite package (which, from reading online, can handle more complex JSON structures), I have tried to convert it to a data frame.
library(jsonlite)
fromJSON("flare.json",simplifyDataFrame = FALSE)
This is without the dataframe structure requested (example).
$children[[10]]$children[[6]]$children[[10]]
$children[[10]]$children[[6]]$children[[10]]$name
[1] "OperatorSwitch"
$children[[10]]$children[[6]]$children[[10]]$size
[1] 2581
This is with the dataframe structure requested (example).
fromJSON("flare.json",simplifyDataFrame = TRUE)
However it produces a long concatenated list of data which I have been trying to untangle to automate with my data.
Arrays, Colors, Dates, Displays, Filter, Geometry, heap, IEvaluable, IPredicate, IValueProxy, math, Maths, Orientation, palette, Property, Shapes, Sort, Stats, Strings, 8258, 10001, 8217, 12555, 2324, 10993, NA, 335, 383, 874, NA, 17705, 1486, NA, 5559, 19118, 6887, 6557, 22026, FibonacciHeap, HeapNode, 9354, 1233, DenseMatrix, IMatrix, SparseMatrix, 3165, 2815, 3366, ColorPalette, Palette, ShapePalette, SizePalette, 6367, 1229, 2059, 2291
Suggested solutions ...
FOR LOOPS (Time-restricted)
I have thought about writing multiple for loops to reconstruct the nested JSON structure. Loops are where I am stronger, but I have a deadline and this may take a while, so I thought that someone who is more JSON-savvy might be able to help.
CSV CONVERTED FORMAT (doesn't work)
I also attempted to convert the flare.json file with a JSON-to-CSV converter, to test whether I could update the content from the CSV directly in R, but that didn't work (even after adding the flare.json header content that jsonlite's toJSON doesn't generate automatically).
Is there a way to convert flare.json from JSON into a data frame or table, so that I can load my own data (names, sizes and groups) and convert it back to JSON to produce my own bubble chart?
If possible, it would be great to achieve all of this in R, which I don't think is impossible, but I am happy to hear other suggestions.
I am quite stumped as to what to do next. I normally deal with matrices in R, so dealing with JSON lists and arrays is not my strong point.
Upvotes: 3
Views: 2066
Reputation: 6579
This might provide us something else to think about. I'll put comments inline in the code. You can see a live example.
library(jsonlite)
library(dplyr)
flare_json <- rjson::fromJSON( ## rjson just works better on these for me
file = "http://bl.ocks.org/mbostock/raw/4063269/flare.json"
)
# let's have a look at the structure of flare.json
# listviewer htmlwidget might help us see what is happening
# devtools::install_github("timelyportfolio/listviewer")
# library(listviewer)
listviewer::jsonedit( # namespace-qualified because library(listviewer) is commented out above
paste0(
readLines("http://bl.ocks.org/mbostock/raw/4063269/flare.json")
,collapse=""
)
)
# the interesting thing about Mike Bostock's Bubble Chart example
# though is that the example removes the nested hierarchy
# with a JavaScript function called classes
#// Returns a flattened hierarchy containing all leaf nodes under the root.
#function classes(root) {
# var classes = [];
#
# function recurse(name, node) {
# if (node.children) node.children.forEach(function(child) { recurse(node.name, child); });
# else classes.push({packageName: name, className: node.name, value: node.size});
# }
#
# recurse(null, root);
# return {children: classes};
#}
# let's try to recreate this in R
classes <- function(root){
classes <- data.frame()
haschild <- function(node){
(!is.null(node) && "children" %in% names(node))
}
recurse <- function(name,node){
if(haschild(node)){
lapply(
seq_along(node$children) # safe even if children is empty, unlike 1:length()
,function(n){
recurse(node$name,node$children[[n]])
}
)
} else {
classes <<- bind_rows(
classes,
data.frame(
"packageName"= name
,"className" = node[["name"]]
,"size" = node[["size"]]
,stringsAsFactors = F
)
)
}
}
recurse(root$name,root)
return(classes)
}
# now, with an R flavor, our replica of classes() should work
flare_df <- classes(flare_json)
# so the example uses a data.frame with columns
# packageName, className, size
# and feeds that to bubble.nodes where bubble = d3.layout.pack
# fortunately Joe Cheng has already made a htmlwidget called bubbles
# https://github.com/jcheng5/bubbles
# that will produce a d3.layout.pack bubble chart
library(bubbles) # provides bubbles(); install from the GitHub repo above
library(scales)
bubbles(
flare_df$size
,flare_df$className
,color = col_factor(
RColorBrewer::brewer.pal(9,"Set1")
,factor(flare_df$packageName)
)(flare_df$packageName)
,height = 600
,width = 960
)
# it's not perfect with things such as text sizing
# but it's a start
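The flattening step above can also be tried without a network connection or extra packages. What follows is only a minimal sketch in base R: `mini_flare` is a hand-built list in the same shape as flare.json (names and sizes copied from the excerpt in the question), and `flatten_leaves()` is a hypothetical helper, not part of any package. Unlike `classes()` above, it returns the rows from each recursive call rather than appending to an outer variable with `<<-`.

```r
# A tiny hand-built hierarchy in the same shape as flare.json.
mini_flare <- list(
  name = "flare",
  children = list(
    list(
      name = "cluster",
      children = list(
        list(name = "AgglomerativeCluster", size = 3938),
        list(name = "MergeEdge", size = 743)
      )
    ),
    list(
      name = "optimization",
      children = list(
        list(name = "AspectRatioBanker", size = 7074)
      )
    )
  )
)

# Hypothetical helper: returns one row per leaf node, labelled with the
# name of the leaf's direct parent (the "package" in the D3 example).
flatten_leaves <- function(node, parent = NA_character_) {
  if (!is.null(node$children)) {
    # Internal node: recurse into each child, passing this node's name down.
    rows <- lapply(node$children, flatten_leaves, parent = node$name)
    return(do.call(rbind, rows))
  }
  # Leaf node: emit a single-row data frame.
  data.frame(
    packageName = parent,
    className = node$name,
    size = node$size,
    stringsAsFactors = FALSE
  )
}

mini_df <- flatten_leaves(mini_flare)
print(mini_df)
```

The resulting data frame has the same three columns (packageName, className, size) that the bubbles() call above consumes, so your own data only needs to be arranged in that shape.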
If you still think you want a nested d3 JSON hierarchy, here is some code.
# convert this to nested d3 json format
# this is example data provided in a comment to this post
df <- data.frame(
"overallgroup" = "Online"
,"primarygroup" = c(rep("Social Media",3),rep("Web",2))
,"datasource" = c("Facebook","Twitter","Youtube","Website","Secondary Website")
,"size" = c(10000,5000,200,10000,2500)
,stringsAsFactors = FALSE
)
# recommend using data.tree to ease our pain here
#devtools::install_github("gluc/data.tree")
library(data.tree)
# the much easier way
df$pathString <- apply(df[,1:3],MARGIN=1, function(x){paste0(x,collapse="/")})
root <- as.Node(df[,4:5])
# the harder manual way
root <- Node$new("root")
sapply(unique(df[,1]),root$AddChild)
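apply(
  # use only the group columns and size; pathString was added above
  df[, c("overallgroup", "primarygroup", "datasource", "size")]
  ,MARGIN = 1
  ,function(row){
    lapply(2:length(row),function(cellnum){
      cell <- row[[cellnum]] # [[ ]] drops the column name
      if( cellnum < length(row) ){ # assume last column is attribute
        # walk down from root to the parent of this cell
        parent <- Reduce(function(x,y){x$Climb(y)},as.character(row[1:(cellnum-1)]),root)
        if(is.null(parent$Climb(cell))){
          cellnode <- parent$AddChild( cell )
        }
      } else{
        # last column: attach the size to the leaf node
        cellnode <- Reduce(function(x,y){x$Climb(y)},as.character(row[1:(cellnum-1)]),root)
        cellnode$Set( size = as.numeric(cell) )
      }
    })
  }
)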
# now we should be able to supply root to networkD3
# that expects a typical d3 nested JSON
#devtools::install_github("christophergandrud/networkD3")
library(networkD3)
treeNetwork( root$ToList(unname=TRUE) )
# or to get it in JSON
jsonlite::toJSON( root$ToList(unname=TRUE), auto_unbox=TRUE)
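If installing data.tree is not an option, the same nesting can be sketched in base R. This is only an illustration under stated assumptions: `nest_rows()` is a hypothetical helper (not from any package), the grouping columns are assumed to be listed from the outermost level to the leaf level, and the JSON is printed only if jsonlite happens to be installed.

```r
df <- data.frame(
  overallgroup = "Online",
  primarygroup = c(rep("Social Media", 3), rep("Web", 2)),
  datasource = c("Facebook", "Twitter", "Youtube", "Website", "Secondary Website"),
  size = c(10000, 5000, 200, 10000, 2500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: groups rows on the first column named in `levels`,
# recurses on the remaining columns, and emits leaves as name/size pairs.
nest_rows <- function(rows, levels, size_col = "size") {
  if (length(levels) == 1) {
    # Innermost level: one leaf per row.
    return(lapply(seq_len(nrow(rows)), function(i) {
      list(name = rows[[levels]][i], size = rows[[size_col]][i])
    }))
  }
  # Keep groups in order of first appearance rather than alphabetical order.
  key <- factor(rows[[levels[1]]], levels = unique(rows[[levels[1]]]))
  groups <- split(rows, key)
  unname(lapply(names(groups), function(g) {
    list(name = g, children = nest_rows(groups[[g]], levels[-1], size_col))
  }))
}

nested <- list(
  name = "root",
  children = nest_rows(df, c("overallgroup", "primarygroup", "datasource"))
)

# Serialise to d3-style nested JSON only if jsonlite is available.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  print(jsonlite::toJSON(nested, auto_unbox = TRUE, pretty = TRUE))
}
```

The list in `nested` uses the same name/children/size keys as flare.json, so it should be usable wherever the data.tree output above is used, although that has not been checked against every widget.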
Upvotes: 5
Reputation: 1244
Thanks to @timelyportfolio for pointing me to this. You can achieve conversion from and to data.frame / json quite simply with the data.tree package (latest from github required). The trick is to paste together a path:
#devtools::install_github("gluc/data.tree")
library(data.tree)
df <- data.frame(
"overallgroup" = "Online"
,"primarygroup" = c(rep("Social Media",3),rep("Web",2))
,"datasource" = c("Facebook","Twitter","Youtube","Website","Secondary Website")
,"size" = c(10000,5000,200,10000,2500)
,stringsAsFactors = FALSE
)
df$pathString <- paste("root", df$overallgroup, df$primarygroup, df$datasource, sep="/")
root <- as.Node(df[,-c(1, 2, 3)])
# now we should be able to supply root to networkD3
# that expects a typical d3 nested JSON
#devtools::install_github("christophergandrud/networkD3")
library(networkD3)
treeNetwork( root$ToList(unname=TRUE) )
# or to get it in JSON
jsonlite::toJSON( root$ToList(unname=TRUE), auto_unbox=TRUE)
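The path trick also works in the other direction. As a rough base R sketch (the `paths` and `sizes` vectors are made-up illustration data, and every path is assumed to have the same depth), splitting the path strings on "/" recovers the grouping columns:

```r
# Illustration data: path strings in the same form as df$pathString above.
paths <- c(
  "root/Online/Social Media/Facebook",
  "root/Online/Social Media/Twitter",
  "root/Online/Web/Website"
)
sizes <- c(10000, 5000, 10000)

# Split each path into its components; fixed = TRUE treats "/" literally.
parts <- strsplit(paths, "/", fixed = TRUE)

# Assumes equal depth, so the pieces stack into a character matrix.
mat <- do.call(rbind, parts)

# Column 1 is the artificial "root"; the rest are the group levels.
back <- data.frame(
  overallgroup = mat[, 2],
  primarygroup = mat[, 3],
  datasource = mat[, 4],
  size = sizes,
  stringsAsFactors = FALSE
)
print(back)
```

One caveat with any path-based encoding: a node name that itself contains "/" would be split into two levels, so such names need a different separator.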
Upvotes: 1
Reputation: 78792
Posting this only for further discussion. As @timelyportfolio said, there's quite a bit to consider. Here's one path (only going from "flare" JSON to a long data frame for now until we get more of what you're looking for).
library(jsonlite)
library(dplyr)
library(tidyr)
flare <- fromJSON("http://bl.ocks.org/mbostock/raw/4063269/flare.json",
simplifyVector=FALSE)
flare_df <- bind_rows(lapply(flare$children,
function(x) {
kids <- as.list(x)
kids$stringsAsFactors=FALSE # prevents bind_rows warnings
do.call("data.frame", kids)
}
)) %>% gather(child_path, value, -name)
set.seed(1492) # results reproducibility
print(flare_df[sample(nrow(flare_df), 50),])
## Source: local data frame [50 x 3]
##
## name child_path value
## 1 display children.name.18 NA
## 2 util children.size.11 5559
## 3 display children.name.9 NA
## 4 display children.children.size.9 NA
## 5 physics children.children.name.4 NA
## 6 query children.children.name add
## 7 physics children.children.children.size.22 NA
## 8 data children.name.20 NA
## 9 vis children.children.size.20 19382
## 10 flex children.children.name.36 NA
## .. ... ... ...
# just showing the top-level nodes are present for an example
select(flare_df, name) %>% arrange(name) %>% distinct %>% print(n=1000)
## Source: local data frame [10 x 1]
##
## name
## 1 analytics
## 2 animate
## 3 data
## 4 display
## 5 flex
## 6 physics
## 7 query
## 8 scale
## 9 util
## 10 vis
Unwrapping that from a data frame back to "flare" JSON is pretty straightforward, but that may not be a usable data frame format for your manipulation.
Upvotes: 1