Reputation: 97
I'm trying to take cell spatial data and make a sunburst chart out of the slide data. Here's the basic format of the dataframe I'm using for it.
structure(list(slide = c("LU095", "LU095", "LU095", "LU095",
"LU095", "LU095", "LU095", "LU095", "LU095", "LU095", "LU095",
"LU095", "LU095", "LU095", "LU095", "LU095", "LU095", "LU095",
"LU095", "LU095", "LU095", "LU095", "LU095", "LU095", "LU095",
"LU095", "LU095", "LU095", "LU095", "LU095", "LU095", "LU095",
"LU095", "LU095", "LU095", "LU095", "LU095", "LU095", "LU095",
"LU095", "LU095", "LU095", "LU095", "LU095", "LU095", "LU095"
), stroma_bins = structure(c(1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L,
5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L), levels = c("0-10% Stroma",
"10-20% Stroma", "20-30% Stroma", "30-40% Stroma", "40-50% Stroma",
"50-60% Stroma", "60-70% Stroma", "70-80% Stroma", "80-90% Stroma",
"90-100% Stroma"), class = "factor"), cd8_percent_bins = structure(c(1L,
1L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), levels = c("0-2% CD8+ Cells",
"2-4% CD8+ Cells", "4-6% CD8+ Cells", "6-8% CD8+ Cells", "8-10% CD8+ Cells",
"10-15% CD8+ Cells", "15-20% CD8+ Cells", ">20% CD8+ Cells"), class = "factor"),
Freq = c(8L, 5L, 1L, 7L, 1L, 7L, 2L, 15L, 4L, 4L, 2L, 15L,
4L, 3L, 2L, 12L, 15L, 1L, 4L, 2L, 1L, 1L, 16L, 12L, 8L, 8L,
4L, 1L, 3L, 1L, 14L, 4L, 17L, 6L, 9L, 11L, 5L, 2L, 51L, 18L,
24L, 24L, 17L, 32L, 21L, 11L)), row.names = c(NA, -46L), class = c("data.table",
"data.frame"))
I'm using Plotly in R, but for some reason it's only displaying the outermost layer of the Sunburst chart for one region.
Here's the code I have for it so far.
fig <- plot_ly(
  labels = df2$labels,
  parents = df2$parents,
  values = df2$values,
  type = 'sunburst',
  branchvalues = 'total')
fig
Upvotes: 1
Views: 632
Reputation: 18734
What you did to aggregate your data into the data set used in your plot isn't in your question. However, I see that you have three levels and you didn't use the ids argument. Your children's labels aren't unique either (the same CD8 bin appears under several stroma bins), so Plotly can't tell them apart.
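To illustrate why ids matter, here is a minimal sketch on toy data (the slide name "S1" and all values are made up for illustration; this is not the asker's frame). The same CD8 label sits under two different stroma bins, so labels alone cannot identify a sector, while ids built from the label plus its parent can.

```r
# Toy hierarchy (made-up values): one root, two stroma bins, and the same
# CD8 label under each bin.
toy <- data.frame(
  labels  = c("S1", "0-10% Stroma", "10-20% Stroma", "0-2% CD8+", "0-2% CD8+"),
  parents = c("", "S1", "S1", "0-10% Stroma - S1", "10-20% Stroma - S1"),
  values  = c(10, 4, 6, 4, 6),
  stringsAsFactors = FALSE
)

# The labels repeat, so they cannot serve as unique identifiers.
labels_repeat <- anyDuplicated(toy$labels) > 0

# Build ids as "label - parent"; the root keeps its label as its id.
toy$ids <- ifelse(toy$parents == "",
                  toy$labels,
                  paste0(toy$labels, " - ", toy$parents))

# Every non-root parent must refer to an existing id.
parents_ok <- all(toy$parents[toy$parents != ""] %in% toy$ids)

# Build the chart only if plotly is installed; print `fig` to render it.
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(toy, ids = ~ids, labels = ~labels,
                         parents = ~parents, values = ~values,
                         type = "sunburst", branchvalues = "total")
}
```

The id scheme here is just one possible choice; any scheme works as long as every id is unique and each parents entry matches an id exactly.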
I'll start with the data from the dput output (called df2 here).
For the root or top-level
All of the data in slide is the same, but I wrote it this way to make it more dynamic. This returns one row because there is one unique value in the highest level.
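library(dplyr)   # %>%, group_by(), summarise(), mutate(), rename(), select()
library(plotly)  # plot_ly()

d1 <- df2 %>% group_by(slide) %>%
  summarise(values = sum(Freq)) %>%
  mutate(ids = slide, parents = "") %>%   # the root has no parent
  rename(labels = slide) %>%
  select(ids, parents, labels, values)    # same column order in every frame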
# # A tibble: 1 × 4
#   ids   parents labels values
#   <chr> <chr>   <chr>   <int>
# 1 LU095 ""      LU095     435
The next level, mid-level or first-child level
I'll take exactly the same approach, but instead of leading with slide, I'll lead with stroma_bins. Additionally, the ids will contain both the current level and its parent.
d2 <- df2 %>% group_by(stroma_bins) %>%
  summarise(values = sum(Freq)) %>%
  mutate(ids = paste0(stroma_bins, " - ", unique(df2$slide)),
         parents = unique(df2$slide)) %>%
  rename(labels = stroma_bins) %>%
  select(ids, parents, labels, values)
# # A tibble: 10 × 4
# ids parents labels values
# <chr> <chr> <fct> <int>
# 1 0-10% Stroma - LU095 LU095 0-10% Stroma 8
# 2 10-20% Stroma - LU095 LU095 10-20% Stroma 6
# 3 20-30% Stroma - LU095 LU095 20-30% Stroma 8
# 4 30-40% Stroma - LU095 LU095 30-40% Stroma 9
# 5 40-50% Stroma - LU095 LU095 40-50% Stroma 25
# 6 50-60% Stroma - LU095 LU095 50-60% Stroma 24
# 7 60-70% Stroma - LU095 LU095 60-70% Stroma 36
# 8 70-80% Stroma - LU095 LU095 70-80% Stroma 53
# 9 80-90% Stroma - LU095 LU095 80-90% Stroma 68
# 10 90-100% Stroma - LU095 LU095 90-100% Stroma 198
The outermost level has two ancestors (the slide and the stroma bin), so both are included. It follows the same premise as the last two, but the parents column has to match the ids of the mid-level exactly, so it combines stroma_bins and slide. (I only included a sample of what this frame looks like.)
d3 <- df2 %>%
  rename(labels = cd8_percent_bins,
         values = Freq) %>%
  mutate(ids = paste0(labels, " - ", stroma_bins),
         parents = paste0(stroma_bins, " - ", unique(df2$slide))) %>%
  select(ids, parents, labels, values)
# ids parents labels values
# 1: 0-2% CD8+ Cells - 0-10% Stroma 0-10% Stroma - LU095 0-2% CD8+ Cells 8
# 2: 0-2% CD8+ Cells - 10-20% Stroma 10-20% Stroma - LU095 0-2% CD8+ Cells 5
# 3: 4-6% CD8+ Cells - 10-20% Stroma 10-20% Stroma - LU095 4-6% CD8+ Cells 1
# 4: 0-2% CD8+ Cells - 20-30% Stroma 20-30% Stroma - LU095 0-2% CD8+ Cells 7
# 5: 2-4% CD8+ Cells - 20-30% Stroma 20-30% Stroma - LU095 2-4% CD8+ Cells 1
# 6: 0-2% CD8+ Cells - 30-40% Stroma 30-40% Stroma - LU095 0-2% CD8+ Cells 7
# 7: 2-4% CD8+ Cells - 30-40% Stroma 30-40% Stroma - LU095 2-4% CD8+ Cells 2
# 8: 0-2% CD8+ Cells - 40-50% Stroma 40-50% Stroma - LU095 0-2% CD8+ Cells 15
# 9: 2-4% CD8+ Cells - 40-50% Stroma 40-50% Stroma - LU095 2-4% CD8+ Cells 4
# 10: 4-6% CD8+ Cells - 40-50% Stroma 40-50% Stroma - LU095 4-6% CD8+ Cells 4
Next, combine these three data frames into one data frame.
dd <- do.call(rbind, list(d1, d2, d3))
Now the data is ready.
plot_ly(dd, parents = ~parents, labels = ~labels, values = ~values,
        ids = ~ids, branchvalues = "total", type = "sunburst")
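If the chart still comes out incomplete, a quick consistency check on the combined frame can help narrow things down. With branchvalues = "total", Plotly reads each parent's value as the total for its whole branch, so the children under a parent should not add up to more than that parent. The sketch below assumes the four-column layout used above (ids, parents, labels, values); check_sunburst is a hypothetical helper written for this illustration, not part of Plotly, and the toy frame uses made-up values.

```r
# Hypothetical helper (not part of Plotly): sanity-check a sunburst frame
# that has the columns ids, parents and values.
check_sunburst <- function(dd) {
  ids     <- as.character(dd$ids)
  parents <- as.character(dd$parents)
  # Sum the children's values under each parent id; "" marks the root.
  child_sum  <- tapply(dd$values, parents, sum)
  child_sum  <- child_sum[names(child_sum) != ""]
  # Look up the value stored on each parent's own row.
  parent_val <- dd$values[match(names(child_sum), ids)]
  list(
    unique_ids    = anyDuplicated(ids) == 0,
    parents_exist = all(names(child_sum) %in% ids),
    totals_match  = isTRUE(all(child_sum == parent_val))
  )
}

# Toy frame in the same layout (made-up values).
toy <- data.frame(
  ids     = c("S1", "A - S1", "B - S1", "x - A", "x - B"),
  parents = c("", "S1", "S1", "A - S1", "B - S1"),
  labels  = c("S1", "A", "B", "x", "x"),
  values  = c(10, 4, 6, 4, 6)
)
str(check_sunburst(toy))
```

Calling check_sunburst(dd) on the combined frame should report TRUE for all three entries if the levels were built as described. The exact equality test suits integer counts like Freq; for non-integer values a tolerance-based comparison would be safer.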
Upvotes: 2