MelaniaCB

Reputation: 435

Select top 3 parents (groups) in plotly graph using R

I want to plot only 3 of my parents, the ones with the highest total cost, using the code below.

library(plotly)

parent <- c("Sam","Elena","Sam","Jhon","Raul","Sam","Jhon","Sara","Paul","Chris")
cost <- c(15000,10000,12000,15000,10000,12000,15000,14000,19000,2000)
topic <- c("Banana","Banana","Berries","Apple","Watermelon","Banana","Berries","Avocado","Watermelon","Pinneaple")

# data.frame() keeps each column's type, so nothing has to be converted back from character
sample <- data.frame(parent, cost, topic, stringsAsFactors = FALSE)

# Color setting
ramp2 <- colorRamp(c("deepskyblue4", "white"))
ramp.list2 <- rgb( ramp2(seq(0, 1, length = 15)), max = 255)

plot_ly(sample, x = ~parent, y = ~cost, type = 'bar', color = ~topic) %>%
  layout(yaxis = list(title = 'Cost'), xaxis = list(title = 'Parent'), barmode = 'stack', colorway = ramp.list2) %>%
  config(displayModeBar = FALSE)

I tried using transforms inside the plot_ly() call, like this:

transforms = list(
  list(
    type = 'aggregate',
    groups = sample$parent,
    aggregations = list(
      list(
        target = 'x',
        func = 'max',
        enabled = TRUE
      )
    )
  )
)

But it still gives me the same output, and I want to show only 3 parents. I also tried a filter transform, like this:

transforms = list(
  list(
    type = 'filter',
    target = 'y',
    operation = '>',
    value = cost[-3:-1]
  )
)

But this filters on the individual cost values instead of the total cost each parent spent, and it gives me only 2 parents instead of 3. Finally, the plot is not using ramp.list2 for its colors.

Upvotes: 0

Views: 433

Answers (1)

Taher A. Ghaleb

Reputation: 5240

As I understand it, you want the 3 parents with the highest total cost. You can compute them separately with dplyr, as follows:

library(dplyr)

top_3 <- sample %>% 
         group_by(parent) %>%            # one group per parent
         summarise(cost = sum(cost)) %>% # total cost per parent
         arrange(desc(cost)) %>%         # highest total first
         head(3)

This will give you the following:

# A tibble: 3 x 2
#   parent  cost
#   <chr>  <dbl>
# 1 Sam    39000
# 2 Jhon   30000
# 3 Paul   19000
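
If you would rather not load dplyr, the same totals can be sketched in base R. This is only an illustration that rebuilds the question's data; the names totals and top_3_base are mine, not from the question.

```r
# Sample data from the question, rebuilt so this snippet runs on its own
sample <- data.frame(
  parent = c("Sam","Elena","Sam","Jhon","Raul","Sam","Jhon","Sara","Paul","Chris"),
  cost   = c(15000,10000,12000,15000,10000,12000,15000,14000,19000,2000),
  topic  = c("Banana","Banana","Berries","Apple","Watermelon","Banana","Berries","Avocado","Watermelon","Pinneaple"),
  stringsAsFactors = FALSE
)

# Sum the cost per parent
totals <- aggregate(cost ~ parent, data = sample, FUN = sum)

# Sort by total cost, highest first, and keep the first 3 rows
top_3_base <- head(totals[order(-totals$cost), ], 3)

# Sam 39000, Jhon 30000, Paul 19000
print(top_3_base)
```

The parent column of top_3_base can be used for filtering in the same way as top_3$parent.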

Then, in your plot_ly call, keep only the rows whose parent is one of these top_3 parents, as follows:

plot_ly(sample[sample$parent %in% top_3$parent,], x = ~parent, y = ~cost, type = 'bar', color = ~topic) %>%
   layout(yaxis = list(title = 'Cost'), xaxis = list(title = 'Parent'), barmode = 'stack', colorway = ramp.list2) %>%
   config(displayModeBar = FALSE)

which will produce the following plot:

(image: stacked bar chart of cost by topic for the top 3 parents)
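
On the colors part of the question: my reading, which I have not confirmed against the plotly source, is that when you map color = ~topic, plot_ly assigns an explicit color to each trace from its default palette, and those explicit colors take precedence over the layout colorway. plot_ly does have a colors argument that accepts a vector of hex colors, so a minimal sketch would pass ramp.list2 there instead. The names totals, top_parents and plot_data are mine.

```r
library(plotly)

# Sample data from the question, rebuilt so this snippet runs on its own
sample <- data.frame(
  parent = c("Sam","Elena","Sam","Jhon","Raul","Sam","Jhon","Sara","Paul","Chris"),
  cost   = c(15000,10000,12000,15000,10000,12000,15000,14000,19000,2000),
  topic  = c("Banana","Banana","Berries","Apple","Watermelon","Banana","Berries","Avocado","Watermelon","Pinneaple"),
  stringsAsFactors = FALSE
)

# Top 3 parents by total cost (base R here, so only plotly has to be loaded)
totals <- aggregate(cost ~ parent, data = sample, FUN = sum)
top_parents <- head(totals$parent[order(-totals$cost)], 3)
plot_data <- sample[sample$parent %in% top_parents, ]

# Palette from the question
ramp2 <- colorRamp(c("deepskyblue4", "white"))
ramp.list2 <- rgb(ramp2(seq(0, 1, length.out = 15)), maxColorValue = 255)

# Assumption: the palette is passed through `colors` rather than layout(colorway = ...)
p <- plot_ly(plot_data, x = ~parent, y = ~cost, type = 'bar',
             color = ~topic, colors = ramp.list2) %>%
  layout(yaxis = list(title = 'Cost'), xaxis = list(title = 'Parent'), barmode = 'stack') %>%
  config(displayModeBar = FALSE)

p
```

Because the ramp ends in white, the last topic may be hard to see on a white background; stopping the sequence before 1, for example seq(0, 0.8, length.out = 15), would avoid that.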

Hope it helps.

Upvotes: 1
