ASH

Reputation: 20302

How can we format numbers in a Sankey chart and set labels outside of the chart?

I've got some simple code that produces a nice Sankey chart.

import holoviews as hv
import plotly.graph_objects as go
import plotly.express as pex
hv.extension('bokeh')


sankey1 = hv.Sankey(df_final, kdims=['Sub_Market', 'Sport League'], vdims=["Revenue"])
hv.Sankey(sankey1)

sankey1.opts(cmap='Colorblind', label_position='right',
             edge_color='Sub_Market', edge_line_width=0,
             node_alpha=1.0, node_width=40, node_sort=True,
             width=800, height=600, bgcolor="snow",
             title="Flow of Revenue between Sub Market and Conference")

Unfortunately, the numbers are displayed in exponential notation. I would really like them displayed in millions. Also, is there a way to show the right-hand labels on the right and the left-hand labels on the left, so that they all sit outside the chart and are easier to read?

Upvotes: 0

Views: 1493

Answers (2)

Oluwafemi Sule

Reputation: 38952

First, HoloViews allows custom formatters to be configured for dimensions.

To render the numbers as-is, you can use the built-in str function as the formatter for the dimension.

I have used a sample dataframe to show how this can be achieved. You can run it in this runnable Colab notebook.

import holoviews as hv
from holoviews.core import Store
import pandas as pd

hv.ipython.notebook_extension('bokeh')

Store.set_current_backend('bokeh')
renderer = Store.renderers['bokeh']

df_final = pd.DataFrame({
    'Sub_Market': ['Central texas', 'Southern California', 'Florida'],
    'Sport League': ['MLS', 'NBA', 'MLS'],
    'Revenue': [1.4981211 * 10**5, 2.921212* 10**6, 1.2121112*10**6]
})

graph = hv.Sankey(
    df_final, 
    kdims=['Sub_Market', 'Sport League'],
    vdims=[hv.Dimension("Revenue", value_format=str)],
)
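
Since the question asks for millions rather than the raw numbers, a small function could be passed as value_format in place of str. The following is only a minimal sketch: the name fmt_millions, the two decimal places and the "M" suffix are illustrative choices, not something HoloViews provides.

```python
def fmt_millions(value):
    """Format a number as millions with two decimals, e.g. 2921212 -> '2.92M'."""
    return f"{value / 1e6:.2f}M"


# Sample revenues taken from the dataframe above.
for revenue in (1.4981211 * 10**5, 2.921212 * 10**6, 1.2121112 * 10**6):
    print(fmt_millions(revenue))

# Assumed usage with the graph above (not executed here):
#   vdims=[hv.Dimension("Revenue", value_format=fmt_millions)]
```

Any callable that takes the value and returns a string should work the same way, so the rounding or suffix can be adjusted to taste.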

Now to customise the position of the labels, you need the rendered plot.

Here we are using bokeh as a backend and can get the plot by forwarding the graph object as an argument to the get_plot method of the bokeh renderer.

renderer = Store.renderers['bokeh']
plot = renderer.get_plot(graph)

Now we can access the plot handles that we wish to customise. The default x_offset applied to all labels is 0, so we only need to apply an offset to the left labels.

To do so, we add an 'x_offset' field to the labels' data source and set the offset for the labels that should sit to the left of the quads.

We also need to move the start of the plot's x_range so that the shifted labels are not cut off.

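offset = -200  # horizontal shift applied to the left-hand labels
text_source = plot.handles['text_1_source']
num_nodes = len(text_source.data['x'])
# Start with no offset for any label.
text_source.data['x_offset'] = [0] * num_nodes
# The sample data has three left-column nodes; this slice assumes they come first.
num_left_nodes = 3
left_nodes_selection = slice(0, num_left_nodes)
text_source.data['x_offset'][left_nodes_selection] = [offset] * num_left_nodes
# Make the text glyph read its offset from the new field.
plot.handles['text_1_glyph'].x_offset = {'field': 'x_offset'}
# Extend the x-range to the left so the shifted labels stay visible.
plot.handles['plot'].x_range.start += 2 * offset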
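
The count of left-hand nodes is hard-coded to 3 above, which only fits the sample data. One possible way to avoid that is to derive the offsets from the label x-positions. This is a sketch under an assumption that has not been checked against HoloViews itself: that all labels of the leftmost column share the smallest x value in the data source. The helper name left_label_offsets and the sample positions are hypothetical.

```python
def left_label_offsets(xs, offset, tol=1e-9):
    """Return one x_offset per label: `offset` for the leftmost column, 0 elsewhere."""
    xs = list(xs)  # also accepts array-like input
    if not xs:
        return []
    left_x = min(xs)
    return [offset if abs(x - left_x) <= tol else 0 for x in xs]


# Hypothetical label x-positions: three left-column labels, two right-column labels.
sample_xs = [15.0, 15.0, 15.0, 1000.0, 1000.0]
print(left_label_offsets(sample_xs, -200))
```

With the plot from above, the returned list could be assigned to plot.handles['text_1_source'].data['x_offset'] in place of the slice assignment, assuming the label positions behave as described.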

Finally, we can render the plot to an SVG component and display it in the notebook.

hv.ipython.notebook_extension('bokeh')
data, metadata = hv.ipython.display_hooks.render(plot, fmt='svg')
hv.ipython.display(hv.ipython.HTML(data["text/html"]))

sankey plot with customised label positions

Upvotes: 1

mosc9575

Reputation: 6337

The solution below works for HoloViews and is (probably) not valid for Plotly.

In HoloViews you can pass hv.Dimension(spec, **params), which lets you apply a formatter to a column through the value_format keyword. This formatter can be predefined or custom. The example below defines a simple formatter as a custom Python function.

Example Code

import holoviews as hv
import pandas as pd

data = {'A':['XX','XY','YY','XY','XX','XX'],
        'B':['RR','KK','KK','RR','RK','KK'],
        'values':[1e6,5e5,8e4,15e3,19e2,1],
       }

df = pd.DataFrame(data)

def fmt(tick):
    # Scale the value and pick a unit suffix: none, 'k' (thousands) or 'm' (millions).
    if tick < 1e3:
        unit = ''
        num = round(tick, 2)
    elif tick < 1e6:
        unit = 'k'
        num = round(tick / 1e3, 2)
    else:
        unit = 'm'
        num = round(tick / 1e6, 2)
    return f'{num} {unit}'


hv.Sankey(df, vdims=hv.Dimension('values', value_format=fmt))

Output

Sankey with formatted values

Upvotes: 3
