Reputation: 527
I would like to make a bar chart diagram like this one, using any Python module that I can interface with matplotlib:
Below is some example data and an explanation of what I can do so far:
import pandas
from io import StringIO
text="""
Name  1980            1982
A     Administration  Budget
B     Administration  Administration
C     Administration  Administration
D     Administration  Budget
E     Administration  Budget
F     Administration  Administration
G     Administration  Administration
H     Administration  Administration
"""
data=pandas.read_fwf(StringIO(text),header=1).set_index("Name")
count=pandas.DataFrame(index=["Administration","Budget"])
for col in data.columns:
    count[col]=data[col].value_counts()
count.T.plot(kind="bar",stacked=True)
When I plot count, I get the following stacked bar chart:
I can also get the number of people who moved between 1980 and 1982 from the Administration department to the Budget department by doing
pandas.crosstab(data["1980"],data["1982"])
which gives:
1982            Administration  Budget
1980
Administration               5       3
However, I don't know how to draw the flows between the parts of the bar chart. Does anyone know how?
Upvotes: 4
Views: 2959
Reputation: 43
You can use the pandas functions crosstab and melt to prepare your data for a Sankey diagram:
from io import StringIO
import pandas as pd
text = """
Name  1980            1982
A     Administration  Budget
B     Administration  Administration
C     Administration  Administration
D     Administration  Budget
E     Administration  Budget
F     Administration  Administration
G     Administration  Administration
H     Administration  Administration
"""
data = pd.read_fwf(StringIO(text),header=1)
# Make crosstab
data_cross = pd.crosstab(data['1980'], data['1982'])
print(data_cross)
# Make flat table
data_tidy = data_cross.rename_axis(None, axis=1).reset_index().copy()
# Make tidy table
formatted_data = pd.melt(data_tidy,
                         ['1980'],
                         var_name='1982',
                         value_name='Value')
import plotly.graph_objects as go
fig = go.Figure(data=[go.Sankey(
    node = dict(
        pad = 15,
        thickness = 20,
        line = dict(color = "black", width = 0.5),
        label = ["Administration", "Administration", "Budget"],
        color = ['blue', 'blue', 'green']
    ),
    link = dict(
        source = [0, 0], # indices correspond to labels...
        target = [1, 2],
        value = [5, 3],
        color = ['lightblue', 'lightgreen']
    ))])
fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
fig.show()
Produces the following output:
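
The node labels and link values above are typed in by hand, although formatted_data already holds the same numbers. As a rough sketch (not part of the original answer), the lists could be derived from the tidy table instead. The helper names sankey_inputs and make_figure are hypothetical, and the small DataFrame below only stands in for formatted_data, with its values copied from the crosstab:

```python
import pandas as pd

# Stand-in for formatted_data: one row per (1980, 1982) pair with its count.
# Values are copied from the crosstab output shown in the question.
formatted_data = pd.DataFrame({
    "1980": ["Administration", "Administration"],
    "1982": ["Administration", "Budget"],
    "Value": [5, 3],
})


def sankey_inputs(tidy, src_col="1980", dst_col="1982", value_col="Value"):
    """Hypothetical helper: build the label/source/target/value lists."""
    src_names = tidy[src_col].drop_duplicates().tolist()
    dst_names = tidy[dst_col].drop_duplicates().tolist()
    # Source-year nodes come first and target-year nodes after them, so a
    # department present in both years gets two separate nodes.
    labels = src_names + dst_names
    source = [src_names.index(s) for s in tidy[src_col]]
    target = [len(src_names) + dst_names.index(t) for t in tidy[dst_col]]
    value = [int(v) for v in tidy[value_col]]
    return labels, source, target, value


def make_figure(labels, source, target, value):
    """Hypothetical helper: wrap the lists in a plotly Sankey figure."""
    import plotly.graph_objects as go  # imported here so the rest runs without plotly
    return go.Figure(data=[go.Sankey(
        node=dict(pad=15, thickness=20, label=labels),
        link=dict(source=source, target=target, value=value),
    )])


labels, source, target, value = sankey_inputs(formatted_data)
```

Calling make_figure(labels, source, target, value).show() should then draw the same two flows as the hard-coded version, without the colours.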
Upvotes: 2