Reputation: 136
I have a pandas DataFrame like this:
LEVEL_1 LEVEL_2 Freq Percentage
0 HIGH HIGH 8842 17.684
1 AVERAGE LOW 2802 5.604
2 LOW LOW 22198 44.396
3 AVERAGE AVERAGE 6804 13.608
4 LOW AVERAGE 2030 4.060
5 HIGH AVERAGE 3666 7.332
6 AVERAGE HIGH 2887 5.774
7 LOW HIGH 771 1.542
I can draw the tiles for LEVEL_1 and LEVEL_2:
from statsmodels.graphics.mosaicplot import mosaic
mosaic(df, ['LEVEL_1','LEVEL_2'])
I just want to put Freq and Percentage at the center of each tile of the mosaic plot.
How can I do this?
Upvotes: 1
Views: 3403
Reputation: 454
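Here's a start. Note that I had to add a row of zeros for the missing HIGH/LOW combination: the labelizer is called for every combination of LEVEL_1 and LEVEL_2, and the lookup fails for a pair that is not in the DataFrame. You can make the labels nicer with string formatting in the lambda function. You'll also want to reorder the headers.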
import io
import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic
# Rebuild the DataFrame from the question.
data = io.StringIO("""LEVEL_1 LEVEL_2 Freq Percentage
HIGH HIGH 8842 17.684
AVERAGE LOW 2802 5.604
LOW LOW 22198 44.396
AVERAGE AVERAGE 6804 13.608
LOW AVERAGE 2030 4.060
HIGH AVERAGE 3666 7.332
AVERAGE HIGH 2887 5.774
LOW HIGH 771 1.542""")
# sep=r"\s+" replaces the deprecated delim_whitespace=True.
df = pd.read_csv(data, sep=r"\s+")
# DataFrame.append was removed in pandas 2.0, so use pd.concat to add
# the missing HIGH/LOW pair with zero counts.
missing = pd.DataFrame([{'LEVEL_1': 'HIGH', 'LEVEL_2': 'LOW', 'Freq': 0, 'Percentage': 0.0}])
df = pd.concat([df, missing], ignore_index=True)
df = df.sort_values(['LEVEL_1', 'LEVEL_2'])
df = df.set_index(['LEVEL_1', 'LEVEL_2'])
print(df)
# The labelizer receives each tile's (LEVEL_1, LEVEL_2) key.
mosaic(df['Freq'], labelizer=lambda k: df.loc[k].values)
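The string formatting mentioned above could look something like the sketch below. The helper names `make_labelizer` and `plot_mosaic` are made up for illustration, and the label layout (count on one line, percentage with two decimals on the next) is just one possible choice. Returning an empty string for pairs that are not in the data is an alternative to adding the row of zeros; it assumes mosaic fills in the missing combination with a zero count on its own.

```python
def make_labelizer(stats):
    """Build a labelizer from a dict mapping (LEVEL_1, LEVEL_2) -> (freq, percentage)."""
    def labelizer(key):
        if key not in stats:
            # Combination absent from the data (e.g. HIGH/LOW): no label.
            return ""
        freq, pct = stats[key]
        return f"{freq}\n{pct:.2f}%"
    return labelizer


def plot_mosaic(df):
    """Draw the mosaic for the question's DataFrame (columns LEVEL_1, LEVEL_2, Freq, Percentage)."""
    # Imported here so the labeling helper above has no dependencies.
    import matplotlib.pyplot as plt
    from statsmodels.graphics.mosaicplot import mosaic

    indexed = df.set_index(['LEVEL_1', 'LEVEL_2']).sort_index()
    # iterrows() on a MultiIndex yields the (LEVEL_1, LEVEL_2) tuple as the key.
    stats = {key: (int(row['Freq']), float(row['Percentage']))
             for key, row in indexed.iterrows()}
    fig, _ = mosaic(indexed['Freq'], labelizer=make_labelizer(stats))
    plt.show()
    return fig
```

Calling `plot_mosaic(df)` on the original, un-indexed DataFrame should draw the plot with both numbers centered in each tile, since mosaic places the label text at the middle of the rectangle.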
Upvotes: 4