prodoggy4life

Reputation: 327

How do I draw this box plot in pandas?

I currently have a dataframe like this:

| patient type                |   asir |    aspr |
|:----------------------------|-------:|--------:|
| definitive dyalisis patient | 2975.6 | 15808.1 |
| kidney transplant patient   |  362   |  4469.3 |

Here, patient type is the index.

I'm trying to create a figure with 4 box plots in total: 2 for the asir and aspr values of the definitive dialysis patients, and 2 more for the kidney transplant patients.

So far, I've tried the following code:

%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# import the csv file
dataname = 'Datasets\\age-standardised-incidence-rate-and-prevalence-rate-for-definitive-dialysis-and-transplant.csv'
data = pd.read_csv(dataname)
df = pd.DataFrame(data)

# drop the year column, as we want all the definitive dialysis patient data and the kidney transplant patient data together
df2 = df.drop(['year'],axis=1)

# sum up the asir and aspr values for each patient_type
df3 = df2.groupby(df2['patient_type']).sum()
df4 = df3.rename(columns = {'patient_type':'patient_type','asir':'asir','aspr':'aspr'})

# pivot the table so that we can use the patient_type to plot out the box plot
# df4 = df4.pivot_table(columns=['patient_type'])

# plot out the box plot 
bplot = df4.boxplot(by='patient_type',column=['asir','aspr'],grid=True,figsize=(40,20),patch_artist=True,fontsize=20)
plt.title('Boxplot Of Age Standardised incidence rate and prevalence rate for definitive dialysis and transplant patients.',fontsize=20)
# plt.legend(df4['patient_type'],fontsize=20)
plt.show()
df4 

But the resulting plot did not look the way I wanted.

Upvotes: 1

Views: 151

Answers (1)

rednafi

Reputation: 1731

I'm assuming that you want 4 box plots, one for each of: asir, aspr, definitive dyalisis patient, and kidney transplant patient.

Let's say your initial dataframe looks like this:

import pandas as pd

df = pd.DataFrame({'asir': [2975.6, 362.0], 'aspr': [15808.1, 4469.3],
    'patient type': ['definitive dyalisis patient',
            'kidney transplant patient']})

| patient type                |   asir |    aspr |
|:----------------------------|-------:|--------:|
| definitive dyalisis patient | 2975.6 | 15808.1 |
| kidney transplant patient   |  362   |  4469.3 |

Draw the first plot, which shows the asir and aspr values:

df = df.set_index('patient type')
df.boxplot(grid=True,figsize=(40,20),patch_artist=True,fontsize=20)

This draws one box for asir and one for aspr.
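Keep in mind that each box in the plot above is built from only two values, one per patient type. To see the numbers a box summarises, you can compute the quartiles yourself. This is a minimal sketch that rebuilds the same dataframe and uses `DataFrame.quantile`; the variable name `quartiles` is just for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"asir": [2975.6, 362.0], "aspr": [15808.1, 4469.3]},
    index=["definitive dyalisis patient", "kidney transplant patient"],
)

# Lower quartile, median and upper quartile for each column.
# These are the values the box edges and the median line are based on.
quartiles = df.quantile([0.25, 0.5, 0.75])
print(quartiles)
```

With two points per column, the median is simply the midpoint between them, so the boxes say little about the spread of the underlying data.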

Now transpose your data frame so that definitive dyalisis patient and kidney transplant patient become the columns to plot.

t_df = df.T

The data frame looks like this now:

|      |   definitive dyalisis patient |   kidney transplant patient |
|:-----|------------------------------:|----------------------------:|
| asir |                        2975.6 |                       362   |
| aspr |                       15808.1 |                      4469.3 |

Now plot this just like before:

t_df.boxplot(grid=True,figsize=(40,20),patch_artist=True,fontsize=20)

This draws one box for each patient type.
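If you want all four boxes in a single figure, the two plots can be placed side by side. Here is a minimal sketch, assuming matplotlib's `plt.subplots` together with the `ax` parameter of `DataFrame.boxplot`. The panel titles, the figure size and the names `ax_rates` and `ax_patients` are my own choices, not something from the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    {"asir": [2975.6, 362.0], "aspr": [15808.1, 4469.3]},
    index=pd.Index(
        ["definitive dyalisis patient", "kidney transplant patient"],
        name="patient type",
    ),
)

# One figure with two panels next to each other.
fig, (ax_rates, ax_patients) = plt.subplots(1, 2, figsize=(16, 6))

# Left panel: one box per column (asir, aspr).
df.boxplot(ax=ax_rates, grid=True, patch_artist=True)
ax_rates.set_title("By rate")

# Right panel: transpose so that each patient type becomes a column.
df.T.boxplot(ax=ax_patients, grid=True, patch_artist=True)
ax_patients.set_title("By patient type")

fig.tight_layout()
```

In a notebook with `%matplotlib inline` the figure is displayed automatically; in a script you would drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end.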

Upvotes: 1
