Reputation: 1733
This is what my data sheet looks like:
I need to filter by domain and then, for each domain, compute the breakdown of the 'Course1 completion' status: 1. percentage completed, 2. percentage not completed, 3. percentage not required, and so on,
and plot the results as nested bar charts using plotly.
I tried to do this with a for loop, but I cannot get it to work and I am not sure where I am going wrong.
Only the details for 'Domain A' are printed, not the rest.
My code:
import pandas as pd
import plotly.graph_objects as go
from plotly.graph_objs import Pie, Layout, Figure
import plotly.offline as py

df = pd.read_excel('data/master.xlsx', sheet_name=1)
ds = df['Course1 completion']
df['Domain'].fillna('No Specific Domain', inplace=True)
df['Course1 completion'].fillna('Not Applicable', inplace=True)
portfolios = df['Domain'].unique().tolist()
print(portfolios)

for portfolio in portfolios:
    df = df[df['Domain'] == portfolio]
    ds = df['Course1 completion']
    count = ds.value_counts()
    status = ds.value_counts(normalize=True).mul(100).round(2)
    print('Status of portfolio : ' + portfolio)
    print(status)
    print()
    print('*********************************************')
My output:
Any help regarding this is much appreciated. Thanks !
Upvotes: 0
Views: 310
Reputation: 332
The statement df=df[df['Domain']==portfolio]
overwrites df with the filtered rows. After the first iteration df contains only the first Domain, so every later iteration filters for a Domain that is no longer present and gets an empty result.
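To make the problem visible, here is a minimal sketch that reproduces it. The two-domain dataframe is made-up sample data standing in for your Excel sheet, and rows_per_iteration is a hypothetical helper dictionary used only to record how many rows survive each filter.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet.
df = pd.DataFrame({
    "Domain": ["Domain A", "Domain A", "Domain B", "Domain B"],
    "Course1 completion": ["Completed", "Not completed", "Completed", "Not reqd"],
})

portfolios = df["Domain"].unique().tolist()

rows_per_iteration = {}
for portfolio in portfolios:
    # Same pattern as in the question: df itself is replaced by the filtered rows.
    df = df[df["Domain"] == portfolio]
    rows_per_iteration[portfolio] = len(df)

print(rows_per_iteration)  # {'Domain A': 2, 'Domain B': 0}
```

The second filter runs against a dataframe that already holds only 'Domain A' rows, which is why nothing is left for 'Domain B'.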
Do something like:
for portfolio in portfolios:
    df_temp = df.loc[df['Domain'] == portfolio].copy()
    ds = df_temp['Course1 completion']
    count = ds.value_counts()
    status = ds.value_counts(normalize=True).mul(100).round(2)
    print('Status of portfolio : ' + portfolio)
    print(status)
    print()
    print('*********************************************')
    del df_temp
The line df_temp=df.loc[df['Domain']==portfolio].copy()
creates an independent (deep) copy of the rows for the current Domain
and leaves df untouched. Each iteration therefore filters the full dataframe for the right Domain,
and the status of every Domain gets printed.
The line del df_temp
drops the temporary dataframe at the end of each iteration. It is optional, since df_temp is rebound on the next pass anyway, but it means that at the end of the for
loop no leftover dataframe remains.
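If you would rather avoid the loop altogether, one possible alternative (not what the code above does) is to build the whole percentage table in a single step with pd.crosstab. This is only a sketch: the sample dataframe is made up, and status_table is a hypothetical name.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet.
df = pd.DataFrame({
    "Domain": ["Domain A", "Domain A", "Domain A", "Domain A",
               "Domain B", "Domain B", None],
    "Course1 completion": ["Completed", "Completed", "Not completed", None,
                           "Completed", "Not reqd", "Completed"],
})

# Fill the gaps the same way as in the question, without inplace=True.
df["Domain"] = df["Domain"].fillna("No Specific Domain")
df["Course1 completion"] = df["Course1 completion"].fillna("Not Applicable")

# One row per Domain, one column per status, values are row-wise percentages.
status_table = (
    pd.crosstab(df["Domain"], df["Course1 completion"], normalize="index")
      .mul(100)
      .round(2)
)
print(status_table)
```

A table in this shape should also be convenient for the plotting part of the question: each status column could become one go.Bar trace with the Domains on the x axis, combined with barmode='group' in the layout.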
Upvotes: 1