How to filter dataframe and get the total number of records based on another column in pandas dataframe?

Question

I have collected jira data and written it into a csv file where the dataframe which I have gathered has Ticket_No, Request Type, Assignee Name and Status as the columns. What I want to achieve is count the number of In Progress, Open and Closed tickets in order to visualize it in a plotly bar graph with Assignees on the x-axis and Status on the y-axis. Is it possible? Below is the sample dataset. The plotly is not plotting anything for the x and y axis that I have given (obviously).

Note: Whereever there is a 'New' in the status there is blank in the Assignee Name

N;Issue key;Request Type;Assignee name;Status
0;JIRA_1;Bug;User1;Awaiting Review
1;JIRA_2;Improvement;User2;In Progress
2;JIRA_3;Improvement;;New
3;JIRA_4;Improvement;User3;Closed
4;JIRA_5;Improvement;User3;Awaiting Review
5;JIRA_6;Improvement;;New
6;JIRA_7;Bug;User5;Closed
7;JIRA_8;Bug;User5;Closed

import plotly.express as px
fig = px.bar(data_frame=df, x=['Request Type'], y=['Status'])

So, I want to show in the plot that.

User1: 1 ticket with status Awaiting Review

User3: 4 tickets with status 2: New, 1: Awaiting Review, 2: New status

and so on

Ibrahim Sherif · Accepted Answer

In order to achieve what you want you can't do it on the dataframe directly. You need first to count each categories before using plotly, px.bar expects a category and its count. The following code produces the plot shown below which does what you requested.

import pandas as pd
import plotly.express as px


df = pd.read_csv("data_jira.csv", delimiter=';')
dff = df.groupby(['Assignee name', 'Status'])['N'].count().reset_index(drop=0).rename(columns={'N': 'Count'})

# Change barmode to 'overlay' for stacked barchart
px.bar(data_frame=dff, x='Assignee name', y='Count', color='Status', barmode='group')

How to filter dataframe and get the total number of records based on another column in pandas dataframe?

Answers (1)

Related Questions