Reputation: 2533
I have some code that retrieves data from an internal database, then concatenates the data as a Pandas dataframe. The issue I'm running into as that there are duplicate indexes in the dataframe. Here is the full code:
from jira import JIRA
import pandas as pd
cert_path = 'C:\\cert.crt'
start_date = '2020-10-01'
end_date = '2020-10-31'
# three different instances (each with their own schema)
a_session = JIRA(server='https://jira.myinstance-A.com', options={'verify': cert_path}, kerberos=True)
b_session = JIRA(server='https://jira.myinstance-B.com', options={'verify': cert_path}, kerberos=True)
c_session = JIRA(server='https://jira.myinstance-C.com', options={'verify': cert_path}, kerberos=True)
# define Jira queries
query_1 = 'project = \"Test Project 1\" and issuetype = Incident and resolution = Resolved and updated >= {} and updated <= {}'.format(start_date, end_date)
query_2 = 'project = \"Test Project 2\" and issuetype = Incident and resolution = Resolved and updated >= {} and updated <= {}'.format(start_date, end_date)
query_3 = 'project = \"Test Project 3\" and issuetype = Defect and resolution = Resolved and releasedate >= {} and releasedate <= {}'.format(start_date, end_date)
query_4 = 'project = \"Test Project 4\" and issuetype = Enhancement and resolution = Done and completed >= {} and completed <= {}'.format(start_date, end_date)
# fetch all issues from a given session for a given query
block_size = 100
block_num = 0
def get_all_issues(session, query):
block_size = 50
block_num = 0
start = 0
all_issues = []
while True:
issues = session.search_issues(query, start, block_size)
if len(issues) == 0:
# No more issues
break
start += len(issues)
for issue in issues:
all_issues.append(issue)
issues = pd.DataFrame(issues)
for issue in all_issues:
d = {
'jira_key' : issue.key,
'issue_type' : issue.fields.type,
'creator' : issue.fields.creator,
'resolution' : issue.fields.resolution
}
fields = issue.fields # For brevity
if hasattr(fields, "custom_field_123"):
d['system_change'] = fields.custom_field_123
if hasattr(fields, "custom_field_456"):
d['system_resources'] = fields.custom_field_456
if hasattr(fields, "custom_field_789"):
d['system_backup'] = fields.custom_field_789
issues = issues.append(d, ignore_index=True)
return issues
# list of queries, and the corresponding backend
queries = [
(a_session, query_1),
(a_session, query_2),
(b_session, query_3),
(c_session, query_4),
]
# loop over each pair of session and query, calling the get_all_issues function, and save the dataframe we get each time
dataframes = []
for session, query in queries:
dataframe = get_all_issues(session, query)
dataframes.append(dataframe)
# concatenate all data frames
df_concat = pd.concat(dataframes)
# derive the business units from the project codes
df_concat['business_unit'] = np.where(df_concat['jira_key'].str.contains('MER'), 'Mercedes',
np.where(df_concat['jira_key'].str.contains('HON'), 'Honda',
np.where(df_concat['jira_key'].str.contains('AST'), 'Aston Martin', '*ERROR*')))
Here is what some sample data would look like (notice the duplicate indexes):
jira key issue_type creator resolution system_change system_resources system_backup business_unit
0 MER-361 Incident Smith, J Resolved Saturn High NaN Mercedes
1 MER-362 Enhancement Jones, T In Progress NaN Medium Not Applicable Mercedes
2 MER-363 Incident Ng, V Resolved Saturn NaN Not Applicable Mercedes
3 MER-364 Incident Jones, T Resolved NaN NaN Not Applicable Mercedes
0 AST-022 Incident Smith, J Resolved Saturn High NaN Astin Martin
1 AST-023 Incident Smith, J Resolved Saturn High NaN Astin Martin
2 AST-024 Incident Jones, T Resolved Saturn High NaN Astin Martin
0 HON-124 Incident Smith, J In Progress NaN Low NaN Honda
1 HON-125 Incident Smith, J Resolved Saturn High NaN Honda
2 HIN-126 Incident Jones, T In Progress Saturn Low NaN Honda
What is the best way to "arrange" the indexes such that they are sequential? (as follows)
jira key issue_type creator resolution system_change system_resources system_backup business_unit
0 MER-361 Incident Smith, J Resolved Saturn High NaN Mercedes
1 MER-362 Enhancement Jones, T In Progress NaN Medium Not Applicable Mercedes
2 MER-363 Incident Ng, V Resolved Saturn NaN Not Applicable Mercedes
3 MER-364 Incident Jones, T Resolved NaN NaN Not Applicable Mercedes
4 AST-022 Incident Smith, J Resolved Saturn High NaN Astin Martin
5 AST-023 Incident Smith, J Resolved Saturn High NaN Astin Martin
6 AST-024 Incident Jones, T Resolved Saturn High NaN Astin Martin
7 HON-124 Incident Smith, J In Progress NaN Low NaN Honda
8 HON-125 Incident Smith, J Resolved Saturn High NaN Honda
9 HIN-126 Incident Jones, T In Progress Saturn Low NaN Honda
Thanks in advance!
Upvotes: 0
Views: 58