eklon

Reputation: 127

Matplotlib axis showing only the values present in a pandas DataFrame index

I've been working on a backlog chart since last year, and now that the new year has started I'm facing this issue:

Bugged Chart

To keep the X axis increasing to the right, I multiplied the year by 100 and added the week number (for example, 202052 is year 2020, week 52). But now there is a blank interval on the X axis from 202052 up to about 202099.

My indexes don't contain these values, as shown below:

(Int64Index([202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202035, 202036, 202037, 202038, 202040, 202041, 202043, 202044,
             202045, 202046, 202047, 202048, 202049, 202050, 202051, 202052,
             202101, 202102],
            dtype='int64'),
 Int64Index([202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102],
            dtype='int64'),
 Int64Index([202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102],
            dtype='int64'))

How can I drop these values?

Thank you!

EDIT: Adding full code


import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime, timedelta
from matplotlib.lines import Line2D
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
from matplotlib.ticker import MaxNLocator

%matplotlib inline

df = pd.read_csv(
    "/home/eklon/Downloads/Venturus/NetSuite/Acompanhamento/130121/MelhoriasNetSuite130121.csv", delimiter=';')


df.columns = df.columns.str.replace(' ', '')    

df['CreatedDate'] = pd.to_datetime(df['CreatedDate'])
df['CompletedDate'] = pd.to_datetime(df['CompletedDate'])
df['DayCompleted'] = df['CompletedDate'].dt.dayofweek
df['DayCreated'] = df['CreatedDate'].dt.dayofweek
df['WeekCreated'] = df['CreatedDate'].dt.isocalendar().week
df['WeekCompleted'] = df['CompletedDate'].dt.isocalendar().week
df['YearCreated'] = df['CreatedDate'].dt.year
df['YearCompleted'] = df['CompletedDate'].dt.year
df['firstCompletedDate'] = df.CompletedDate - df.DayCompleted * timedelta(days=1)
df['firstCreatedDate'] = df.CreatedDate - df.DayCreated * timedelta(days=1)

df['YearWeekCreated'] = df['YearCreated']*100 + df['WeekCreated']
df['YearWeekCompleted'] = df['YearCompleted']*100 + df['WeekCompleted']


df_done = df[df['Progress'] == 'Completed']
df_open = df[df['Progress'] != 'Completed']
df_todo = df[df['BucketName'] == 'To do']
df_doing = df[df['BucketName'] == 'Doing']
df_consult = df[df['BucketName'] == 'Em andamento RSM']
df_open['Priority'].value_counts().sort_index()
df['Priority'].sort_index()

df_backlog_created = df['YearWeekCreated'].value_counts().sort_index()
df_backlog_completed = df['YearWeekCompleted'].value_counts().sort_index()
df_backlog = df_backlog_created.cumsum() - df_backlog_completed.cumsum()




#============================================================================


qtd_created = df['YearWeekCreated'].value_counts().sort_index()
idx_created = qtd_created.index
qtd_completed = df['YearWeekCompleted'].value_counts().sort_index()
idx_completed = qtd_completed.index 
qtd_backlog = df_backlog
idx_backlog = qtd_backlog.index

idx_completed = idx_completed.astype(int)


fig, ax = plt.subplots(figsize=(14,10))



#plt.figure(figsize=(14,10))
ax.plot(idx_created, list(qtd_created), label="Iniciadas", color="r")
ax.plot(idx_completed, list(qtd_completed), label="Completadas", color="y", linewidth=3)
ax.bar(idx_backlog, qtd_backlog, label="Backlog", color="b")
ax.legend(['Novas', 'Fechadas', 'Backlog'])



x=[1,2,3]
y=[9,8,7]


for a,b in zip(idx_created, qtd_created): 
    plt.text(a, b, str(b), fontsize=12, color='w', bbox=dict(facecolor='red', alpha=0.5), horizontalalignment='center')




for a,b in zip(idx_backlog, qtd_backlog): 
    plt.text(a, b, str(b), fontsize=12, color='w', bbox=dict(facecolor='blue', alpha=0.5), horizontalalignment='center')



for a,b in zip(idx_completed, qtd_completed): 
    plt.text(a, b, str(b), fontsize=12, color='black', bbox=dict(facecolor='yellow', alpha=0.5))


plt.title('Backlog', fontsize= 20)


Upvotes: 0

Views: 151

Answers (2)

Stef

Reputation: 30579

What you want to do is called index plotting (just pass the y values to plot, no x values), so you should use an IndexLocator. In the following example you set a tick every 4th row:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mt

np.random.seed(0)
idx = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202035, 202036, 202037, 202038, 202040, 202041, 202043, 202044,
             202045, 202046, 202047, 202048, 202049, 202050, 202051, 202052,
             202101, 202102]
df = pd.DataFrame(np.random.rand(len(idx)), index=idx, columns=['col1'])

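fig, ax = plt.subplots()
# Plot against row position (0, 1, 2, ...) instead of the year-week values
ax.plot(df.col1.to_numpy())
# Place a tick on every 4th row, starting at row 0
ax.xaxis.set_major_locator(mt.IndexLocator(4, 0))
# get_xticks() returns floats, so cast to int before using them as row positions
ax.xaxis.set_ticklabels(df.index[ax.get_xticks().astype(int)])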


Another possibility is to use a FuncFormatter, especially if you want to zoom your chart as it will dynamically format the autolocator ticks:

ax.xaxis.set_major_formatter(mt.FuncFormatter(lambda x,_: f'{df.index[int(x)]}' if x in range(len(df)) else ''))

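For reference, here is a minimal self-contained sketch of the FuncFormatter variant. The helper name `position_to_label`, the shortened index, and the use of `MaxNLocator(integer=True)` are my own choices for illustration, not something the approach requires:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import matplotlib.ticker as mt
import numpy as np
import pandas as pd

# Shortened year-week index that crosses the year boundary
idx = [202050, 202051, 202052, 202101, 202102]
df = pd.DataFrame({"col1": np.arange(len(idx), dtype=float)}, index=idx)

def position_to_label(x, pos=None):
    """Return the index label at row position x, or '' if x is not a valid row."""
    i = int(round(x))
    if abs(x - i) > 1e-9 or not 0 <= i < len(df):
        return ""
    return str(df.index[i])

fig, ax = plt.subplots()
ax.plot(df["col1"].to_numpy())  # x is the row position 0..len(df)-1
ax.xaxis.set_major_locator(mt.MaxNLocator(integer=True))  # ticks on whole positions only
ax.xaxis.set_major_formatter(mt.FuncFormatter(position_to_label))
```

Because the labels are looked up when the ticks are drawn, they stay correct after zooming or panning.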

Upvotes: 1

antoine

Reputation: 672

This is not a direct fix for your code, but the principle is the same. I will create some fake data to illustrate the problem and a solution.

Current empty space problem:

import numpy as np
import matplotlib.pyplot as plt

labels = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102]
y = np.random.rand(len(labels))

# old approach, will have empty space
_, ax = plt.subplots(1,1)
ax.plot(labels, y)

example plot with spaces
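The empty space comes from the numeric distance between the labels: matplotlib places each point at its numeric x value, so the jump from week 52 of 2020 to week 1 of 2021 is drawn 49 units wide instead of 1. A quick check on a shortened label list (the variable name `steps` is just for illustration):

```python
import numpy as np

# Shortened year-week labels around the year boundary
labels = np.array([202050, 202051, 202052, 202101, 202102])

# Distance between consecutive labels as matplotlib sees them on a numeric axis
steps = np.diff(labels)
print(steps.tolist())  # [1, 1, 49, 1]
```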

Suggested solution:

labels = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102]
y = np.random.rand(len(labels))

# suggested: plot against a dummy positional index and relabel the ticks
x_idx = range(len(labels))
_, ax = plt.subplots(1,1)
ax.plot(x_idx, y)
ax.set_xticks(x_idx[::5])
ax.set_xticklabels(labels[::5])

plot without empty space
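In the question the series do not all share the same index (one of them is missing 202034, for example), so each one needs to be mapped onto a common set of positions. A rough sketch of one way to do that, with made-up `created` and `completed` series standing in for the real counts:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the question's weekly counts
created = pd.Series([3, 5, 2, 4], index=[202051, 202052, 202101, 202102])
completed = pd.Series([1, 2, 6], index=[202050, 202052, 202102])

# One shared, sorted list of year-week labels, each mapped to a position 0, 1, 2, ...
all_weeks = created.index.union(completed.index)
pos = pd.Series(range(len(all_weeks)), index=all_weeks)

fig, ax = plt.subplots()
# Look up each series' labels in the shared mapping to get its x positions
ax.plot(pos.loc[created.index].to_numpy(), created.to_numpy(), label="created", color="r")
ax.plot(pos.loc[completed.index].to_numpy(), completed.to_numpy(), label="completed", color="y")
ax.set_xticks(pos.to_numpy())
ax.set_xticklabels([str(w) for w in all_weeks])
ax.legend()
```

Weeks that are missing from one series simply get no point for that series, while all series stay aligned on the same evenly spaced axis.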

Hope this works for you. Kind regards.

Upvotes: 1
