hernanavella

Reputation: 5552

How to modify pandas df in order to adjust starting point of plot for series?

I'm trying to plot a dataframe like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

A = pd.DataFrame([[1, 5, 2, 8, 2], [2, 4, 4, 20, 2], [3, 3, 1, 20, 2], [4, 2, 2, 1, 0], 
              [5, 1, 4, -5, -4], [1, 5, 2, 2, -20], [2, 4, 4, 3, 0], [3, 3, 1, -1, -1], 
              [4, 2, 2, 0, 0], [5, 1, 4, 20, -2]],
             columns=['a', 'b', 'c', 'd', 'e'],
             index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])


plt.plot(np.cumsum(A.transpose()))

It looks like this:

[image: the resulting plot]

However, I would like every line in the chart to start at 0. I tried adding another column according to this, but it didn't work. For some reason the order didn't change, and the newly created column was still plotted at the end.

A['s'] = 0
cols = list(A)
cols.insert(0, cols.pop(cols.index('s')))
A = A.loc[:, cols]
plt.plot(np.cumsum(A.transpose()))

Upvotes: 1

Views: 342

Answers (2)

NK_

Reputation: 391

You can use insert to add a new column with all 0's.

A.insert(0, '0', [0]*10)
  • The first 0 is the position of the new column, in this case the beginning of your dataframe.
  • '0' is the name of the column. As plt.plot may sort your columns by label (depending on the matplotlib version), you can either use a name that sorts before your other columns (such as '0') or reorder the labels in your plot yourself.
  • [0]*10 are the values of your new column.
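
As a minimal sketch of the insert call, assuming a small made-up frame `df` rather than the data from the question: passing the scalar 0 instead of [0]*10 lets pandas broadcast the value to every row, so the number of rows does not have to be hard-coded.

```python
import pandas as pd

# Hypothetical small frame for illustration, not the data from the question.
df = pd.DataFrame({"a": [1, 2, 3], "b": [5, 4, 3]}, index=[1, 2, 3])

# Insert a column named '0' at position 0; the scalar 0 is broadcast to all rows.
df.insert(0, "0", 0)

# Cumulative sum across the columns of each row: every series now starts at 0.
cumulative = df.cumsum(axis=1)
print(cumulative)
```

Here `cumulative` corresponds to the transposed `np.cumsum(A.transpose())` from the question, computed row-wise instead.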

Upvotes: 2

ImportanceOfBeingErnest

Reputation: 339560

Your approach is absolutely correct. The code from the question will produce the desired plot, but only in matplotlib 2.2 or later. Earlier versions of matplotlib automatically sorted the categories alphabetically before plotting, so that row s ended up last on the axis.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

A = pd.DataFrame([[1, 5, 2, 8, 2], [2, 4, 4, 20, 2], [3, 3, 1, 20, 2], [4, 2, 2, 1, 0], 
              [5, 1, 4, -5, -4], [1, 5, 2, 2, -20], [2, 4, 4, 3, 0], [3, 3, 1, -1, -1], 
              [4, 2, 2, 0, 0], [5, 1, 4, 20, -2]],
             columns=['a', 'b', 'c', 'd', 'e'],
             index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

A['s'] = 0
cols = list(A)
cols.insert(0, cols.pop(cols.index('s')))
A = A.loc[:, cols]

plt.plot(np.cumsum(A.transpose()))

plt.show()

In case you cannot use matplotlib 2.2, you may plot the values without labels and set the labels afterwards.

x = np.arange(len(A.columns))
y = np.cumsum(A.transpose()).values
plt.plot(x,y)
plt.xticks(x, A.columns)
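
The same workaround as a self-contained sketch, under a few assumptions that are not part of the answer: a made-up three-row frame `df`, the headless `Agg` backend so it runs without a display, and the object-oriented `ax` interface in place of `plt.xticks`. Because x is numeric, matplotlib never sees the string labels while plotting and so cannot reorder them.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical small frame for illustration, not the data from the question.
df = pd.DataFrame({"a": [1, 2, 3], "b": [5, 4, 3]}, index=[1, 2, 3])
df.insert(0, "s", 0)  # zero column first, so each cumulative sum starts at 0

# Numeric x positions, one per column; the labels are attached afterwards.
x = np.arange(len(df.columns))
# One row per original column, one column per plotted line.
y = df.transpose().cumsum().to_numpy()

fig, ax = plt.subplots()
lines = ax.plot(x, y)           # one line per row of the original frame
ax.set_xticks(x)
ax.set_xticklabels(df.columns)  # labels stay in column order: s, a, b
fig.canvas.draw()               # render once so the tick labels are populated
```

Since the tick positions are fixed before the labels are set, the label order follows `df.columns` regardless of the matplotlib version.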

Upvotes: 2
