tpapz

Reputation: 107

Python matplotlib scientific axis formatting

I've edited my question; I believe it is more didactic this way.

I'm plotting a chart with matplotlib and am having trouble with the formatting of the axes. I can't figure out how to force it to use the same scientific exponent everywhere: in the example below, e4 (instead of e4 on one axis and e2 on the other). I would also like the tick labels to always show two decimals. Any ideas? The documentation on this is not very extensive.

Generating random data:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

mu, sigma = 100, 15
x = mu + sigma * np.random.randn(100000)
y = x * 100 + (np.random.randn() * 100)

Calculating the linear regression:

df = pd.DataFrame({'x': x, 'y': y})
# Degree-1 least-squares fit; np.polyfit returns [slope, intercept]
slope, intercept = np.polyfit(df['x'], df['y'], 1)
df['yhat'] = df['x'] * slope + intercept

Plotting:

plt.scatter(df['x'], df['y'])  
plt.plot(df['x'], df['yhat'], color='red') 
plt.title('Scatter graph with linear regression')              
plt.xlabel('X')
plt.ylabel('Y')
plt.ticklabel_format(style='sci', scilimits=(0,0))
plt.ylim(0)
plt.xlim(0)

Please find the output here

Upvotes: 0

Views: 1796

Answers (1)

burnpanck

Reputation: 2196

As far as I can tell, matplotlib does not offer exactly these options out of the box. The documentation is indeed sparse (the Ticker API is the place to go). The Formatter classes are responsible for formatting the tick values. Of the ones offered, only ScalarFormatter (the default formatter) provides scientific formatting; however, it does not allow the exponent or the number of significant digits to be fixed. One alternative would be to use either FixedFormatter or FuncFormatter, which essentially let you choose the tick labels freely (the former can be selected indirectly using plt.gca().set_xticklabels). However, neither of them lets you choose the so-called offset_string, which is the string displayed at the end of the axis. It is customarily used for a value offset, but ScalarFormatter also uses it for the scientific multiplier.

Thus, my best solution consists of a custom formatter derived from ScalarFormatter, where the order of magnitude and the format string are not autodetected but simply fixed by the user:
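
As a rough illustration of the FuncFormatter route mentioned above: since the offset string is out of reach, the callback can divide each tick value by a fixed power of ten and write the exponent into every label instead. This is only a sketch. The names fixed_exponent_label and use_fixed_exponent_labels are made up for this example, and the exponent 4 and the two decimals are taken from the question.

```python
def fixed_exponent_label(value, pos=None, exponent=4, decimals=2):
    """Format a tick value as '<mantissa>e<exponent>' with a fixed exponent."""
    mantissa = value / 10.0 ** exponent
    return '%.*fe%d' % (decimals, mantissa, exponent)


def use_fixed_exponent_labels(axis):
    """Attach the callback to a matplotlib axis object, e.g. plt.gca().yaxis."""
    # Imported here so the formatting function above has no matplotlib dependency.
    from matplotlib.ticker import FuncFormatter
    axis.set_major_formatter(FuncFormatter(fixed_exponent_label))
```

Calling use_fixed_exponent_labels(plt.gca().yaxis) would then label a tick at 15000 as 1.50e4, at the cost of repeating the exponent on every tick.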

from matplotlib import rcParams
import matplotlib.ticker

if 'axes.formatter.useoffset' in rcParams:
    # None triggers use of the rcParams value
    useoffsetdefault = None
else:
    # None would raise an exception
    useoffsetdefault = True

class FixedScalarFormatter(matplotlib.ticker.ScalarFormatter):
    def __init__(self, format, orderOfMagnitude=0, useOffset=useoffsetdefault, useMathText=None, useLocale=None):
        super(FixedScalarFormatter, self).__init__(useOffset=useOffset, useMathText=useMathText, useLocale=useLocale)
        self.base_format = format
        self.orderOfMagnitude = orderOfMagnitude

    def _set_orderOfMagnitude(self, range=None):
        """ Set orderOfMagnitude to best describe the specified data range.

        Does nothing, which prevents the parent class from overwriting
        the fixed value.
        """
        pass

    # Assumption: newer matplotlib releases call this hook under the name
    # _set_order_of_magnitude and without arguments, so provide both spellings.
    _set_order_of_magnitude = _set_orderOfMagnitude

    def _set_format(self, vmin=None, vmax=None):
        """ Calculates the most appropriate format string for the range (vmin, vmax).

        We're actually just using a fixed format string.
        """
        self.format = self.base_format
        if self._usetex:
            self.format = '$%s$' % self.format
        elif self._useMathText:
            # Raw string, so that \m is not treated as an escape sequence
            self.format = r'$\mathdefault{%s}$' % self.format

Note that the default value of ScalarFormatter's constructor parameter useOffset changed at some point; the snippet above tries to guess which default is the right one for the installed version.

Attach this class to one or both axes of your plots as follows:

plt.gca().xaxis.set_major_formatter(FixedScalarFormatter('%.2f',4))
plt.gca().yaxis.set_major_formatter(FixedScalarFormatter('%.2f',4))
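
To get a feeling for the labels this produces, the arithmetic can be reproduced without matplotlib. The sketch below assumes that ScalarFormatter divides each tick value by 10**orderOfMagnitude and then applies the format string, while the multiplier itself goes into the offset string. The helper preview_labels is a name invented for this example; it mimics only that step and ignores offsets, locale and math text.

```python
def preview_labels(ticks, fmt='%.2f', order_of_magnitude=4):
    """Mimic the label arithmetic: scale by the fixed exponent, then format."""
    scale = 10.0 ** order_of_magnitude
    return [fmt % (tick / scale) for tick in ticks]


# y values in the question are around 1e4, so two decimals keep the ticks apart
y_labels = preview_labels([5000, 10000, 15000])   # ['0.50', '1.00', '1.50']

# x values are around 1e2; with the same exponent, nearby ticks collapse
x_labels = preview_labels([100, 120, 140])        # ['0.01', '0.01', '0.01']
```

If the x axis of the example is forced to e4 as well, it may therefore need more decimals than the y axis, for instance FixedScalarFormatter('%.4f', 4).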

Upvotes: 1
