Reputation: 61134
Background:
I'd like to slice a pandas DataFrame into chunks of a given number of rows and perform calculations on them.
pandas.DataFrame.rolling lets me do that, but seemingly only with built-in functions like sum(), as in the example df.rolling(2, win_type='triang').sum(). I would also like to plot these subsets (I can do that with slicing and some for loops, but it's a bit slow).
What I've found out:
From "How can I get the source code of a Python function?" I've learnt that I can read the source code using pandas.DataFrame.rolling??, which gives me this:
But trying to dig deeper from here, for example with rolling??, seems futile:
So, is it possible to reference the underlying functions of pandas.DataFrame.rolling somehow, or is this where it ends using Python? I suspect it is, since the docs state that pandas is written in Cython or C, but I'm really curious, so I'd like to ask about it here as well.
Thank you for any suggestions!
Upvotes: 3
Views: 5049
Reputation: 13225
Good/bad news: your suffering totally does not end there.
[side-note]
It is easy to lose track of where the source code is located on your system, especially if you use extra layers like Anaconda.
When in doubt, you can check the __file__ attribute in an interactive shell:
>>> import pandas
>>> pandas.__file__
'C:\\Users\\xy\\AppData\\Local\\Continuum\\Anaconda3\\lib\\site-packages\\pandas\\__init__.py'
[/side-note]
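As a complement to checking __file__, the standard library's inspect module can report the file and line where a Python-level function is defined. The sketch below is only an illustration: locate_source is a hypothetical helper name, and json.dumps is used as a stand-in so the snippet runs without pandas. Passing pandas.DataFrame.rolling instead should work the same way, while objects compiled from Cython or C typically raise an error because no Python source is available.

```python
import inspect
import json


def locate_source(obj):
    """Return (file path, first line number) for a Python-level object.

    Hypothetical helper for illustration; it relies only on
    inspect.getsourcefile and inspect.getsourcelines.
    """
    path = inspect.getsourcefile(obj)
    _, first_line = inspect.getsourcelines(obj)
    return path, first_line


# json.dumps is a stand-in; swap in pandas.DataFrame.rolling to find
# where that method lives in your own installation.
path, first_line = locate_source(json.dumps)
print(path, first_line)

# inspect.getsource returns the source text itself, much like `??` in IPython.
source_text = inspect.getsource(json.dumps)
print(source_text.splitlines()[0])
```

This gives the same information as `??` in IPython, but in a form you can use from a plain script.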
If you look up that actual piece of code, it comes from NDFrame in pandas/core/generic.py, and there is an import line just before it:
from pandas.core import window as rwindow

@Appender(rwindow.rolling.__doc__)
def rolling(self, window, min_periods=None, freq=None, center=False,
            win_type=None, on=None, axis=0, closed=None):
    axis = self._get_axis_number(axis)
    return rwindow.rolling(self, window=window,
                           min_periods=min_periods, freq=freq,
                           center=center, win_type=win_type,
                           on=on, axis=axis, closed=closed)
So your adventure continues in pandas/core/window.py, where rolling is somewhere at the very end:
def rolling(obj, win_type=None, **kwds):
    from pandas import Series, DataFrame
    if not isinstance(obj, (Series, DataFrame)):
        raise TypeError('invalid type: %s' % type(obj))

    if win_type is not None:
        return Window(obj, win_type=win_type, **kwds)

    return Rolling(obj, **kwds)
And all of Window, Rolling, and their parent classes (_Window, _Rolling_and_Expanding, _Rolling, which itself also derives from _Window) stretch over thousands of lines in the same file.
Upvotes: 1
Reputation: 1696
This is not an answer on how to read the source code, but on how to get your stated problem solved:
Use apply on rolling. For example, try df.rolling(2).apply(yourfunc, args=(), kwargs={}) (without win_type; as far as I know, the weighted Window objects returned when win_type is set do not offer apply).
From the docs, yourfunc "must produce a single value from an ndarray input; *args and **kwargs are passed to the function".
This is the better approach: you shouldn't copy pandas source into your own code and edit it unless you really need to (you would miss upstream bug fixes, and your copy may become outdated over time). Here, pandas already gives you a way to plug in your own function.
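A minimal sketch of that approach, under a few assumptions: value_range and iter_windows are hypothetical names chosen for illustration, the data is made up, and raw=True assumes a reasonably recent pandas so that the function receives a NumPy array. win_type is left out because, as far as I can tell, weighted windows only expose a handful of built-in aggregations. The second helper simply slices consecutive row windows, which may be enough if you want to plot each subset.

```python
import numpy as np
import pandas as pd


def value_range(window):
    """Hypothetical custom statistic: max minus min of one window."""
    return window.max() - window.min()


def iter_windows(frame, size):
    """Yield consecutive row slices of length `size` (hypothetical helper)."""
    for start in range(len(frame) - size + 1):
        yield frame.iloc[start:start + size]


# Made-up example data.
df = pd.DataFrame({"a": [1.0, 4.0, 2.0, 8.0],
                   "b": [10.0, 10.0, 13.0, 11.0]})

# raw=True passes each window to value_range as a NumPy array.
result = df.rolling(2).apply(value_range, raw=True)
print(result)

# Each element is a small DataFrame that can be plotted or processed on its own.
windows = list(iter_windows(df, 2))
print(len(windows))
```

The first row of result is NaN because a window of two rows is not yet complete there.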
Upvotes: 5
Reputation: 3026
The Pandas source code is open source and currently available on GitHub at: https://github.com/pandas-dev/pandas
You could also look here at the contributors' guide for an idea of how the code is laid out: https://pandas.pydata.org/pandas-docs/stable/contributing.html
And in the docs there are links to the code that the docs refer to (like so)
Upvotes: 3