Reputation: 351
The following is the GitHub link for Python's Pandas package.
https://github.com/pandas-dev/pandas
I would like to find the source code for a specific method (for instance, iterrows). What would be the file path for this?
Upvotes: 0
Views: 1784
Reputation: 96172
Python, in general, is easily introspectable. You can use the inspect
module if you want to do this programmatically. For example:
In [8]: import pandas as pd
In [9]: import inspect
In [10]: pd.DataFrame.iterrows
Out[10]: <function pandas.core.frame.DataFrame.iterrows(self)>
In [11]: inspect.getsourcefile(pd.DataFrame.iterrows)
Out[11]: '/Users/juan/anaconda3/envs/py38/lib/python3.8/site-packages/pandas/core/frame.py'
So you can go to pandas/core/frame.py. Note that this won't always work, for example if the method is written in C as an extension, but it should work for Python source code. In fact, you can even get the source code lines using inspect.getsourcelines, which returns a tuple of (lines, line_number):
In [12]: inspect.getsourcelines(pd.DataFrame.iterrows)
Out[12]:
([' def iterrows(self):\n',
' """\n',
' Iterate over DataFrame rows as (index, Series) pairs.\n',
'\n',
' Yields\n',
' ------\n',
' index : label or tuple of label\n',
' The index of the row. A tuple for a `MultiIndex`.\n',
' data : Series\n',
' The data of the row as a Series.\n',
'\n',
' it : generator\n',
' A generator that iterates over the rows of the frame.\n',
'\n',
' See Also\n',
' --------\n',
' itertuples : Iterate over DataFrame rows as namedtuples of the values.\n',
' items : Iterate over (column name, Series) pairs.\n',
'\n',
' Notes\n',
' -----\n',
'\n',
' 1. Because ``iterrows`` returns a Series for each row,\n',
' it does **not** preserve dtypes across the rows (dtypes are\n',
' preserved across columns for DataFrames). For example,\n',
'\n',
" >>> df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])\n",
' >>> row = next(df.iterrows())[1]\n',
' >>> row\n',
' int 1.0\n',
' float 1.5\n',
' Name: 0, dtype: float64\n',
" >>> print(row['int'].dtype)\n",
' float64\n',
" >>> print(df['int'].dtype)\n",
' int64\n',
'\n',
' To preserve dtypes while iterating over the rows, it is better\n',
' to use :meth:`itertuples` which returns namedtuples of the values\n',
' and which is generally faster than ``iterrows``.\n',
'\n',
' 2. You should **never modify** something you are iterating over.\n',
' This is not guaranteed to work in all cases. Depending on the\n',
' data types, the iterator returns a copy and not a view, and writing\n',
' to it will have no effect.\n',
' """\n',
' columns = self.columns\n',
' klass = self._constructor_sliced\n',
' for k, v in zip(self.index, self.values):\n',
' s = klass(v, index=columns, name=k)\n',
' yield k, s\n'],
860)
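To handle the C-extension caveat mentioned above, here is a minimal sketch that wraps both inspect calls and returns None when no Python source can be found. The helper name locate_source is hypothetical, and the standard-library objects json.dumps and len stand in for a pandas method so the snippet runs without pandas installed.

```python
import inspect
import json


def locate_source(obj):
    """Return (file_path, first_line_number), or None if no Python source is available.

    Hypothetical helper for illustration; not part of inspect or pandas.
    """
    try:
        path = inspect.getsourcefile(obj)
        _, first_line = inspect.getsourcelines(obj)
    except (TypeError, OSError):
        # TypeError: the object is a built-in or implemented in C.
        # OSError: the source file could not be retrieved.
        return None
    return path, first_line


print(locate_source(json.dumps))  # pure-Python function: a path and a line number
print(locate_source(len))         # built-in implemented in C: None
```

With pandas installed, calling locate_source(pd.DataFrame.iterrows) should give the same file path and line number as the two calls in the transcript above.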
You can also usually just print the function or method and work out the location from its string representation:
In [19]: pd.DataFrame.iterrows
Out[19]: <function pandas.core.frame.DataFrame.iterrows(self)>
So just from that you can see it is in pandas.core.frame.
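The same module name is available as the __module__ attribute, so the dotted path can be turned into a repository-relative file path. This is only a rough sketch: the helper guess_repo_path is hypothetical, and it assumes the module is a plain .py file rather than a package (which would live in __init__.py) or a compiled extension. A standard-library method is used here so the snippet runs without pandas.

```python
import json.decoder


def guess_repo_path(obj):
    """Turn an object's module name into a likely relative file path.

    Hypothetical helper; assumes the module is a single .py file.
    """
    module = obj.__module__  # e.g. 'json.decoder'
    return module.replace(".", "/") + ".py"


print(guess_repo_path(json.decoder.JSONDecoder.decode))  # json/decoder.py
```

For pd.DataFrame.iterrows, whose module is shown as pandas.core.frame in the output above, the same approach would give pandas/core/frame.py.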
Upvotes: 4
Reputation: 1081
This site and this one have a button with a link (source). I usually just google the method I need and add the word source.
Upvotes: 3