Reputation: 1553
I recently wrote a Python script for someone, in which I converted a pandas DataFrame's index into a list using to_list(). However, this does not work for them: with their Python interpreter they get AttributeError: 'Index' object has no attribute 'to_list'.
I did some searching and found that there is also tolist(), which seems to do the same as to_list(): searching the pandas documentation finds both, with word-for-word identical descriptions. On the other hand, the documentation of Index mentions only to_list().
So I wonder whether there is a difference between the two.
Upvotes: 24
Views: 43543
Reputation: 6763
I'm afraid pandas created some confusion here, because .to_list() is not (or is no longer) an alias in NumPy: calling it on an array raises an AttributeError. So with NumPy you should use .tolist() (and the other "to" methods) without an underscore. While I'm not denying that a to_list alias might once have worked there, it is undocumented as far back as the NumPy docs on numpy.ndarray.tolist go, i.e. v1.13 (released in 2017).
Upvotes: -1
Reputation: 21
In Python:
tolist() is a method of NumPy arrays that returns a list representation of the array.
to_list() is a method of pandas Series and Index objects (a DataFrame itself has neither method) that returns a list representation of the values.
Although both methods return the same output, they differ in origin and compatibility: to_list() is pandas-specific.
Upvotes: 1
Reputation: 19
From the answers above, all we can conclude is that to_list() is just an alias of the original tolist().
I did notice a difference, though: when converting the numpy.ndarray behind a pandas.DataFrame() created from a dict (that is, df.values), to_list throws AttributeError: 'numpy.ndarray' object has no attribute 'to_list', whereas tolist did the job.
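import pandas as pd

# Avoid naming the variable `dict`, which would shadow the built-in.
data = {'Name': ['Martha', 'Tim', 'Rob', 'Georgia'],
        'Maths': [87, 91, 97, 95],
        'Science': [83, 99, 84, 76]}
df = pd.DataFrame(data)
# df.values is a numpy.ndarray, which has no to_list(); this raises AttributeError.
df.values.to_list()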
output:
Traceback (most recent call last):
File "C:\Users\hshrima\AppData\Local\Temp/ipykernel_24028/1619690827.py", line 1, in <module>
df.values[0].to_list()
AttributeError: 'numpy.ndarray' object has no attribute 'to_list'
But if I use tolist, this is the result:
df.values.tolist()
[['Martha', 87, 83], ['Tim', 91, 99], ['Rob', 97, 84], ['Georgia', 95, 76]]
Here, tolist refers to ndarray.tolist, whereas the to_list discussed above refers to IndexOpsMixin.tolist from pandas.
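A minimal sketch of a defensive helper, assuming the goal is code that works on older pandas versions and on NumPy arrays alike: since tolist() exists on both, it is tried first, and to_list() is only a fallback. The name to_plain_list is hypothetical and not part of pandas or NumPy.

```python
def to_plain_list(obj):
    """Return a plain Python list from an array-like object.

    Hypothetical helper (not part of pandas or NumPy). It prefers
    tolist(), which numpy.ndarray and pandas Index/Series all provide,
    and only then tries to_list() or plain iteration.
    """
    for name in ("tolist", "to_list"):
        method = getattr(obj, name, None)
        if callable(method):
            return method()
    # Fall back to plain iteration for ordinary iterables.
    return list(obj)
```

With the DataFrame above, to_plain_list(df.values) and to_plain_list(df.index) would both go through tolist(), so neither depends on the to_list alias being present.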
Upvotes: 1
Reputation: 14233
If you check the source code, you will see that right after the tolist() code there is the line to_list = tolist, so to_list is just an alias for tolist.
EDIT: to_list() was added in version 0.24.0; see issue #8826.
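To illustrate what such a class-level alias does, here is a toy sketch in plain Python. MiniIndex is a hypothetical stand-in and is not pandas code; only the `to_list = tolist` line mirrors what the pandas source is described as doing.

```python
class MiniIndex:
    """Toy stand-in for a pandas Index; not pandas code."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        """Return the values as a plain Python list."""
        return list(self._values)

    # Binding a second name to the same function object creates the alias.
    to_list = tolist


idx = MiniIndex(["a", "b", "c"])
same_function = MiniIndex.to_list is MiniIndex.tolist
same_result = idx.to_list() == idx.tolist()
print(same_function, same_result)
```

Both values are True: the two names point at one function, so there is no behavioural difference to look for. A version that lacks the alias line simply has no to_list attribute, which matches the AttributeError in the question.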
Upvotes: 37
Reputation: 1553
Expanding on buran's answer (too much text to fit into a comment):
Using git blame on the pointed-out source code, one can see that to_list() was indeed added in December 2018 as an alias to tolist(). It was committed as an enhancement to resolve issue 8826, i.e., to make the name more in line with to_period() and to_timestamp().
Moreover, changes and comments in pandas/core/series.py
show that the original tolist()
is "semi-deprecated":
  # tolist is not actually deprecated, just suppressed in the __dir__
  _deprecations = generic.NDFrame._deprecations | frozenset(
      ['asobject', 'reshape', 'get_value', 'set_value',
-      'from_csv', 'valid'])
+      'from_csv', 'valid', 'tolist'])
Interesting to see how the system works...
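A rough sketch of what "suppressed in the __dir__" can mean, assuming the mechanism is a `__dir__` override that filters out a set of names. MiniSeries is a hypothetical toy class, not the pandas implementation.

```python
class MiniSeries:
    """Toy stand-in for a pandas Series; not pandas code."""

    # Names that remain callable but are left out of dir() and tab completion.
    _deprecations = frozenset(["tolist"])

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)

    to_list = tolist

    def __dir__(self):
        # Start from the default listing and drop the suppressed names.
        return sorted(set(super().__dir__()) - self._deprecations)


s = MiniSeries([1, 2, 3])
hidden = "tolist" not in dir(s)
visible = "to_list" in dir(s)
still_works = s.tolist() == [1, 2, 3]
print(hidden, visible, still_works)
```

All three values are True: the old name disappears from dir() and from tab completion, which nudges users toward to_list, while existing calls to tolist() keep working.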
Upvotes: 10
Reputation: 461
A pandas developer would be needed to answer this question definitively, but a reasonable guess is that tolist is kept to support legacy versions. As you can see from the source code here, the 'source' button for to_list leads to the same source code as tolist; the two functions are simply aliases.
Upvotes: 4