Reputation: 591
When I call describe without parentheses in my Jupyter notebook, the output is different from what describe() gives me. I would have expected an error message for the call without the parentheses.
When searching I only found articles about describe(), but nothing about describe. I feel like a fool for asking this, because I am sure this is simple, but I don't understand it yet.
The code looks like this:
import pandas as pd
file_path = '../input/data.csv'
data = pd.read_csv(file_path)
data.describe  # instead of data.describe()
The two results look like this:
Upvotes: 2
Views: 4032
Reputation: 1
data.describe() calls the method and returns the summary statistics of the dataframe.
data.describe (no parentheses) evaluates to the method object itself, that is, a reference to the method that would return the description of the dataframe if it were called. Nothing is computed.
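A minimal sketch of that difference, using a small hypothetical Table class as a stand-in for a dataframe (it is not a pandas class, and its describe method is made up for illustration):

```python
class Table:
    """Hypothetical stand-in for a dataframe; not part of pandas."""

    def __init__(self, values):
        self.values = values

    def describe(self):
        # A toy summary, only loosely inspired by what a real describe() reports.
        return {"count": len(self.values), "max": max(self.values)}


table = Table([3, 1, 4])

method = table.describe    # no parentheses: only looks the method up
result = table.describe()  # parentheses: actually runs it

print(callable(method))    # True
print(method() == result)  # True
```

Because the lookup without parentheses is perfectly valid Python, no error is raised; you simply get an object you can call later.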
Upvotes: 0
Reputation: 5955
The output tells you what's happening, but it takes some decoding. When you evaluate df.describe without calling it, you get
<bound method NDFrame.describe of <your dataframe>>
In other words, it's returning the representation of the describe method, which is bound to your dataframe object.
Though not exactly analogous, since a plain function is not bound to an object, it's similar to the following case, where we define a function in the __main__ namespace, then evaluate it both with and without parentheses:
def myfunc():
    return "hello world"
myfunc()
'hello world'
myfunc #no parentheses
<function __main__.myfunc()>
Note the second case, where it tells us it's a function, its name is myfunc, and it's "bound" to the __main__ namespace rather than to an object (not exactly, but close enough).
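To make the "not exactly" concrete, here is a small sketch in plain Python. It assumes nothing beyond the myfunc definition above, and it shows that leaving off the parentheses just hands you the function object, which has no instance attached to it:

```python
def myfunc():
    return "hello world"


same_function = myfunc  # no parentheses: a second name for the same function object

print(same_function is myfunc)      # True
print(same_function())              # hello world
print(hasattr(myfunc, "__self__"))  # False: a plain function is not bound to an instance
```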
Upvotes: 1
Reputation: 1121216
.describe is the bound method. It's bound to your dataframe, and the representation of a bound method includes the repr() output of whatever it is bound to.
You can see this at the start of the output:
<bound method NDFrame.describe of ...>
The rest is just the same string as what repr(data) produces.
Note that the Python interactive interpreter always echoes the representation of whatever the last expression produced (unless it produced None). data.describe produces the bound method; data.describe() produces whatever that method was designed to produce.
You can create the same kind of output for any bound method:
>>> class Foo:
...     def __repr__(self):
...         return "[This is an instance of the Foo class]"
...     def bar(self):
...         return "This is what Foo().bar() produces"
...
>>> Foo()
[This is an instance of the Foo class]
>>> Foo().bar
<bound method Foo.bar of [This is an instance of the Foo class]>
>>> Foo().bar()
'This is what Foo().bar() produces'
Note that Foo() has a custom __repr__ method, which is what is called to produce the representation of an instance.
You can see the same kind of output (the representation of the whole dataframe) for any method on the dataframe you don't actually call, e.g. data.sum, data.query, data.pivot, or data.__repr__.
A bound method is part of the mechanism by which Python passes in the instance as the first argument (usually named self) when you call a method. It is basically a proxy object with references to the original function (data.describe.__func__) and the instance to pass in before all other arguments (data.describe.__self__). See the descriptor HOWTO for details on how binding works.
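The two references can be inspected directly. The following is a small sketch using a throwaway Foo class (hypothetical, defined only for this illustration) rather than a dataframe; the same attributes exist on data.describe:

```python
class Foo:
    def bar(self, suffix=""):
        return "bar called" + suffix


foo = Foo()
bound = foo.bar  # attribute lookup on the instance creates the bound method

print(bound.__self__ is foo)      # True: the instance that will be passed as self
print(bound.__func__ is Foo.bar)  # True: the plain function stored on the class

# Calling the bound method is equivalent to calling the function with the instance first.
print(bound("!") == Foo.bar(foo, "!"))  # True

# The binding itself is done by the function's __get__ method (the descriptor protocol).
print(Foo.__dict__["bar"].__get__(foo, Foo) == bound)  # True
```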
If you wanted to express the __repr__ implementation of a bound method as Python code, it would be:
def __repr__(self):
    return f"<bound method {self.__func__.__qualname__} of {self.__self__!r}>"
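As a quick sanity check, the same formula can be written as a standalone helper and compared with what repr() actually returns. The helper name bound_method_repr and the Foo class here are made up for this sketch, and the comparison was written with CPython in mind:

```python
class Foo:
    def __repr__(self):
        return "[Foo instance]"

    def bar(self):
        return None


def bound_method_repr(method):
    # Hypothetical helper: rebuilds the repr from the two references a bound method holds.
    return f"<bound method {method.__func__.__qualname__} of {method.__self__!r}>"


m = Foo().bar
print(bound_method_repr(m) == repr(m))  # True
```

This is why the output for data.describe contains the entire dataframe: the instance's own repr is embedded in the method's repr.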
Upvotes: 9
Reputation: 9681
At the risk of over-simplifying:
.describe is a method which is part of the NDFrame class, and which can be called to get stats on your frame.
You use this method by calling it with parentheses: describe().
For more detail and an excellent low-level explanation, see Martijn's answer.
Upvotes: 2