U13-Forward

Reputation: 71600

Why do we need three different ways of performing an operation in pandas?

Why do we need three ways of performing the same operation?

(I use multiplication as the example.)

First way:

df['a'] * 5

Second way:

df['a'].mul(5)

Third way:

df['a'].__mul__(5)

Aren't two ways enough, without a separate mul method? I was wondering whether it could work the normal way, as it does for an integer:

First way:

3 * 5

Second way:

(3).__mul__(5)

But for a regular integer, this:

(3).mul(5)

would break, because int has no mul method.

I am just curious: why do we need so many ways in pandas? It's the same with addition, subtraction and division.

Upvotes: 2

Views: 362

Answers (3)

gmds

Reputation: 19885

* and mul do the same thing, but __mul__ is different.

* and mul perform some checks before delegating to __mul__. There are two things that you should know about.

  1. NotImplemented

There is a special singleton value, NotImplemented, that a class's __mul__ returns when it cannot handle the other operand. This tells Python to try the other operand's __rmul__ instead. If that fails too, a generic TypeError is raised. If you call __mul__ directly, you won't get this logic. Observe:

class TestClass:

    def __mul__(self, other):
        return NotImplemented

TestClass() * 1

Output:

TypeError: unsupported operand type(s) for *: 'TestClass' and 'int'

Compare that with this:

TestClass().__mul__(1)

Output:

NotImplemented

This is why, in general, you should avoid calling the dunder (magic) methods directly: you bypass certain checks that Python does.
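The fallback to __rmul__ described above can be sketched as follows. Meters and Scale are hypothetical classes made up for this illustration; they are not part of pandas or the standard library.

```python
class Meters:
    """Hypothetical left operand that only knows how to multiply by numbers."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        if isinstance(other, (int, float)):
            return Meters(self.value * other)
        return NotImplemented  # signal: "I can't handle this operand"


class Scale:
    """Hypothetical right operand that knows how to scale a Meters value."""

    def __init__(self, factor):
        self.factor = factor

    def __rmul__(self, other):
        # Reached only after Meters.__mul__ returned NotImplemented.
        return Meters(other.value * self.factor)


via_operator = Meters(2) * Scale(10)       # falls back to Scale.__rmul__
via_dunder = Meters(2).__mul__(Scale(10))  # no fallback happens

print(via_operator.value)  # 20
print(via_dunder)          # NotImplemented
```

With *, the multiplication succeeds through the right operand; with the direct dunder call, you are left holding the NotImplemented sentinel.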

  2. Derived class operator handling

When you attempt something like Base() * Derived(), where Derived inherits from Base, you would expect Base().__mul__(Derived()) to be called first. This can pose problems, since Derived is more likely to know how to handle such situations.

Therefore, when you use *, Python checks whether the right operand's type is a subclass of the left's that provides its own __rmul__, and if so, tries the right operand's __rmul__ method first.

Observe:

class Base:

    def __mul__(self, other):
        print('base mul')

class Derived(Base):

    def __rmul__(self, other):
        print('derived rmul')

Base() * Derived()

Output:

derived rmul

Notice that even though Base.__mul__ does not return NotImplemented and can clearly handle an object of type Derived, Python doesn't even look at it first; it delegates to Derived.__rmul__ immediately.
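To make the contrast with a direct dunder call explicit, here is a small variation on the classes above that returns strings instead of printing, so the results can be compared. It is only a sketch of the same idea:

```python
class Base:
    def __mul__(self, other):
        return 'base mul'


class Derived(Base):
    def __rmul__(self, other):
        return 'derived rmul'


operator_result = Base() * Derived()       # subclass __rmul__ is tried first
direct_result = Base().__mul__(Derived())  # priority rule is bypassed
same_type_result = Base() * Base()         # no subclass involved

print(operator_result)   # derived rmul
print(direct_result)     # base mul
print(same_type_result)  # base mul
```

Calling __mul__ directly skips the subclass check, so the two spellings can give different answers for the same operands.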

For completeness, there is one difference between * and mul, in the context of pandas: mul is a function, and can therefore be passed around in a variable and used independently. For example:

import pandas as pd

pandas_mul = pd.DataFrame.mul
pandas_mul(pd.DataFrame([[1]]), pd.DataFrame([[2]]))

On the other hand, this will fail with a SyntaxError, because * is an operator and not a callable:

*(pd.DataFrame([[1]]), pd.DataFrame([[2]]))
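If you do want a function form of the * operator itself, the standard library's operator.mul is one option. The sketch below assumes two small made-up DataFrames and only checks that the three spellings agree for them:

```python
import operator

import pandas as pd

left = pd.DataFrame([[1, 2]])   # made-up sample data
right = pd.DataFrame([[3, 4]])

# operator.mul(a, b) is the function form of a * b, so it goes through
# the same operator machinery as the * symbol.
product = operator.mul(left, right)

print(product.equals(left * right))     # True
print(product.equals(left.mul(right)))  # True
```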

Upvotes: 3

Sam

Reputation: 611

First off, the third way (df['a'].__mul__(5)) should generally not be used, since it's an internal method that Python calls for you when it evaluates the * operator. In general, users don't touch any of the "dunder" methods directly.

Regarding the other two ways, the first way is obvious; you just multiply the thing. It's standard math.

The second way gets a bit more interesting. One example of how I've used that method is when the function you want to apply is a variable.

For example:

def pandas_math(series, func, val):
    return getattr(series, func)(val)

pandas_math(df['a'], 'mul', 5) will give the same result as df['a'].mul(5) but now you can pass mul as a variable, or whatever other function you want to use. It's much easier than hard-coding all the symbols.
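As a usage sketch, the same helper can drive several operations from a list of method names. The DataFrame here is made-up sample data:

```python
import pandas as pd


def pandas_math(series, func, val):
    # Look up the method by name, e.g. 'mul' -> series.mul, then call it.
    return getattr(series, func)(val)


df = pd.DataFrame({'a': [1, 2, 3]})  # hypothetical sample data

results = {name: pandas_math(df['a'], name, 5).tolist()
           for name in ('add', 'sub', 'mul')}
print(results)
# {'add': [6, 7, 8], 'sub': [-4, -3, -2], 'mul': [5, 10, 15]}
```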

Upvotes: 1

Netwave

Reputation: 42746

Both the "magic method" __mul__ and the operator * are the same in the underlying Python (* essentially calls __mul__), and as you pointed out, this is the standardized way Python handles things. The other method, mul, is one you can use for mapping (with map), avoiding a lambda x, y: x * y, for example. Yes, you could still use __mul__, but methods of the form __x__ are usually not meant to be used as normal functions, and a simple mul makes the code clearer.

So, you don't really "need" it, but it is nice to have and use.
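A minimal sketch of the map use case, assuming a made-up list of Series and a matching list of factors:

```python
import pandas as pd

prices = [pd.Series([1, 2]), pd.Series([3, 4])]  # hypothetical data
factors = [10, 100]

# Pass the method object straight to map; no lambda x, y: x * y needed.
scaled = list(map(pd.Series.mul, prices, factors))

print([s.tolist() for s in scaled])  # [[10, 20], [300, 400]]
```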

Upvotes: 1
