Reputation: 4499
I can't get my head around how partial in functools works.
I have the following code from here:
>>> sum = lambda x, y : x + y
>>> sum(1, 2)
3
>>> incr = lambda y : sum(1, y)
>>> incr(2)
3
>>> def sum2(x, y):
...     return x + y
>>> incr2 = functools.partial(sum2, 1)
>>> incr2(4)
5
Now in the line
incr = lambda y : sum(1, y)
I get that whatever argument I pass to incr will be passed as y to the lambda, which will return sum(1, y), i.e. 1 + y.
I understand that. But I didn't understand this incr2(4). How does the 4 get passed as x in the partial function? To me, 4 should replace sum2. What is the relation between x and 4?
Upvotes: 362
Views: 329018
Reputation: 34290
Roughly, partial does something like this (apart from keyword args support, etc.):
def partial(func, *part_args):
    def wrapper(*extra_args):
        return func(*part_args, *extra_args)
    return wrapper
So, by calling partial(sum2, 4) you create a new function (a callable, to be precise) that behaves like sum2, but has one positional argument less. That missing argument is always substituted by 4, so that partial(sum2, 4)(2) == sum2(4, 2).
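The claimed equivalence is easy to check with functools.partial itself, using the sum2 from the question:

```python
from functools import partial

def sum2(x, y):
    return x + y

incr2 = partial(sum2, 1)   # fixes x = 1; the remaining argument is y
print(incr2(4))            # sum2(1, 4) -> 5
print(partial(sum2, 4)(2) == sum2(4, 2))  # True
```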
As for why it's needed, there's a variety of cases. Just for one, suppose you have to pass a function somewhere where it's expected to have 2 arguments:
class EventNotifier(object):
    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        ''' callback should accept two positional arguments, event and params '''
        self._listeners.append(callback)

    # ...

    def notify(self, event, *params):
        for f in self._listeners:
            f(event, params)
But a function you already have needs access to some third context object to do its job:
def log_event(context, event, params):
    context.log_event("Something happened %s, %s", event, params)
So, there are several solutions:
A custom object:
class Listener(object):
    def __init__(self, context):
        self._context = context

    def __call__(self, event, params):
        self._context.log_event("Something happened %s, %s", event, params)

notifier.add_listener(Listener(context))
Lambda:
log_listener = lambda event, params: log_event(context, event, params)
notifier.add_listener(log_listener)
With partials:
context = get_context() # whatever
notifier.add_listener(partial(log_event, context))
Of those three, partial is the shortest and the fastest. (For a more complex case you might want a custom object, though.)
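As a rough illustration of the speed claim (a sketch, not a rigorous benchmark: log_event is simplified here to just return its arguments, and absolute numbers will vary by machine):

```python
import timeit
from functools import partial

def log_event(context, event, params):
    # simplified stand-in for the real listener body
    return (context, event, params)

class Listener:
    def __init__(self, context):
        self._context = context
    def __call__(self, event, params):
        return log_event(self._context, event, params)

context = "ctx"
as_partial = partial(log_event, context)
as_lambda = lambda event, params: log_event(context, event, params)
as_object = Listener(context)

# time each variant on the same call
for name, f in [("partial", as_partial), ("lambda", as_lambda), ("object", as_object)]:
    t = timeit.timeit(lambda: f("evt", ()), number=100_000)
    print(f"{name}: {t:.3f}s")
```

All three behave identically as listeners; only the call overhead differs.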
Upvotes: 426
Reputation: 450
Just adding a use case.
functools.partial is very useful in multiprocessing, e.g. when you want to apply a function that takes three ints over a list of ints plus two constant ints:
from multiprocessing import Pool, cpu_count
from functools import partial
def run(a, b, c):
    return a + b + c
a = [1,2,3,4,5]
b = 10
c = 20
with Pool(cpu_count()) as pool:
    results = pool.map(partial(run, b=b, c=c), a)
print(results)
Out:
[31,32,33,34,35]
Equivalent without multiprocessing, using functools.partial and a for loop:
from functools import partial
def run(a, b, c):
    return a + b + c

a = [1,2,3,4,5]
b = 10
c = 20

p = partial(run, b=b, c=c)
results = []
for i in a:
    results.append(p(i))
print(results)
Equivalent without functools.partial or multiprocessing:
results = []
for i in a:
    results.append(run(i, b, c))
print(results)
The reason for doing this is that multiprocessing.Pool.map takes a function and an iterable, and calls the function on one object from the iterable at a time; so a partial can be used when your func takes multiple arguments but in your use case only the first argument varies and the rest are constants. Note: with pool.map, results preserve the order of a. pool.map_async is merely the non-blocking variant and also preserves order; if order doesn't matter, pool.imap_unordered can yield results as they complete, in arbitrary order.
In cases where len(a) is large, the multiprocessing example is more performant than the for-loop versions, as pool.map(partial(run, b=b, c=c), a) runs in parallel across cpu_count() processes.
Upvotes: 0
Reputation: 4287
Partials can be used to make new derived functions that have some input parameters pre-assigned.
To see some real-world usage of partials, refer to this really good blog post here.
A simple but neat beginner's example from the blog covers how one might use partial on re.search to make code more readable. re.search's signature is:
search(pattern, string, flags=0)
By applying partial we can create multiple versions of the regular expression search to suit our requirements, for example:
is_spaced_apart = partial(re.search, '[a-zA-Z]\s\=')
is_grouped_together = partial(re.search, '[a-zA-Z]\=')
Now is_spaced_apart and is_grouped_together are two new functions derived from re.search that have the pattern argument applied (since pattern is the first argument in re.search's signature).
The signature of these two new functions (callables) is:
is_spaced_apart(string, flags=0) # pattern '[a-zA-Z]\s\=' applied
is_grouped_together(string, flags=0) # pattern '[a-zA-Z]\=' applied
This is how you could then use these partial functions on some text:
for text in lines:
    if is_grouped_together(text):
        some_action(text)
    elif is_spaced_apart(text):
        some_other_action(text)
    else:
        some_default_action()
You can refer to the link above for a more in-depth understanding of the subject, as it covers this specific example and much more.
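A self-contained sketch of the same idea (the patterns are written here as raw strings without the redundant escape before =, and the sample strings are made up for illustration):

```python
import re
from functools import partial

# Fix re.search's first argument, pattern; the resulting callables
# still accept string (and optionally flags), like re.search itself.
is_spaced_apart = partial(re.search, r'[a-zA-Z]\s=')
is_grouped_together = partial(re.search, r'[a-zA-Z]=')

print(bool(is_spaced_apart('x =1')))     # True: letter, whitespace, '='
print(bool(is_grouped_together('x=1')))  # True: letter directly before '='
print(bool(is_spaced_apart('x=1')))      # False: no whitespace before '='
```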
Upvotes: 60
Reputation:
For those who are wondering how the partial function works, consider this implementation of a my_partial function, which has the same functionality as functools.partial:
def my_partial(func, *_args, **_kwargs):
    def wrapper(*args, **kwargs):
        return func(*(_args + args), **dict(_kwargs, **kwargs))
    return wrapper
The my_partial function takes func as an argument, along with *_args (positional arguments) and **_kwargs (keyword arguments). Using _args and _kwargs, we can fix initial arguments to my_partial along with the function.
Inside the my_partial function, we have another function, wrapper. When the outer function (my_partial) is called with a function and some optional arguments, it returns the wrapper function, which also takes positional and keyword arguments.
Since both my_partial and wrapper take arguments this way, we can pass arguments to our function twice. First, we call my_partial and pass a function with optional arguments; then my_partial returns wrapper, which also takes optional arguments.
The wrapper function combines the positional arguments that were passed to my_partial with its own arguments, then calls the function (func) that was passed as an argument to my_partial.
In wrapper, we return func(*(_args + args), **dict(_kwargs, **kwargs)). Here, _args and args are both tuples, and + is the concatenation operator, which combines the elements from _args and args. Notice that _args comes before args because _args holds the arguments passed to my_partial, and since they were passed first, they must come first. Here, dict(_kwargs, **kwargs) combines the two keyword-argument dictionaries into a single dictionary, with the keys in kwargs taking precedence.
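A quick check that the my_partial sketch matches functools.partial, using a hypothetical greet function for illustration:

```python
from functools import partial

def my_partial(func, *_args, **_kwargs):
    def wrapper(*args, **kwargs):
        # fixed positionals first, then the call's positionals;
        # the call's keywords override the fixed ones
        return func(*(_args + args), **dict(_kwargs, **kwargs))
    return wrapper

def greet(greeting, name, punctuation='!'):
    return f"{greeting}, {name}{punctuation}"

mine = my_partial(greet, 'Hello', punctuation='?')
real = partial(greet, 'Hello', punctuation='?')
print(mine('Alice'))  # Hello, Alice?
print(mine('Bob', punctuation='.') == real('Bob', punctuation='.'))  # True
```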
Often, a function might need many arguments, and suppose we have to use that function repeatedly. In that case, we can wrap the function with partial, specifying fixed arguments that we would otherwise have to pass multiple times.
from functools import partial
def multiply(a, b):
    return a * b
double = partial(multiply, 2)
print(double(10))
Output:
20
Notice that in the statement partial(multiply, 2), we are specifying that the argument a of multiply will be 2. So, when we have to double a value, we need not pass the value 2 to multiply each time.
This was indeed a simple function (multiply), but often there are situations where we have to pass many arguments to a function. In that case, we can use partial to fix the arguments we would otherwise have to pass repeatedly.
Upvotes: 0
Reputation: 23129
Adding a couple of cases from machine learning where currying with functools.partial can be quite useful:
Build multiple models on the same dataset
The following example shows how linear regression, support vector machine, and random forest regression models can be fitted on the same diabetes dataset to predict the target and compute the score.
The (partial) function classify_diabetes() is created from the function classify_data() by currying (using functools.partial()). The new function no longer requires the data to be passed, so we can straightaway pass only the model instances.
from functools import partial
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import load_diabetes
def classify_data(data, model):
    model.fit(data['data'], data['target'])
    return model.score(data['data'], data['target'])
diabetes = load_diabetes()
classify_diabetes = partial(classify_data, diabetes) # curry
for model in [LinearRegression(), SVR(), RandomForestRegressor()]:
    print(f'model {type(model).__name__}: score = {classify_diabetes(model)}')
# model LinearRegression: score = 0.5177494254132934
# model SVR: score = 0.2071794500005485
# model RandomForestRegressor: score = 0.9216794155402649
Setting up the machine learning pipeline
Here a pipeline() function is created with currying; it already uses StandardScaler() to preprocess (scale / normalize) the data prior to fitting the model, as shown in the next example:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
pipeline = partial(make_pipeline, StandardScaler()) # curry
for model in [LinearRegression(), SVR(), RandomForestRegressor()]:
    fitted = pipeline(model).fit(diabetes['data'], diabetes['target'])
    score = fitted.score(diabetes['data'], diabetes['target'])
    print(f"model {type(model).__name__}: score = {score}")
# model LinearRegression: score = 0.5177494254132934
# model SVR: score = 0.2071794500005446
# model RandomForestRegressor: score = 0.9180227193805106
Upvotes: 3
Reputation: 537
This answer is mostly example code. The answers above give good explanations of why one should use partial; here I'll give my observations and use cases.
from functools import partial
def adder(a, b, c):
    print('a:{},b:{},c:{}'.format(a, b, c))
    ans = a + b + c
    print(ans)

partial_adder = partial(adder, 1, 2)
partial_adder(3)  # now partial_adder is a callable that takes only one argument
Output of the above code should be:
a:1,b:2,c:3
6
Notice that in the above example a new callable was returned that takes the parameter (c) as its argument. Note that it is also the last argument to the function.
args = [1,2]
partial_adder = partial(adder,*args)
partial_adder(3)
Output of the above code is also:
a:1,b:2,c:3
6
Notice that * was used to unpack the non-keyword arguments, and the returned callable takes the same argument as above.
Another observation: the example below demonstrates that partial returns a callable which takes the undeclared parameter (a) as an argument.
def adder(a, b=1, c=2, d=3, e=4):
    print('a:{},b:{},c:{},d:{},e:{}'.format(a, b, c, d, e))
    ans = a + b + c + d + e
    print(ans)

partial_adder = partial(adder, b=10, c=2)
partial_adder(20)
Output of the above code should be:
a:20,b:10,c:2,d:3,e:4
39
Similarly,
kwargs = {'b':10,'c':2}
partial_adder = partial(adder,**kwargs)
partial_adder(20)
Above code prints
a:20,b:10,c:2,d:3,e:4
39
I had to use it when I was using the Pool.map_async method from the multiprocessing module. You can pass only one argument to the worker function, so I had to use partial to make my worker function look like a callable with a single input argument, while in reality it had multiple input arguments.
Upvotes: 4
Reputation: 51
It's also worth mentioning that when partial is used to "hard code" some parameters of a function, those should be the rightmost parameters:
def func(a, b):
    return a * b

prt = partial(func, b=7)
print(prt(4))
# returns 28
but if we do the same while fixing the a parameter instead,
def func(a, b):
    return a * b

prt = partial(func, a=7)
print(prt(4))
it will throw an error: "TypeError: func() got multiple values for argument 'a'".
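The error occurs only because 4 is passed positionally, so it binds to a, which the partial has already fixed. Passing the remaining parameter by keyword works fine:

```python
from functools import partial

def func(a, b):
    return a * b

prt = partial(func, a=7)
print(prt(b=4))  # 28: b given by keyword, no clash with the fixed a
# prt(4) would raise TypeError: func() got multiple values for argument 'a'
```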
Upvotes: 3
Reputation: 391
In my opinion, it's a way to implement currying in python.
from functools import partial
def add(a, b):
    return a + b

def add2number(x, y, z):
    return x + y + z

if __name__ == "__main__":
    add2 = partial(add, 2)
    print("result of add2 ", add2(1))
    add3 = partial(partial(add2number, 1), 2)
    print("result of add3", add3(1))
The results are 3 and 4.
Upvotes: 19
Reputation: 2435
Short answer: partial gives default values to the parameters of a function that would otherwise not have default values.
from functools import partial
def foo(a, b):
    return a + b

bar = partial(foo, a=1)  # a is now pre-set to 1
bar(b=10)
# 11 = 1 + 10
bar(a=101, b=10)
# 111 = 101 + 10
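One caveat: a "default" set this way can only be overridden by keyword. Passing the same parameter positionally clashes with the stored keyword argument:

```python
from functools import partial

def foo(a, b):
    return a + b

bar = partial(foo, a=1)
print(bar(b=10))         # 11
print(bar(a=101, b=10))  # 111: keyword override works
try:
    bar(101, 10)         # positional 101 also binds to a
except TypeError as e:
    print(e)             # foo() got multiple values for argument 'a'
```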
Upvotes: 101
Reputation: 70068
partials are incredibly useful.
For instance, in a 'pipe-lined' sequence of function calls (in which the returned value from one function is the argument passed to the next).
Sometimes a function in such a pipeline requires a single argument, but the function immediately upstream from it returns two values.
In this scenario, functools.partial might allow you to keep this function pipeline intact.
Here's a specific, isolated example: suppose you want to sort some data by each data point's distance from some target:
# create some data
import random as RND
fnx = lambda: RND.randint(0, 10)
data = [ (fnx(), fnx()) for c in range(10) ]
target = (2, 4)
import math
def euclid_dist(v1, v2):
    x1, y1 = v1
    x2, y2 = v2
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
To sort this data by distance from the target, what you would like to do of course is this:
data.sort(key=euclid_dist)
but you can't--the sort method's key parameter only accepts functions that take a single argument.
so re-write euclid_dist as a function taking a single parameter:
from functools import partial
p_euclid_dist = partial(euclid_dist, target)
p_euclid_dist now accepts a single argument:
>>> p_euclid_dist((3, 3))
1.4142135623730951
so now you can sort your data by passing in the partial function for the sort method's key argument:
data.sort(key=p_euclid_dist)
# verify that it works:
for p in data:
    print(round(p_euclid_dist(p), 3))
1.0
2.236
2.236
3.606
4.243
5.0
5.831
6.325
7.071
8.602
Or for instance, one of the function's arguments changes in an outer loop but is fixed during iteration in the inner loop. By using a partial, you don't have to pass in the additional parameter during iteration of the inner loop, because the modified (partial) function doesn't require it.
>>> from functools import partial
>>> def fnx(a, b, c):
...     return a + b + c
>>> fnx(3, 4, 5)
12
create a partial function (using keyword arg)
>>> pfnx = partial(fnx, a=12)
>>> pfnx(b=4, c=5)
21
you can also create a partial function with a positional argument
>>> pfnx = partial(fnx, 12)
>>> pfnx(4, 5)
21
but this will throw an error (creating the partial with a keyword argument, then calling it with positional arguments):
>>> pfnx = partial(fnx, a=12)
>>> pfnx(4, 5)
Traceback (most recent call last):
File "<pyshell#80>", line 1, in <module>
pfnx(4, 5)
TypeError: fnx() got multiple values for keyword argument 'a'
another use case: writing distributed code using Python's multiprocessing library. A pool of processes is created using the Pool method:
>>> import multiprocessing as MP
>>> # create a process pool:
>>> ppool = MP.Pool()
Pool has a map method, but the mapped function receives just one item from the iterable per call; so if your function has a longer parameter list, re-define it as a partial that fixes all but one argument:
>>> from functools import partial
>>> pfnx = partial(fnx, 10, 20)  # fix a and b; map supplies c
>>> ppool.map(pfnx, [4, 6, 7, 8])
[34, 36, 37, 38]
Upvotes: 186