Matt

Reputation: 27

Multiprocessing a for loop in Python within a function

This question is similar to two earlier questions, both titled "How to use multiprocessing in a for loop - python", but neither of them solves my problem. The function stateRecognizer() checks whether any of a series of images exists on the current screen, using a function getCoord(imgDir), and returns the corresponding state.

getCoord(key) returns a list of 4 integers, or None if the image wasn't found.

My for loop implementation

checks = {"loadingblack.png": 'loading',
          "loading.png": 'loading',
          "gear.png": 'home',
          "factory.png": 'factory',
          "bathtub.png": 'bathtub',
          "refit.png": 'refit',
          "supply.png": 'supply',
          "dock.png": 'dock',
          "spepage.png": 'spepage',
          "oquest.png": 'quest',
          "quest.png": 'quest'}

def stateRecognizer(hint=None):
    for key in checks:
        if (getCoord(key) is not None):
            return checks[key]

When I attempt to write another function and call it, it does not return the expected variable:

def stateChecker(key, value):
    if (getCoord(key) is not None):
        return value

def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as pool:
        result = pool.map(stateChecker, checks)

Outputs:

stateChecker() missing 1 required positional argument: 'value'

How do I pass in a dict to the function stateChecker?

Update 2: Thank you both @tdelaney and @Nathaniel Ford.

def stateChecker(key, value):
    if (getCoord(key) is not None):
        return value
def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, checks.items())

The function now returns [None, None, None, None, 'bathtub', None, None, None, None, None, None], but it runs around 12 times slower. I am assuming each subprocess processes the entire dict. Also, the function sometimes fails to read the JPEG image properly:

Premature end of JPEG file
Premature end of JPEG file
[None, None, None, None, None, None, None, None, None, None, None]
Elapsed time: 7.7098618000000005
Premature end of JPEG file
Premature end of JPEG file
[None, None, None, None, 'bathtub', None, None, None, None, None, None]
Elapsed time: 7.169349200000001

When I put * before checks.items() or checks:

    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, *checks)

Exception raised:

Exception has occurred: TypeError
    starmap() takes from 3 to 4 positional arguments but 13 were given

Upvotes: 0

Views: 1033

Answers (2)

Nathaniel Ford

Reputation: 21230

There is a slightly uncommon behavior in Python:

>>> dx = {"a": 1, "b": 2}
>>> [print(i) for i in dx]
a
b

Essentially, only the keys are part of the iteration here, whereas if we use items() we see:

>>> dx = {"a": 1, "b": 2}
>>> [print(i) for i in dx.items()]
('a', 1)
('b', 2)

When you call map on your pool, it effectively iterates the dict the first way. That means that, rather than passing a key-value pair into stateChecker, you are passing only the key. Hence your error, 'missing 1 required positional argument': the second value is missing.

By using starmap and items() we can get around this. As shown above, items() gives an iterable of tuples (each a key-value pair from your dictionary).

def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, checks.items())
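
For a version that can be run on its own, here is a minimal sketch. Two things in it are assumptions and not part of the original code: getCoord is replaced by a hypothetical stub (fake_get_coord) that pretends only bathtub.png is on screen, and multiprocessing.pool.ThreadPool stands in for Pool so the snippet runs without an if __name__ == "__main__" guard. ThreadPool offers the same map and starmap methods, so the argument-passing behavior is the same.

```python
from multiprocessing.pool import ThreadPool

# Shortened copy of the dict from the question.
checks = {"gear.png": "home", "bathtub.png": "bathtub", "dock.png": "dock"}

def fake_get_coord(key):
    # Hypothetical stand-in for getCoord: pretend only bathtub.png is visible.
    return [10, 20, 30, 40] if key == "bathtub.png" else None

def state_checker(key, value):
    if fake_get_coord(key) is not None:
        return value
    return None

def state_recognizer():
    with ThreadPool(3) as pool:
        # items() yields (key, value) tuples; starmap unpacks each into two arguments.
        return pool.starmap(state_checker, checks.items())

def state_recognizer_with_map():
    with ThreadPool(3) as pool:
        # map passes each key as the only argument, so state_checker raises TypeError.
        return pool.map(state_checker, checks)
```

Calling state_recognizer() gives one result per dictionary entry, in dictionary order, while state_recognizer_with_map() reproduces the "missing 1 required positional argument" error from the question.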

The "star" in starmap refers to the * unpacking operator:

>>> def f(a, b):
...   print(f"{a} is the key for {b}")
... 
>>> my_tuple = ("a", 1)
>>> f(*my_tuple)
a is the key for 1
>>> f(my_tuple)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() missing 1 required positional argument: 'b'

As you can see here, when * is used to pass values to a function, it 'unpacks' those values, slotting each value from the tuple (or list) into an argument. You can see that when we don't use the * operator, we get an error very similar to the one you originally received.

A few more notes:

When writing Python, it really is best to stick to standard naming formats. For functions, use snake case (state_checker), and for classes use CapWords (also called PascalCase). This helps you reason faster, amongst more esoteric reasons.

This function is probably misbehaving:

 def stateChecker(key, value):
     if (getCoord(key) is not None):
         return value

Assuming that getCoord returns four integers in a tuple (it's unclear in the original), or None when the image isn't found, its type signature is:

def getCoord(key: Any) -> Optional[Tuple[int, int, int, int]]:
    ...

The type signature of stateChecker, in turn, is:

def stateChecker(key: Any, value: Any) -> Optional[Any]:
    ...

The return type is Optional because, if your if clause evaluates to false, the function falls through and implicitly returns None. It's likely getCoord can be short-circuited in these cases, but without knowing more it's hard to say how. Regardless, you aren't really handling a None return value.
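
One way to handle the None entries, sketched under the assumption that the first matching state in dictionary order is the one wanted (as in the original for loop), is to pick the first non-None element of the list that starmap returns. The helper name first_state is hypothetical.

```python
from typing import List, Optional

def first_state(results: List[Optional[str]]) -> Optional[str]:
    # Return the first non-None entry, or None when no image matched.
    return next((state for state in results if state is not None), None)

print(first_state([None, None, None, None, "bathtub", None]))  # bathtub
print(first_state([None, None]))  # None
```

stateRecognizer could then return first_state(mp_pool.starmap(stateChecker, checks.items())) to get a single state back, as the loop version did.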

Upvotes: 1

tdelaney

Reputation: 77347

map calls the target function with a single parameter. Use starmap to unpack an iterated tuple into parameters for the target function. Since your function is written to process key/value pairs, you can use the dictionary's item iterator to do the job.

def stateChecker(key, value):
    if (getCoord(key) is not None):
        return value
def stateRecognizer():
    with Pool(multiprocessing.cpu_count()) as mp_pool:
        return mp_pool.starmap(stateChecker, checks.items())
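
If you would rather keep map, an alternative sketch is to have the target function accept a single (key, value) tuple and unpack it itself. The names fake_get_coord and state_checker_pair are hypothetical, the stub only pretends that bathtub.png is found, and ThreadPool is used here in place of Pool (it has the same map method) so the snippet runs without an if __name__ == "__main__" guard.

```python
from multiprocessing.pool import ThreadPool

# Shortened copy of the dict from the question.
checks = {"gear.png": "home", "bathtub.png": "bathtub"}

def fake_get_coord(key):
    # Hypothetical stand-in for getCoord: only bathtub.png is "found".
    return [0, 0, 5, 5] if key == "bathtub.png" else None

def state_checker_pair(pair):
    # map hands over one (key, value) tuple; unpack it here.
    key, value = pair
    return value if fake_get_coord(key) is not None else None

with ThreadPool(2) as pool:
    results = pool.map(state_checker_pair, checks.items())

print(results)  # [None, 'bathtub']
```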

Upvotes: 1
