alex

Reputation: 970

Shared numpy array with multithreading

I have a numpy array (a matrix) that I want to fill with calculated values asynchronously. I expect the matrix distances to end up holding the calculated values, but instead it is still filled with the default value (-1). I understand that something is wrong with how distances is shared between the workers, but I can't figure out what exactly.

import numpy as np
import concurrent.futures

data = range(1, 10)
amount = len(data)
default = -1
distances = np.full((amount, amount), default, dtype=np.float32)


def calculate_distance(i, j):
    global distances
    if i == j:
        distances[i][j] = 0
    else:
        calculated = data[i] + data[j]  # how this is calculated doesn't matter
        distances[i][j] = calculated
        distances[j][i] = calculated


with concurrent.futures.ProcessPoolExecutor() as executor:
    for i in range(0, amount):
        for j in range(i, amount):
            future = executor.submit(calculate_distance, i, j)
            result = future.result()

executor.shutdown(True)
print(distances)

Upvotes: 1

Views: 1306

Answers (1)

donkopotamus

Reputation: 23236

You are using a ProcessPoolExecutor, which runs the work in separate worker processes. These processes do not share memory with the parent; each one gets its own copy of the distances matrix.

Thus any change a worker makes to its copy is not reflected in the original process, which is why the matrix you print still contains only the default value.

Try using a ThreadPoolExecutor instead. Its threads run inside the same process, so they all write to the same array.

NOTE: Globals are generally frowned upon; pass the array into the function instead.
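Putting both suggestions together, a minimal sketch might look like the following. It assumes the same placeholder calculation as in the question; the helper name fill_distances is made up for illustration, and the array is passed to calculate_distance as an argument instead of being a global.

```python
import concurrent.futures

import numpy as np

data = range(1, 10)
amount = len(data)
default = -1


def calculate_distance(distances, i, j):
    """Fill in distances[i, j] and its mirror entry in place."""
    if i == j:
        distances[i, j] = 0
    else:
        calculated = data[i] + data[j]  # placeholder for the real calculation
        distances[i, j] = calculated
        distances[j, i] = calculated


def fill_distances():
    # Hypothetical helper: builds the matrix and fills it using a thread pool.
    distances = np.full((amount, amount), default, dtype=np.float32)
    # Threads live in the same process, so they all write to the same array.
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(calculate_distance, distances, i, j)
            for i in range(amount)
            for j in range(i, amount)
        ]
        for future in futures:
            future.result()  # re-raises any exception raised in a worker
    return distances


distances = fill_distances()
print(distances)
```

Each task writes to its own pair of cells, so no locking is needed in this particular sketch. Submitting all tasks first and collecting the results afterwards also avoids waiting on each future before the next one is submitted, as the loop in the question does.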

Upvotes: 1
