Delayed Decorator in Dask Library - results are counter productive

Question

Trying to learn how to use dask library and followed the link https://www.machinelearningplus.com/python/dask-tutorial/

Code with dask delayed decorator

import time
import dask
from dask import delayed

@delayed
def square(x):
    return x*x

@delayed
def double(x):
    return x*2

@delayed
def add(x, y):
    return x + y

@delayed
def sum(output):
    sum = 0
    for i in output:
        sum += i

    return sum

t1 = time.time()

# For loop that calls the above functions for each data
output = []
for i in range(99999):
    a = square(i)
    b = double(i)
    c = add(a, b)
    output.append(c)

total = dask.delayed(sum)(output)
print(total)

print("Elapsed time: ", time.time() - t1)

Elapsed time : ~8.46s

Normal code without any dask / decorator

import time

def square(x):
    return x*x

def double(x):
    return x*2

def add(x, y):
    return x + y

def sum(output):
    sum = 0
    for i in output:
        sum += i

    return sum

t1 = time.time()

# For loop that calls the above functions for each data
output = []
for i in range(99999):
    a = square(i)
    b = double(i)
    c = add(a, b)
    output.append(c)

total = sum(output)
print(total)

print("Elapsed time: ", time.time() - t1)

Elapsed time : ~0.043s

Both the code variants are executed on,

Windows machine

4 cores

8 logical cores

Python 3.11.0

dask version 2023.6.0

Shouldn't the code with @delayed decorator from dask perform better when compared to the other variant where functions are executed in serial order? Is it overhead in identifying the tasks to be executed in parallel or serial via task graph making it counterproductive? Was wondering if the iteration count is too minuscule to realize the benefits of dask library, tried increasing the value, and it is still the same.

Can someone please clarify it?

Delayed Decorator in Dask Library - results are counter productive

Answers (1)

Related Questions