I wanted to learn about task parallelism in Cloud Run jobs, so I created a job with the number of tasks set to 10, and I am giving each container 8 GB of memory and 8 CPUs.
I have set task parallelism to "Run as many tasks concurrently as possible".
But when I run this job, it runs the 10 tasks one after another instead of in parallel. Why is it not running the 10 tasks in parallel?
[Screenshot: Cloud Run job running 10 tasks one after another]
I have a 1.6 GB file, and below is the code I am running on Cloud Run instances to process the file and load the data into BigQuery, just for testing.
import os

import numpy as np
import pandas as pd

project_id = os.environ.get("PROJECT_ID")
# Cloud Run sets these two variables for every task of a job.
task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX"))
nb_task = int(os.environ.get("CLOUD_RUN_TASK_COUNT"))
print(nb_task)

# Every task reads the whole file, then keeps only its own slice.
df_original = pd.read_csv("gs://dataproc-input-321/train.csv")
df_len = len(df_original)
print(df_len)

batch_size = df_len // nb_task
print("--------------batch--------------")
print(f"batch size, {batch_size}!")

# iloc excludes the end row, so the end is not reduced by 1.
# The last task also takes the rows left over by the integer division.
start_row_no = batch_size * task_index
end_row_no = df_len if task_index == nb_task - 1 else batch_size * (task_index + 1)
print(f"For task_id {task_index}, start is {start_row_no} and end is {end_row_no} (exclusive)")

# Copy the slice so that adding a column does not write to a view of df_original.
df_sliced = df_original.iloc[start_row_no:end_row_no].copy()
del df_original

df_sliced["type"] = np.where(df_sliced["PRODUCT_LENGTH"] % 2 == 0, "even", "odd")
df_even = df_sliced[df_sliced["type"] == "even"]
df_odd = df_sliced[df_sliced["type"] == "odd"]

df_even.to_gbq(
    "output_dataset.amazon_even",
    project_id=project_id,
    if_exists="append",
)
df_odd.to_gbq(
    "output_dataset.amazon_odd",
    project_id=project_id,
    if_exists="append",
)
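The row split that the script is meant to perform can be sketched in isolation as follows. This is only a minimal sketch and not part of the deployed job: `task_row_range` is a hypothetical helper, the row count in the example is made up, and letting the last task take the leftover rows is an assumption about the desired behavior.

```python
def task_row_range(total_rows: int, task_index: int, task_count: int) -> tuple[int, int]:
    """Return the half-open [start, end) row range handled by one task."""
    if task_count <= 0 or not 0 <= task_index < task_count:
        raise ValueError("task_index must be in [0, task_count)")
    batch_size = total_rows // task_count
    start = batch_size * task_index
    # The last task also takes the rows left over by the integer division.
    end = total_rows if task_index == task_count - 1 else start + batch_size
    return start, end


if __name__ == "__main__":
    # Hypothetical row count, used only to show the split across 10 tasks.
    for index in range(10):
        print(index, task_row_range(1_000_003, index, 10))
```

The returned pair can be passed straight to `df.iloc[start:end]`, since `iloc` also treats the end of a slice as exclusive.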
So I think 8 GB of memory and 8 CPUs are enough to process 1.6 GB. But why is it not running everything in parallel, if each instance gets 8 GB of memory and 8 CPUs?
There is an option to limit the parallelism called "Limit the number of concurrent tasks", but when I try to enter any value greater than 0, it throws the error "Must be no higher than 0 for selected CPU and memory on the region".
I tried changing the region, but it still does not let me set the parallelism.
Can anyone explain this to me?
Thanks in advance.