bsmarcosj

Reputation: 1720

Is there any way to reduce the number of messages read per second from PubSubIO?

I have a cloud streaming pipeline that reads from PubSubIO, with its "PipelineOptions" set to "WorkerMachineType = n1-standard-1". This machine type has 3.75 GB of memory.

My problem is that if the subscription has a lot of messages, the pipeline reads them very quickly, and once it starts processing many elements it runs out of memory.

Is there any way to reduce the number of messages read per second? Or is the memory consumption related to the window duration, in which case should I reduce that duration?

Thanks in advance.

Upvotes: 0

Views: 152

Answers (1)

Tyler Akidau

Reputation: 206

It sounds like you may be trying to process too much data with too few workers. We are looking at addressing this and related scenarios, but in the meantime you may want to try dialing down the amount of data you're ingesting, or increasing the number of workers available to the job.

You'll also get better performance with n1-standard-4 machines, which is why we make those the default for the streaming runner.
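
As a rough sketch of the two suggestions above, the worker pool can be sized through pipeline options passed on the command line. The flag names below correspond to the worker pool options of the Dataflow runner (workerMachineType, numWorkers, maxNumWorkers); the numeric values are placeholders chosen for illustration, not recommendations, and should be tuned to the subscription's actual throughput.

```
--streaming=true
--workerMachineType=n1-standard-4
--numWorkers=3
--maxNumWorkers=6
```

The same settings can presumably be applied in code through the matching setters on the options object, in the same way the question already sets WorkerMachineType.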

Upvotes: 1
