Matt

Reputation: 207

Erlang consumer queue

I have a problem where I want to pull discrete chunks of data from disk into a queue, and dequeue them into another process. This data is randomly located on disk, so it would not benefit substantially from sequential reads. It's a lot of data, so I can't load it all at once, nor is it efficient to pull in one block at a time.

I'd like the consumer to be able to operate at its own speed, but to keep a healthy queue of data ready for it so that I'm not constantly waiting on disk reads as I process chunks.

Is there an established way to do this, e.g. with the jobs framework or safetyvalve? Implementing this myself feels like reinventing the wheel, as a slow consumer operating on disk data is a common problem.

Any suggestions as to how best to tackle this the Erlang way?

Upvotes: 3

Views: 198

Answers (1)

I GIVE TERRIBLE ADVICE

Reputation: 9648

You can use the {read_ahead, Size} option on file:open/2. From the documentation:

{read_ahead, Size}

This option activates read data buffering. If read/2 calls are for significantly less than Size bytes, read operations towards the operating system are still performed for blocks of Size bytes. The extra data is buffered and returned in subsequent read/2 calls, giving a performance gain since the number of operating system calls is reduced.

The read_ahead buffer is also highly utilized by the read_line/1 function in raw mode, why this option is recommended (for performance reasons) when accessing raw files using that function.

If read/2 calls are for sizes not significantly less than, or even greater than Size bytes, no performance gain can be expected.

You haven't said how large your chunks are, but experimenting with that buffer size should be a decent start toward implementing what you need.
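
As a rough illustration, here is a minimal sketch of opening a file with a read-ahead buffer and handing small chunks to a callback one at a time, so the whole file is never held in memory. The module name read_ahead_demo, the function fold_chunks/5 and its arguments are hypothetical names made up for this example; only file:open/2, file:read/2 and file:close/1 come from the standard library. The buffer only pays off when each read/2 call asks for much less than the buffer size, so for chunks at scattered offsets the gain is something to measure rather than assume.

```erlang
-module(read_ahead_demo).
-export([fold_chunks/5]).

%% Hypothetical helper: open Path with a ReadAhead-byte buffer, read it
%% in pieces of at most ChunkSize bytes, and fold Fun over the pieces.
%% Fun is called as Fun(Chunk, Acc) and must return the new accumulator.
fold_chunks(Path, ChunkSize, ReadAhead, Fun, Acc0) ->
    Modes = [read, raw, binary, {read_ahead, ReadAhead}],
    {ok, Fd} = file:open(Path, Modes),
    try
        loop(Fd, ChunkSize, Fun, Acc0)
    after
        %% Always release the descriptor, even if Fun crashes.
        file:close(Fd)
    end.

%% Read until eof. An {error, Reason} result is deliberately not
%% matched, so a read failure crashes the caller with case_clause.
loop(Fd, ChunkSize, Fun, Acc) ->
    case file:read(Fd, ChunkSize) of
        {ok, Chunk} -> loop(Fd, ChunkSize, Fun, Fun(Chunk, Acc));
        eof -> Acc
    end.
```

A call such as read_ahead_demo:fold_chunks("data.bin", 4096, 65536, fun(C, N) -> N + byte_size(C) end, 0) would total the bytes read while fetching from the operating system in 64 KiB blocks. The file name and both sizes are placeholders; 65536 is just a starting point for experiments, not a recommendation.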

Upvotes: 1
