Tazmanian Tad

Reputation: 313

Let a queue build up to a certain amount before processing

So let me give you an idea of what I'm trying to do:
I've got a program that records statistics, lots and lots of them. It records them one at a time as they happen and puts them into an ArrayList, for example:
(Please note this is a simplified example; these aren't the stats I'm actually recording.)

User clicks -> Add user_click to array
User clicks -> Add user_click to array
Key press -> Add key_press to array

After each event (click, key press, etc.), it checks the size of the ArrayList. If it is > 150, the following happens:
A new thread is created
That thread is given a copy of the ArrayList
The original ArrayList is .clear()'ed
The new thread combines similar items, so user_click would now be one item with a quantity of 2 instead of 2 items with a quantity of 1 each
The thread writes the combined data to a MySQL db

I would love to find a better approach to this, although this works just fine. The issue with thread pools and processing each event immediately is that there would be literally thousands of MySQL queries per day without combining them first.

Is there a better way to accomplish this? Is my method okay?
The other thing to keep in mind is that the thread where events are fired and recorded can't be slowed down, so I don't really want to combine items in the main thread.

If you've got code examples, that would be great; if not, just an idea of a good way to do this would be awesome as well!

For anyone interested, this project is hosted on GitHub: the main thread is here and the queue processor is here. Please forgive my poor naming conventions and general code cleanliness; I'm still (always) learning!

Upvotes: 1

Views: 64

Answers (1)

Andreas

Reputation: 159114

The logic described seems pretty good, with two adjustments:

  • Don't copy the list and clear the original. Hand the original list to the processor and create a new list for future events. This eliminates the O(n) cost of copying the entries.

  • Don't create a new thread each time. Events are delayed anyway, since you're collecting them, so timeliness of writing to the database is not your main concern. Two choices:

    • Start a single thread up front, then use a BlockingQueue to send each list from thread 1 to thread 2. If thread 2 is falling behind, the lists will simply accumulate in the queue until thread 2 can catch up, without delaying thread 1 and without overloading the system with too many threads.

    • Submit the job to a thread pool, e.g. using an Executor. This would allow multiple (but a limited number of) threads to process the lists, in case processing is slower than event generation. The disadvantage is that events may be written out of order.
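As a rough sketch of the first choice (one long-lived writer thread fed through a BlockingQueue, with the full list handed over instead of copied), something like the following could work. The class name, the String event type, the POISON shutdown sentinel and the printed "write" are all assumptions for illustration; the real code would run the MySQL update where the println is.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: all names here are hypothetical, not taken from the asker's project.
class BatchHandoffDemo {
    static final int THRESHOLD = 150;                      // the question hands off when size is > 150
    static final List<String> POISON = new ArrayList<>();  // sentinel that tells the writer to stop

    final BlockingQueue<List<String>> batches = new LinkedBlockingQueue<>();
    private List<String> current = new ArrayList<>();

    // Called only from the event thread: no copying, no clear().
    void record(String event) {
        current.add(event);
        if (current.size() > THRESHOLD) {
            batches.add(current);          // hand over the full list itself
            current = new ArrayList<>();   // start a fresh list for future events
        }
    }

    // Collapse duplicates: two "user_click" entries become user_click -> 2.
    static Map<String, Integer> combine(List<String> batch) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String event : batch) {
            counts.merge(event, 1, Integer::sum);
        }
        return counts;
    }

    // Body of the single writer thread; batches are processed in arrival order.
    void writerLoop() throws InterruptedException {
        while (true) {
            List<String> batch = batches.take();   // blocks until a batch is available
            if (batch == POISON) {
                return;
            }
            // Placeholder for the real database write.
            System.out.println("would write " + combine(batch));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BatchHandoffDemo demo = new BatchHandoffDemo();
        Thread writer = new Thread(() -> {
            try {
                demo.writerLoop();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "stats-writer");
        writer.start();

        for (int i = 0; i < 302; i++) {
            demo.record(i % 3 == 0 ? "key_press" : "user_click");
        }
        demo.batches.add(POISON);   // ask the writer to finish
        writer.join();
    }
}
```

For the second choice, the `batches.add(current)` line would become a submit to a fixed-size pool (e.g. one from `Executors.newFixedThreadPool`), at the cost of the ordering guarantee that the single writer gives you.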

For the purpose of separation of concerns and reusability, you should encapsulate the logic of collecting events, and sending them to a thread in blocks for processing, in a separate class, rather than having that logic embedded in the event-generation code.

That way you can easily add extra features, e.g. a timeout for flushing pending events before the normal threshold (150) is reached, so events don't sit there too long if event generation slows down.
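To illustrate, here is a minimal sketch of such a class with a timed flush added. Everything in it is an assumption rather than existing code: the name EventBatcher, the constructor parameters, and the `sink` callback that stands in for the combine-and-write-to-MySQL step. It uses one ScheduledExecutorService thread as both the writer and the timer, so batches are still written in order. The event thread does take a lock, but only for a list add and an occasional reference swap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Sketch only: a hypothetical batching helper, not an existing library class.
class EventBatcher<E> implements AutoCloseable {
    private final int threshold;
    private final Consumer<List<E>> sink;   // always invoked on the worker thread
    private final ScheduledExecutorService worker = Executors.newSingleThreadScheduledExecutor();
    private final Queue<List<E>> ready = new ConcurrentLinkedQueue<>();
    private final Object lock = new Object();
    private List<E> current = new ArrayList<>();

    EventBatcher(int threshold, long maxDelayMillis, Consumer<List<E>> sink) {
        this.threshold = threshold;
        this.sink = sink;
        // Timed flush, so a partial batch does not wait forever.
        // Note: if sink throws, the executor cancels further timed runs.
        worker.scheduleWithFixedDelay(this::flush, maxDelayMillis, maxDelayMillis, TimeUnit.MILLISECONDS);
    }

    // Called from the event thread.
    void record(E event) {
        boolean handOff;
        synchronized (lock) {
            current.add(event);
            handOff = current.size() > threshold;
            if (handOff) {
                swap();
            }
        }
        if (handOff) {
            worker.execute(this::drain);
        }
    }

    // Runs on the worker thread: queue whatever is pending, then write it.
    private void flush() {
        synchronized (lock) {
            if (!current.isEmpty()) {
                swap();
            }
        }
        drain();
    }

    // Caller must hold the lock, so batches are queued in the order they were filled.
    private void swap() {
        ready.add(current);
        current = new ArrayList<>();
    }

    // Runs only on the worker thread.
    private void drain() {
        List<E> batch;
        while ((batch = ready.poll()) != null) {
            sink.accept(batch);
        }
    }

    // Flush the remaining events, then stop the worker.
    @Override
    public void close() {
        worker.execute(this::flush);
        worker.shutdown();
        try {
            worker.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        try (EventBatcher<String> batcher = new EventBatcher<>(150, 5_000,
                batch -> System.out.println("would write " + batch.size() + " events"))) {
            for (int i = 0; i < 400; i++) {
                batcher.record("user_click");
            }
        }
    }
}
```

The event-generation code then only ever calls `record(...)`, and the threshold, the flush interval and the database logic can all change without touching it.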

Upvotes: 5
