Justin Grant
Justin Grant

Reputation: 46763

mechanical turk architecture for streaming an endless lists of tasks

How should we architect a solution that uses Amazon Mechanical Turk API to process a stream of tasks instead of a single batch of bulk tasks?

Here's more info:

Our app receives a stream of about 1,000 photos and videos per day. Each picture or video contains 6-8 numbers (it's the serial number of an electronic device) that need to be transcribed, along with a "certainty level" for the transcription (e.g. "Certain", "Uncertain", "Can't Read"). The transcription will take under 10 seconds per image and under 20 seconds per video and will require minimal skill or training.

Our app will get uploads of these images continuously throughout the day and we want to turn them into numbers within a few minutes. The ideal solution would be for us to upload new tasks every minute (under 20 per minute during peak periods) and download results every minute too.

Two questions:

Apologies for the newbie questions, we're new to Mechanical Turk.

Upvotes: 2

Views: 355

Answers (1)

Chris Conley
Chris Conley

Reputation: 1062

Streaming tasks one at a time to Turk

You can stream tasks individually through mechanical turk's api by using the CreateHIT operation. Every time you receive an image in your app, you can call the CreateHIT operation to immediately send the task to Turk.

You can also setup notifications through the api, so you can be alerted as soon as a task is completed. Turk Notification API Docs

Batching vs Streaming

As for batching vs streaming, you're better off streaming to achieve a good balance of turnaround time and cost. Batching won't drive down costs too much and improving accuracy is largely dependent on vetting, reviewing, and tracking worker performance either manually or implementing automated processes.

Libraries and Services

Most libraries offer all of the operations available in the api, so you can just google or search Github for a library in your programming language. (We use the Ruby library rturk)

A good list of companies that offer hosted solutions can be found under the Metaplatforms section of a answer on Quora to the question: What are some crowdsourcing services similar to Amazon Mechanical Turk? (Disclaimer: my company, Houdini is one of the solutions listed there.)

Upvotes: 1

Related Questions