Reputation: 3530
The batching guide for TensorFlow Serving suggests,
If your system is CPU-only (no GPU), then consider starting with the following values: num_batch_threads equal to the number of CPU cores; max_batch_size to infinity;
However, it's not clear what infinity means for max_batch_size.
Setting the value to zero or -1 seems to result in errors, so I'm considering setting it to 10000 just to make it far larger than any value I'm likely to try.
Still, the suggestion in the documentation that some "infinity" value exists is giving me restless sleep. How can I indicate infinity here?
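For context, here is a sketch of the batching_parameters.txt I've been experimenting with (the 10000 is just my stand-in for "infinity", and the thread count simply matches my machine's cores, per the guide's CPU-only advice):
num_batch_threads { value: 8 }
batch_timeout_micros { value: 0 }
max_batch_size { value: 10000 }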
Upvotes: 1
Views: 344
Reputation: 11631
Looking at the source code, and notably the file session_bundle_config.proto, which defines the protobuf message BatchingParameters, it does not seem possible to provide an infinity value.
message BatchingParameters {
// SharedBatchScheduler options (see shared_batch_scheduler.h for more details
// about what each field means):
//
// The maximum size of each input batch.
//
// IMPORTANT: As discussed above, use 'max_batch_size * 2' client threads to
// achieve high throughput with batching.
google.protobuf.Int64Value max_batch_size = 1;
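// ... (other fields, e.g. num_batch_threads and thread_pool_name, omitted in this excerpt)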
}
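Note that max_batch_size is not a plain int64 but the google.protobuf.Int64Value wrapper type, defined in google/protobuf/wrappers.proto:
// From google/protobuf/wrappers.proto
message Int64Value {
  // The int64 value.
  int64 value = 1;
}
Because it is a wrapper message rather than a bare scalar, an unset field is distinguishable from a field set to 0, and in the protobuf text format it is written as a nested message (max_batch_size { value: ... }) rather than as a plain scalar.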
That BatchingParameters protobuf message describes the options that can be passed via batching_parameters.txt. It is parsed by the following lines in model_servers/server.cc:
if (server_options.batching_parameters_file.empty()) {
batching_parameters->mutable_thread_pool_name()->set_value(
"model_server_batch_threads");
} else {
TF_RETURN_IF_ERROR(ParseProtoTextFile<BatchingParameters>(
server_options.batching_parameters_file, batching_parameters));
}
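For completeness, server_options.batching_parameters_file is populated from the model server's command-line flags, so a typical invocation looks something like this (paths and model name here are placeholders):
tensorflow_model_server \
  --port=8500 \
  --model_name=my_model \
  --model_base_path=/models/my_model \
  --enable_batching=true \
  --batching_parameters_file=/path/to/batching_parameters.txt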
I guess a replacement for an infinity value would be the maximum value of an int64, i.e. 2^63 - 1 (9,223,372,036,854,775,807).
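Concretely (untested, but it is just the int64 maximum written in the wrapper text syntax shown above), that would mean putting this in batching_parameters.txt:
max_batch_size { value: 9223372036854775807 }
That said, any value comfortably above your largest realistic batch, such as the 10000 you were planning, should behave equivalently, since max_batch_size only acts as an upper bound on batch size.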
Upvotes: 1