Reputation: 23
I've followed the instructions on Cloud TPU Tools. Apart from step 4, where --tpu_name has to be changed to --tpu, everything seems to work as expected.
What fails is the generation of the "Profile" tab. I ran
capture_tpu_profile --tpu_name=$TPU_NAME --logdir=${model_dir}
which produced
Welcome to the Cloud TPU Profiler v1.6.0
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
Profile session succeed for host(s):10.240.1.2
I refreshed and restarted TensorBoard multiple times, but there is no "Profile" tab, and selecting "Profile" from the dropdown menu reports that no data was generated.
Is this a known issue with the Cloud TPU profiler?
--Edit 1--
Profiler v1.5.2 failed to collect any trace events:
Welcome to the Cloud TPU Profiler v1.5.2
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
No trace event is collected. Automatically retrying.
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 2
Limiting the number of trace events to 1000000
No trace event is collected. Automatically retrying.
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 1
Limiting the number of trace events to 1000000
No trace event is collected after 3 attempt(s). Perhaps, you want to try again (with more attempts?).
Tip: increase number of attempts with --num_tracing_attempts.
Upvotes: 1
Views: 895
Reputation: 36
Can you try again using Cloud TPU Profiler 1.5.2?
pip install cloud-tpu-profiler==1.5.2
Cloud TPU Profiler 1.6.0 and its worker list feature are only supported on the current master branch of TensorFlow. It is backward compatible with tf-1.8 when invoked with the following command:
capture_tpu_profile --service_addr=10.240.1.2 --logdir=${model_dir}
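If the profiler reports success but TensorBoard still shows nothing, it may help to check whether any profile data actually landed in the log directory. The sketch below assumes that TensorBoard's profile plugin reads captures from a plugins/profile subdirectory of the logdir, and that the logdir is a local path (for a gs:// bucket you would need something like gsutil ls instead). The function name list_profile_runs is made up for illustration and is not part of the profiler.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cloud-tpu-profiler): list the capture
# sessions found under a log directory.
# Assumption: the TensorBoard profile plugin looks in <logdir>/plugins/profile.
# Assumption: <logdir> is a local directory, not a gs:// bucket.
list_profile_runs() {
  local logdir="$1"
  local profile_dir="${logdir}/plugins/profile"

  if [ ! -d "${profile_dir}" ]; then
    echo "no profile data under ${profile_dir}"
    return 1
  fi

  # Each immediate subdirectory corresponds to one capture session.
  find "${profile_dir}" -mindepth 1 -maxdepth 1 -type d | sort
}
```

Calling list_profile_runs "${model_dir}" right after a capture should print one directory per session. If it prints nothing or reports that the directory is missing, the capture did not write anywhere TensorBoard is looking, which would point at the logdir argument rather than at TensorBoard itself.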
Upvotes: 1