I'm currently exploring an energy optimization problem but find myself unsure how to proceed effectively.
To reduce the energy consumption of a memory-bound task, one idea is to lower the computational frequency (or computational capability) of the device. This approach would decrease power consumption while having a minimal effect on latency, thereby reducing overall energy usage (E = P × t).
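To make the E = P × t intuition concrete, here is a toy calculation with purely made-up numbers (a 30% power reduction against a 5% latency penalty at a lower clock):

```python
# Toy numbers (hypothetical) illustrating E = P * t for a memory-bound kernel.
P_high, t_high = 200.0, 1.00   # watts, seconds at the default SM clock
P_low,  t_low  = 140.0, 1.05   # watts, seconds at a reduced SM clock (latency barely grows)

E_high = P_high * t_high       # 200 J
E_low  = P_low  * t_low        # 147 J
print(f"Energy saving: {100 * (1 - E_low / E_high):.1f}%")  # ~26.5%
```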
Does this idea hold merit?
If so, how can we determine the optimal trade-off point? In cases where power cannot be directly measured, could specific PMU events or metrics help identify the transition point between memory-bound and compute-bound as the computational frequency decreases? For instance, in CUDA, metrics such as `smsp__average_warps_issue_stalled_long_scoreboard_per_issue_active.ratio` and `smsp__average_warps_issue_stalled_math_pipe_throttle_per_issue_active.ratio` might be relevant.
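As a concrete way to search for that trade-off point, here is a rough sketch of the sweep I have in mind. Assumptions: a discrete GPU whose driver supports clock locking via `nvidia-smi -lgc`, Nsight Compute (`ncu`) on PATH, root access, and a placeholder benchmark binary `./my_kernel`; the clock steps are made up. (On a Jetson, where the EMC clock is exposed, clocks would instead be set through sysfs or `jetson_clocks`.)

```python
# Rough sketch: sweep locked SM clocks, recording runtime, average power, and energy.
# Hypothetical names/values: ./my_kernel, the clock list (query real steps with
# `nvidia-smi -q -d SUPPORTED_CLOCKS`). Requires root and a recent driver.
import statistics
import subprocess
import time

CLOCKS_MHZ = [1800, 1500, 1200, 900, 600]   # hypothetical SM clock steps
METRICS = ",".join([
    "smsp__average_warps_issue_stalled_long_scoreboard_per_issue_active.ratio",
    "smsp__average_warps_issue_stalled_math_pipe_throttle_per_issue_active.ratio",
])

def read_power_w() -> float:
    """Read the current board power draw in watts via nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        text=True)
    return float(out.split()[0])

for clk in CLOCKS_MHZ:
    subprocess.run(["nvidia-smi", "-lgc", f"{clk},{clk}"], check=True)  # lock SM clock
    t0 = time.time()
    proc = subprocess.Popen(["./my_kernel"])          # placeholder workload
    samples = [read_power_w()]                        # sample power while it runs
    while proc.poll() is None:
        time.sleep(0.1)
        samples.append(read_power_w())
    elapsed = time.time() - t0
    avg_p = statistics.mean(samples)
    print(f"{clk} MHz: t={elapsed:.2f}s  P~{avg_p:.1f}W  E~{avg_p * elapsed:.1f}J")

    # Separate profiling pass for the stall ratios (ncu adds overhead, so it is
    # kept out of the timing/power measurement above).
    subprocess.run(["ncu", "--metrics", METRICS, "./my_kernel"], check=True)

subprocess.run(["nvidia-smi", "-rgc"], check=True)    # restore default clock behaviour
```

The idea would be to stop lowering the clock once the measured energy (or the long-scoreboard stall ratio) stops improving, i.e. where the kernel flips from memory-bound to compute-bound.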
Additionally, could a similar strategy be applied to CPUs? Conversely, for compute-bound tasks, would reducing the EMC (memory controller) frequency, and hence the memory bandwidth, be an effective approach?
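For the CPU case, the analogous sweep I am considering would cap the core frequency with `cpupower` and watch generic stall counters with `perf`; `./my_workload` and the frequency list are placeholders, and the `stalled-cycles-backend` alias is not available on every microarchitecture:

```python
# Rough sketch of the CPU analogue: cap the core frequency and watch how runtime,
# IPC, and backend stalls respond. Requires root, cpupower, and perf.
import subprocess

FREQS_GHZ = [3.5, 3.0, 2.5, 2.0, 1.5]                 # hypothetical frequency ceilings
EVENTS = "cycles,instructions,stalled-cycles-backend"

for f in FREQS_GHZ:
    subprocess.run(["cpupower", "frequency-set", "-u", f"{f}GHz"], check=True)
    # For a memory-bound region, wall-clock time stays roughly flat as the frequency
    # drops (IPC rises, backend stalls shrink); once runtime starts scaling with the
    # frequency cut, the task has become compute-bound and further slowdown wastes energy.
    subprocess.run(["perf", "stat", "-e", EVENTS, "./my_workload"], check=True)

# Afterwards, restore the original ceiling with `cpupower frequency-set -u <max>`.
```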