Reputation: 1055
In my Cortex-M4, I am using an 8 MHz oscillator as HSE, which gets multiplied to 72 MHz by the PLL, which then drives SYSCLK. This got me thinking: which clock is the one used to execute instructions? In other words, if our CPI is 1 (an ideal value, of course), does that mean we would execute 8 million instructions per second or 72 million instructions per second?
I also found the DWT, which can be used to measure clock cycles, and hence CPI. So I am guessing that whichever clock is used to execute instructions is the same one used by the DWT?
Upvotes: 3
Views: 1606
Reputation: 68013
It is driven from HCLK (not SYSCLK, which clocks the system timer and does not have to be equal to HCLK). The source of HCLK is selectable by the programmer.
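To put numbers on the question's setup, here is a minimal sketch of the arithmetic. It assumes the PLL output is selected as SYSCLK, that HCLK is SYSCLK divided by an AHB prescaler, and that the prescaler is 1; the multiplier of 9 is simply what takes 8 MHz to 72 MHz. The function names are hypothetical helpers, not part of any vendor library.

```c
#include <stdint.h>

/* Hypothetical helper: core clock (HCLK) in Hz.
   Assumes the PLL output is selected as SYSCLK. */
uint32_t hclk_hz(uint32_t hse_hz, uint32_t pll_mul, uint32_t ahb_div)
{
    uint32_t sysclk_hz = hse_hz * pll_mul; /* e.g. 8 MHz * 9 = 72 MHz */
    return sysclk_hz / ahb_div;            /* HCLK = SYSCLK / AHB prescaler */
}

/* Ideal instruction rate: one instruction every `cpi` core clock cycles.
   Real code will be slower because of wait states and branches. */
double instructions_per_second(uint32_t hclk, double cpi)
{
    return (double)hclk / cpi;
}
```

With `hclk_hz(8000000u, 9u, 1u)` and a CPI of 1 this gives 72 million instructions per second, not 8 million; with an AHB prescaler of 2 it would be 36 million.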
if our CPI is 1 (an ideal value, of course), does that mean we would execute 8 million instructions per second or 72 million instructions per second?
You can see how many cycles every instruction takes: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0439b/CHDDIGAC.html
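If you want to check the CPI of a piece of code yourself, a rough sketch of the bookkeeping could look like this. On the target, the two snapshots would be reads of the DWT cycle counter, which counts core clock cycles (`DWT->CYCCNT` in CMSIS, after setting TRCENA in `CoreDebug->DEMCR` and CYCCNTENA in `DWT->CTRL`); the register accesses are left out here so the sketch compiles on a host. The function names are hypothetical, and the instruction count is assumed to come from you, for example from the disassembly.

```c
#include <stdint.h>

/* Cycles elapsed between two snapshots of a free-running 32-bit counter.
   Unsigned subtraction stays correct across a single counter wraparound. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Average cycles per instruction over the measured region.
   `instructions` must be non-zero and is supplied by the caller. */
double average_cpi(uint32_t cycles, uint32_t instructions)
{
    return (double)cycles / (double)instructions;
}
```

The measurement itself costs a few cycles for the two counter reads, so it is only meaningful for regions that are much longer than that.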
The real speed depends on many factors, but mainly on where your code and data reside and on the advanced features of the uC.
If you execute your code from the internal TCM SRAM and place the data in SRAM (or, even better on some uCs, in the TCI and TCD SRAM), you can achieve the theoretical execution efficiency, as those memories work at the core clock frequency with no wait states or bus wait states. The ideal case is a uC that has TC memory and fetches instructions and data over separate buses.
If your code resides in FLASH memory, that memory may introduce wait states. STM uCs (with the ART accelerator) read the flash in larger chunks and fetch instructions ahead, which lets them run at almost the maximum speed. The problem is branch instructions, which require the pipeline to be flushed and the instructions to be fetched again.
Upvotes: 5