Reputation: 644
GCC offers a number of options related to instruction scheduling in the compiler flags. An overview of what this means is on the GCC Wiki for "Instruction Scheduling", but this is well out of date (last updated in 2008). I understand the general idea: allow the compiler to re-order instructions to avoid stalls / achieve better CPU resource usage. The details on the knobs to control this are severely lacking and highly technical, though.
Here's the options that (I think) relate to instruction scheduling, from the current GCC Optimization Options:
-fmodulo-sched
-fmodulo-sched-allow-regmoves
, -freschedule-modulo-scheduled-loops
-fschedule-insns
, -fschedule-insns2
-fsched-pressure
-fsched-spec-load
, -fsched-spec-load-dangerous
, -f[no-]sched-spec
-fsched2-use-superblocks
-f[no-]sched-interblock
, -fsched-stalled-insns
, -fsched-stalled-insns-dep
, -fsched-*-heuristic
-fselective-scheduling
, -fselective-scheduling2
-fsel-sched-pipelining
, -fsel-sched-pipelining-outer-loops
-frename-registers
, -fschedule-fusion
, -param sched-pressure-algorithm=?
And, of course, the multiple optimization level (-O2
, -O3
, -Os
etc) also enable some of these by default.
Time for questions.
Upvotes: 1
Views: 244
Reputation: 622
There are two instruction scheduling passes in GCC. One performs before the register allocation (RA) pass, and the other performs after. Their corresponding flags are -fschedule-insns
and -fschedule-insns2
.
(Update 2024-11-14)
From the comment in gcc/sched-rgn.cc:
This pass implements list scheduling within basic blocks. It is run twice: (1) after flow analysis, but before register allocation, and (2) after register allocation.
(The reason for doing instruction scheduling twice is that, instruction passes tries to utilize all hardware resource as much as possible. But the optimization might cause register spill when doing RA. If RA is performed before instruction scheduling, there might have too less room for it. This is one of the the phase-ordering problem in compiler. The compilers I know (GCC, LLVM) performs instruction scheduling twice to mitigate the issue.)
This section tries to answer "What is the difference between these?".
There are different instruction scheduling algorithms:
Each scheduler has parameters to control its behavior, for example -fmodulo-sched-allow-regmoves
tries to do aggressive register movings when performing modulo scheduling. To understand all the flags, one might need to understand the algorithms first and trace the source code for those uses if necessary.
This section tris to answer "What are the limits of enabling these". But since there are no clear documents, most of the following are guessing :p.
I guess the first 3 scheduler (block, interblock and selective) are mutual exclusive. Since they are doing similar things with different heuristics. Options like -fselective-scheduling
, -fselective-scheduling2
, -fno-sched-interblock
are used to control which scheduler to be used in the first or second instruction scheduling pass. (And those options seem to mutually exclusive with each others.)
(Update 2024-11-14) From the source code in gcc/sched-rgn.cc, these 3 are mutual exclusive:
/* Run second scheduling pass after reload. */
static unsigned int
rest_of_handle_sched2 (void)
{
#ifdef INSN_SCHEDULING
if (flag_selective_scheduling2
&& ! maybe_skip_selective_scheduling ())
run_selective_scheduling ();
else
{
/* Do control and data sched analysis again,
|and write some more of the results to dump file. */
if (flag_sched2_use_superblocks)
schedule_ebbs ();
else
schedule_insns ();
}
#endif
I guess the the default scheduler for the first pass (before RA) is the interblock scheduler, since there is an option -fno-sched-interblock
to not to use interblock scheduler.
I guess the default scheduler for the second pass (after RA) is the block scheduler , since we have -fsched2-use-superblocks
option.
The modulo scheduler is targeting inner loops so can be performed along with other schedulers. But this is not enabled by default, have to use -fmodulo-sched
to enable it.
"Where can I find more documentation about this?"
Sadly I found nothing nigher. InstructionScheduling provides a quick overview but it is out of date. Optimization options provides some hints about the schedulers, but not so clear.
I tried to search the web, read the comments and codes in the GCC source tree, and use git blame to find more informations. Also reading compiler books to have some background knowledge for instruction scheduling. This answer is all I have ...
gcc/haifa-sched.cc
: contains shared codes between schedulers.
(Although InstructionScheduling doc says the block scheduler is implemented in haifa-sched.cc. But I did some archeology and there seems some refactors had been performed, like this commit and this commit.)
gcc/sched-rgn.cc
: the entry point of instruction scheduling pass.
gcc/sched-ebb.cc
: codes of superblock (ebb) scheduler.
gcc/sel-sched*
: code of selective scheduler.
gcc/modulo-sched.cc
: code of modulo scheduler.
Upvotes: 1