Reputation: 167
Does GCC, when compiling for x86 or x86_64, make sure that the Direction Flag has some specific value before it starts executing an Extended Asm block? I couldn't find any information on that in the GCC documentation and there is no input operand of the extended asm block that could specify this.
When using some instructions with the rep prefix, the Direction Flag (DF) determines whether the %esi and %edi registers are incremented or decremented. When we don't know the state of DF, we must make sure that it has the wanted direction by executing cld/std:
// Send 54 bytes from 0x1234 to 0x1234+53 to port 0x1234.
copy: cld
movl $0x1234, %esi
movw $0x1234, %dx
movl $54, %ecx
rep outsb
When used in C inline assembly with GCC, we would write something like this:
uint16_t io = 0x1234;
char const *buf = …;
size_t len = …;
asm volatile("cld\n\trep outsb"
             : "+c"(len), "+S" (buf)
             : "d" (io)
             : "memory");
The code above assumes that GCC leaves DF in an undefined state before executing the inline assembly. In all the code I've written, DF was clear, and therefore the cld instruction was not required:
uint16_t io = 0x1234;
char const *buf = …;
size_t len = …;
asm volatile("rep outsb"
             : "+c"(len), "+S" (buf)
             : "d" (io)
             : "memory");
This code also works. But I want to know whether it works just by coincidence (because most code keeps DF cleared), or whether GCC ensures that DF is not set before executing the inline assembly.
(P.S. The "memory" clobber could be replaced with a dummy "m" memory-source operand. See "How can I indicate that the memory *pointed* to by an inline ASM argument may be used?". The count and pointer inputs need to be read/write or have a matching constraint with dummy variables so the compiler knows RCX and RSI don't keep their values. volatile is necessary because writing an I/O port is a side-effect on something other than memory.)
Upvotes: 3
Views: 51
Reputation: 167
Not a definitive answer, but this shows that it's not even safe to leave DF set after leaving the inline assembly statement. I would therefore assume that GCC relies on DF being cleared at all times.
#include <stdbool.h>
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

/* Read RFLAGS by pushing it onto the stack and popping it into a register. */
__attribute__((always_inline))
static inline uint64_t eflags(void)
{
    uint64_t fl;
    asm volatile ("pushfq\n\tpopq %0" : "=r" (fl));
    return fl;
}

/* DF is bit 10 of RFLAGS. */
__attribute__((always_inline))
static inline bool df(void)
{
    return eflags() & (1 << 10);
}

int main(void)
{
    int a[50000] = { 1, 2, 3, 4, 5 };
    int b[50000];
    a[40000] = getpid();  /* value not known at compile time */
    printf("#1 DF = %c\n", df() ? '1' : '0');
#ifdef UWU
    asm volatile("std");  /* set DF and deliberately leave it set */
#endif
    printf("#2 DF = %c\n", df() ? '1' : '0');
    for (size_t pos = 0; pos < 50000; ++pos) {
        b[pos] = a[pos];
    }
    printf("#3 DF = %c\n", df() ? '1' : '0');
    printf(" %d\n", b[1]);
    printf(" %d\n", b[40000]);
}
Output for different compiler settings:
$ gcc a.c
$ ./a.out
#1 DF = 0
#2 DF = 0
#3 DF = 0
2
11510
$ gcc a.c -DUWU
$ ./a.out
#1 DF = 0
#2 DF = 1
#3 DF = 1
2
11516
$ gcc a.c -O1
$ ./a.out
#1 DF = 0
#2 DF = 0
#3 DF = 0
2
11707
$ gcc a.c -DUWU -O1
$ ./a.out
#1 DF = 0
#2 DF = 1
#3 DF = 1
2
11716
$ gcc a.c -O2
$ ./a.out
#1 DF = 0
#2 DF = 0
#3 DF = 0
2
11725
$ gcc a.c -DUWU -O2
$ ./a.out
#1 DF = 0
#2 DF = 1
#3 DF = 1
2
0
Upvotes: 2