Reputation: 31271
I need to access unaligned values using the GCC vector extension.
The program below crashes when built with either Clang or GCC:
typedef int __attribute__((vector_size(16))) int4;
typedef int __attribute__((vector_size(16),aligned(4))) *int4p;
int main()
{
    int v[64] __attribute__((aligned(16))) = {};
    int4p ptr = reinterpret_cast<int4p>(&v[7]);
    int4 val = *ptr;
}
However, if I change
typedef int __attribute__((vector_size(16),aligned(4))) *int4p;
to
typedef int __attribute__((vector_size(16),aligned(4))) int4u;
typedef int4u *int4up;
the generated assembly is correct (it uses an unaligned load) with both Clang and GCC.
What is wrong with the single definition, or what am I missing? Can it be the same bug in both Clang and GCC?
Note: it happens in both clang and gcc
Upvotes: 3
Views: 1534
Reputation: 13467
You've altered the alignment of the pointer type itself, not the pointee type. This has nothing to do with the vector_size attribute and everything to do with the aligned attribute. It's also not a bug, and it's implemented correctly in both GCC and Clang.
From the GCC documentation, § 6.33.1 Common Type Attributes (emphasis added):
aligned (alignment)
This attribute specifies a minimum alignment (in bytes) for variables of the specified type. [...]
The type in question is the type being declared, not the type pointed to by the type being declared. Therefore,
typedef int __attribute__((vector_size(16),aligned(4))) *int4p;
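declares a new type T that points to objects of type *T, where T is the pointer type that receives the aligned(4) attribute, and *T is a 16-byte vector of int that keeps its default alignment.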
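To make the distinction concrete, here is a minimal sketch that checks sizes and alignments at compile time. It assumes GCC or Clang targeting x86-64, where a 16-byte vector is normally 16-byte aligned. The helper unaligned_vector_alignment is hypothetical and exists only so the value can be inspected at run time.

```cpp
// Types from the question; assumes GCC or Clang on x86-64.
typedef int __attribute__((vector_size(16))) int4;

// Single declaration: aligned(4) ends up on the pointer type.
typedef int __attribute__((vector_size(16), aligned(4))) *int4p;

// Chained declaration: aligned(4) ends up on the vector type.
typedef int __attribute__((vector_size(16), aligned(4))) int4u;
typedef int4u *int4up;

// int4p is still an ordinary pointer, and it points to a 16-byte vector.
static_assert(sizeof(int4p) == sizeof(void *), "int4p is a pointer");
static_assert(sizeof(*static_cast<int4p>(nullptr)) == 16, "pointee is 16 bytes");

// The plain vector type has its natural alignment (assumed 16 on x86-64).
static_assert(alignof(int4) == 16, "natural vector alignment");

// The chained typedef lowers the alignment of the vector itself.
static_assert(sizeof(int4u) == 16, "still a 16-byte vector");
static_assert(alignof(int4u) == 4, "alignment lowered to 4");

// Hypothetical helper, not from the original post.
unsigned unaligned_vector_alignment()
{
    return alignof(int4u);
}
```

Only int4u carries the relaxed alignment, so only a pointer to int4u tells the compiler that an unaligned load is required.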
Meanwhile, § 6.49 Using Vector Instructions through Built-in Functions says (emphasis added):
On some targets, the instruction set contains SIMD vector instructions which operate on multiple values contained in one large register at the same time. For example, on the x86 the MMX, 3DNow! and SSE extensions can be used this way.
The first step in using these extensions is to provide the necessary data types. This should be done using an appropriate typedef:
typedef int v4si __attribute__ ((vector_size (16)));
The int type specifies the base type, while the attribute specifies the vector size for the variable, measured in bytes. For example, the declaration above causes the compiler to set the mode for the v4si type to be 16 bytes wide and divided into int sized units. For a 32-bit int this means a vector of 4 units of 4 bytes, and the corresponding mode of foo is V4SI.
The vector_size attribute is only applicable to integral and float scalars, although arrays, pointers, and function return values are allowed in conjunction with this construct. Only sizes that are a power of two are currently allowed.
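As a small illustration of the v4si type from the quoted documentation, the sketch below adds two vectors element-wise and sums the lanes. It assumes GCC or Clang with the vector extension available. The function name sum_of_lanes and the values are made up for the example.

```cpp
// The typedef from the quoted documentation: four ints in a 16-byte vector.
typedef int v4si __attribute__((vector_size(16)));

// Hypothetical helper, not from the original post.
int sum_of_lanes()
{
    v4si a = {1, 2, 3, 4};
    v4si b = {10, 20, 30, 40};
    v4si c = a + b;                     // element-wise addition: {11, 22, 33, 44}
    return c[0] + c[1] + c[2] + c[3];   // vector elements can be subscripted
}
```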
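The following program demonstrates that the aligned attribute applies to the pointer itself:
#include <stdio.h>
/* aligned(128) is attached to the pointer type, not to int */
typedef int __attribute__((aligned(128))) * batcrazyptr;
struct batcrazystruct{
    batcrazyptr ptr;
};
int main()
{
    printf("Ptr: %zu\n", sizeof(batcrazyptr));
    printf("Struct: %zu\n", sizeof(batcrazystruct));
}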
Output:
Ptr: 8
Struct: 128
Which is consistent with batcrazyptr ptr itself having its alignment requirement changed, not its pointee, and in agreement with the documentation.
I'm afraid you'll be forced to use a chain of typedefs, as you have done with int4u. It would be unreasonable to have a separate attribute to specify the alignment of each pointer level in a typedef.
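Putting the chained typedefs to work, here is a minimal sketch of the load from the question. It assumes GCC or Clang on a target with 16-byte vector support such as x86-64. The helpers load4 and sum_from_offset_7 are hypothetical names introduced for illustration, not part of the original post.

```cpp
// Naturally aligned vector type, as in the question.
typedef int __attribute__((vector_size(16))) int4;

// Step 1: declare the under-aligned vector type.
typedef int __attribute__((vector_size(16), aligned(4))) int4u;
// Step 2: declare a plain pointer to it.
typedef int4u *int4up;

// Hypothetical helper: loads four consecutive ints starting at p.
// p only needs the 4-byte alignment of int.
int4 load4(int *p)
{
    int4up ptr = reinterpret_cast<int4up>(p);
    return *ptr;   // dereferences a 4-byte-aligned vector type
}

// Hypothetical helper: reads v[7]..v[10] from a 16-byte-aligned array.
int sum_from_offset_7()
{
    int v[64] __attribute__((aligned(16))) = {};
    for (int i = 0; i < 64; ++i)
        v[i] = i;
    int4 val = load4(&v[7]);   // &v[7] is 28 bytes past a 16-byte boundary
    return val[0] + val[1] + val[2] + val[3];
}
```

Because the relaxed alignment belongs to the pointee type here, the dereference no longer assumes a 16-byte boundary.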
Upvotes: 8