Jan Tušil

Reputation: 978

x86-64 System V ABI - argument classification for parameter passing

Section 3.2.3 of the x86_64 System V ABI specifies which arguments of a function call go into which registers and which are pushed on the stack. I have trouble understanding the algorithm for the classification of aggregates, which says (the highlighting is mine):

The classification of aggregate (structures and arrays) and union types works as follows:

  1. If the size of an object is larger than eight eightbytes, or it contains unaligned fields, it has class MEMORY.
  2. If a C++ object is non-trivial for the purpose of calls, as specified in the C++ ABI [13], it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER).
  3. If the size of the aggregate exceeds a single eightbyte, each is classified separately. Each eightbyte gets initialized to class NO_CLASS.
  4. Each field of an object is classified recursively so that always two fields are considered. The resulting class is calculated according to the classes of the fields in the eightbyte:
     (a) If both classes are equal, this is the resulting class.
     (b) If one of the classes is NO_CLASS, the resulting class is the other class.
     (c) If one of the classes is MEMORY, the result is the MEMORY class.
     (d) If one of the classes is INTEGER, the result is the INTEGER.
     (e) If one of the classes is X87, X87UP, COMPLEX_X87 class, MEMORY is used as class.
     (f) Otherwise class SSE is used.
  5. Then a post merger cleanup is done:
     (a) If one of the classes is MEMORY, the whole argument is passed in memory.
     (b) If X87UP is not preceded by X87, the whole argument is passed in memory.
     (c) If the size of the aggregate exceeds two eightbytes and the first eightbyte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.
     (d) If SSEUP is not preceded by SSE or SSEUP, it is converted to SSE.

I do not understand points (3), (4), and (5). Specifically, I have the following questions:

Q1. In point (3), by "each is classified separately", do the authors mean "each eightbyte"? If so, then I would expect what follows to be an explanation of how eightbytes are classified.

Q2. In point (4), by "Each field of an object", do they mean "each field of an eightbyte that is a result of (separation in) point (3)"?

Q3. In point (4), by "two fields" in "always two fields are considered", do they mean two successive fields?

Q4. In point (4), by "the resulting class", do they mean the class of the object, or of the eightbyte, or of the second considered field, or of something else? In the last case, where is the resulting class used? Does this mean that the algorithm keeps the class of the first field as is, and then iteratively computes the class of the next field, until we have the classes of all fields in the eightbyte? Or does it mean that the algorithm processes two fields at once?

Q5. In point (4), what if there is only one field? Or an even number of fields?

Q6. In point (5), "one of the classes" of a field, or of an eightbyte?

If someone could provide something more formal/precise - e.g., a pseudo-code or a flow-chart - that would be ideal.

Upvotes: 5

Views: 1646

Answers (2)

yyny

Reputation: 1736

To further clarify the relation between Point 3 and Point 4: Arguments and return values are passed as one or more eightbytes, and the ABI specifies how to classify these eightbytes, not the fields. However, aggregate types can contain multiple fields in a single eightbyte, or a single field spanning multiple eightbytes, and Point 4 specifies how to handle these cases.

Some examples:

struct big {
    __float128 value;
};

Here, the aggregate type is classified { SSE, SSEUP }, even though there is just a single field, because the aggregate type requires two eightbytes.

struct vec {
    float x;
    float y;
};

Here, the aggregate type is classified { SSE }. Even though there are multiple fields, there is only one classification, because the aggregate type fits in just a single eightbyte.

struct mixed {
    int x;
    float y;
};

Here, the aggregate type is classified { INTEGER }. The reason for this is Point 4d:

If one of the classes is INTEGER, the result is the INTEGER.

struct mixed {
    int x;
    int y;
    float z;
};

Here, the aggregate type is classified { INTEGER, SSE }. Note that unlike the previous example, the eightbyte containing the float field does not become INTEGER, because that eightbyte no longer contains any INTEGER fields.
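The last three examples can be reproduced with a minimal sketch. It is not the ABI's or GCC's code: the names field_desc, merge_small and classify_small are hypothetical, and it assumes an aggregate of at most 16 bytes whose fields are naturally aligned scalars of at most 8 bytes (so no field straddles an eightbyte and only INTEGER and SSE occur). Each field is merged into the eightbyte that contains its offset:

```c
#include <assert.h>
#include <stddef.h>

/* Only the classes needed for the int/float examples. */
enum cls { NO_CLASS, INTEGER, SSE };

/* Hypothetical description of one scalar field. */
struct field_desc {
    size_t offset;   /* byte offset within the aggregate */
    enum cls cls;    /* class of the scalar: INTEGER or SSE */
};

/* Reduced merge: equal classes stay, NO_CLASS yields, INTEGER beats SSE. */
static enum cls merge_small(enum cls a, enum cls b)
{
    if (a == b) return a;
    if (a == NO_CLASS) return b;
    if (b == NO_CLASS) return a;
    return INTEGER;
}

/* Fills out[] with one class per eightbyte and returns the eightbyte count. */
size_t classify_small(const struct field_desc *fields, size_t nfields,
                      size_t size, enum cls out[2])
{
    assert(size > 0 && size <= 16);
    size_t words = (size + 7) / 8;            /* eightbytes needed */
    for (size_t i = 0; i < words; i++)
        out[i] = NO_CLASS;                    /* point 3 */
    for (size_t i = 0; i < nfields; i++) {
        size_t w = fields[i].offset / 8;      /* eightbyte holding the field */
        assert(w < words);
        out[w] = merge_small(out[w], fields[i].cls);   /* point 4 */
    }
    return words;
}
```

Under these assumptions, fields at offsets 0 and 4 share eightbyte 0, while a field at offset 8 is merged into eightbyte 1 only, which is why the float in the three-field struct keeps class SSE.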

Upvotes: 0

mkayaalp

Reputation: 2716

See the GCC implementation (the classify_argument function quoted below).

Clarification for Point 1 (in response to the comment saying "eight is a typo and should be two instead"):

  1. If the size of an object is larger than eight eightbytes, or it contains unaligned fields, it has class MEMORY.
      /* On x86-64 we pass structures larger than 64 bytes on the stack.  */
      if (bytes > 64)
        return 0;

The function returns the number of registers to use for parameters, and zero means memory should be used instead.

(Later, after the analysis, if there are more than two eightbytes, registers are used only if the first is SSE and the rest are SSEUP, as pointed out in 5.(c):

(c) If the size of the aggregate exceeds two eightbytes and the first eightbyte isn't SSE or any other eightbyte isn't SSEUP, the whole argument is passed in memory.)
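
The post-merger cleanup of point (5) can be sketched directly from the quoted ABI text. This is an illustration only, not GCC's code: the enum and the function name post_merge_cleanup are hypothetical, and the return convention (0 for memory, otherwise the number of eightbytes) is borrowed from the description above.

```c
#include <stddef.h>

enum cls { NO_CLASS, INTEGER, SSE, SSEUP, X87, X87UP, COMPLEX_X87, MEMORY };

/* c[] holds one class per eightbyte, n is the number of eightbytes.
   Returns 0 if the whole argument goes to memory, otherwise n.
   May rewrite a stray SSEUP to SSE in place (rule 5d). */
size_t post_merge_cleanup(enum cls *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (c[i] == MEMORY)
            return 0;                                   /* 5(a) */
        if (c[i] == X87UP && (i == 0 || c[i - 1] != X87))
            return 0;                                   /* 5(b) */
    }
    if (n > 2) {                                        /* 5(c) */
        if (c[0] != SSE)
            return 0;
        for (size_t i = 1; i < n; i++)
            if (c[i] != SSEUP)
                return 0;
    }
    for (size_t i = 0; i < n; i++)                      /* 5(d) */
        if (c[i] == SSEUP &&
            (i == 0 || (c[i - 1] != SSE && c[i - 1] != SSEUP)))
            c[i] = SSE;
    return n;
}
```

With this sketch, { SSE, SSEUP, SSEUP, SSEUP } stays in registers, { INTEGER, INTEGER, INTEGER } goes to memory by rule 5(c), and { INTEGER, SSEUP } becomes { INTEGER, SSE }.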


Q1. In point (3), by "each is classified separately", do the authors mean "each eightbyte"?

Yes. In the code, each eightbyte is called a word.

Each eightbyte gets initialized to class NO_CLASS.

  int words = CEIL (bytes + (bit_offset % 64) / 8, UNITS_PER_WORD);
  // ...
      for (i = 0; i < words; i++)
        classes[i] = X86_64_NO_CLASS;
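
To make the word count concrete, here is a small sketch of the same arithmetic. It assumes that CEIL is a rounding-up integer division and that UNITS_PER_WORD is 8 on x86-64; the function name eightbyte_count is made up for this illustration.

```c
#include <stddef.h>

#define EIGHTBYTE 8   /* assumed value of UNITS_PER_WORD on x86-64 */

/* Number of eightbytes touched by an object of `bytes` bytes that starts
   `bit_offset` bits into its enclosing aggregate. */
size_t eightbyte_count(size_t bytes, size_t bit_offset)
{
    /* bytes that precede the object inside its first eightbyte */
    size_t lead = (bit_offset % 64) / 8;
    /* ceiling division by the eightbyte size */
    return (bytes + lead + EIGHTBYTE - 1) / EIGHTBYTE;
}
```

For instance, a 12-byte struct at offset 0 needs two eightbytes, and so does an 8-byte member that starts 32 bits into an eightbyte, because it straddles the boundary.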

Q2. In point (4), by "Each field of an object", do they mean "each field of an eightbyte that is a result of (separation in) point (3)"?

No, they mean each field of a struct/class, or union, or array elements. These are handled in a couple of places in the code, but you will see for loops like:

          for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))

This is why it is recursive. The fields themselves can be aggregate types. The entire logic is applied starting with each field, and the recursive function:

  • either returns 0, meaning the whole thing is passed in memory,
  • or it returns the number of registers (eightbytes) that would be used and the classes of each one (recursion through the nested fields will terminate at fields with non-aggregate types).
                      num = classify_argument (TYPE_MODE (type), type,
                                               subclasses,
                                               (int_bit_position (field)
                                               + bit_offset) % 512);
                      if (!num)
                        return 0;

Q3. In point (4), by "two fields" in "always two fields are considered", do they mean two successive fields?

I don't think "fields" is accurate here, and the two things are not successive either. What the algorithm does is merge the classes determined so far for each word with the classes recursively determined for the fields that occupy the same words. See below:

                      pos = (int_bit_position (field)
                            + (bit_offset % 64)) / 8 / 8;
                      for (i = 0; i < num && (i + pos) < words; i++)
                        classes[i + pos]
                          = merge_classes (subclasses[i], classes[i + pos]);

Starting from pos (the eightbyte that this field is in) each class gets merged with the subclass determined by the recursive call for the field.


Q4. In point (4), by "the resulting class", do they mean the class of the object, or of the eightbyte, or of the second considered field, or of something else?

This is now describing the merge_classes function, which takes the two classes and returns the merged class for the eightbyte. We are iterating over fields but classes are for eightbytes.
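
Rules 4(a) to 4(f) translate almost line by line into a function. The following is a sketch written from the quoted ABI text, not a copy of GCC's merge_classes; the enum and the name merge_two are hypothetical.

```c
/* Register classes named in the ABI text. */
enum arg_class {
    NO_CLASS, INTEGER, SSE, SSEUP, X87, X87UP, COMPLEX_X87, MEMORY
};

/* Merge two classes that apply to the same eightbyte. */
enum arg_class merge_two(enum arg_class a, enum arg_class b)
{
    if (a == b)
        return a;                                   /* (a) */
    if (a == NO_CLASS)
        return b;                                   /* (b) */
    if (b == NO_CLASS)
        return a;
    if (a == MEMORY || b == MEMORY)
        return MEMORY;                              /* (c) */
    if (a == INTEGER || b == INTEGER)
        return INTEGER;                             /* (d) */
    if (a == X87 || a == X87UP || a == COMPLEX_X87 ||
        b == X87 || b == X87UP || b == COMPLEX_X87)
        return MEMORY;                              /* (e) */
    return SSE;                                     /* (f) */
}
```

The order of the checks matters: merging INTEGER with X87 yields INTEGER because rule (d) is tested before rule (e).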

In the last case, where is the resulting class used?

Each class will determine the type of the corresponding register (GPR/SSE/X87 etc.).


Q5. In point (4), what if there is only one field? Or an even number of fields?

I hope the "two fields" wording is clear at this point. If, say, a struct has one field, the class for that eightbyte is initialized to NO_CLASS, and the field itself is then classified as, say, INTEGER. At the merge, the class of the eightbyte becomes INTEGER.


Q6. In point (5), "one of the classes" of a field, or of an eightbyte?

Of an eightbyte. Classes always refer to an eightbyte.

Upvotes: 4
