Is it expected that new allocates extra space on the heap?

I wrote a small program to demonstrate stack and heap allocation, and I am confused by its behavior.

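#include <cstdio>

int main()
{
    unsigned int s1, s2, s3;
    // Note: %x is kept to match the output below; it truncates 64-bit pointers
    // (the portable form is %p with a void* cast).
    printf("Stack memory addresses: s1=0x%x\ts2=0x%x\ts3=0x%x\n", &s1, &s2, &s3);

    int *h1, *h2, *h3;
    h1 = new int;
    h2 = new int;
    h3 = new int;
    printf("Heap memory addresses: h1=0x%x\th2=0x%x\th3=0x%x\n", h1, h2, h3);
}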

The output is this:

Stack memory addresses: s1=0xb90aa73c   s2=0xb90aa738   s3=0xb90aa734
Heap memory addresses: h1=0x24c4030 h2=0x24c4050    h3=0x24c4060

What I'm seeing on the stack is exactly what I expected, but I am confused about the heap. It appears to be allocating 32 bytes for the first integer and 16 for the second. I compiled again with optimizations turned off (-O0) and got this:

Stack memory addresses: s1=0xbde7b73c   s2=0xbde7b738   s3=0xbde7b734
Heap memory addresses: h1=0x1318000 h2=0x1318010    h3=0x1318020

With optimizations off, it sometimes allocates only 16 bytes per integer, but this is not consistent: on some runs the unoptimized build shows the same 32/16 spacing as the optimized one.

My first question is: are extra bytes actually being allocated, or is something else being allocated on the heap behind the scenes that I am not seeing? My second question: regardless of why this memory is allocated, why is the spacing inconsistent?

To try to answer my own question, I compiled to assembly with no optimizations and got this:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 15, 0 sdk_version 15, 2
    .globl  _main                           ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq    $48, %rsp
    leaq    L_.str(%rip), %rdi
    leaq    -4(%rbp), %rsi
    leaq    -8(%rbp), %rdx
    leaq    -12(%rbp), %rcx
    movb    $0, %al
    callq   _printf
    movl    $4, %edi
    callq   __Znwm
    movq    %rax, -24(%rbp)
    movl    $4, %edi
    callq   __Znwm
    movq    %rax, -32(%rbp)
    movl    $4, %edi
    callq   __Znwm
    movq    %rax, -40(%rbp)
    movq    -24(%rbp), %rsi
    movq    -32(%rbp), %rdx
    movq    -40(%rbp), %rcx
    leaq    L_.str.1(%rip), %rdi
    movb    $0, %al
    callq   _printf
    xorl    %eax, %eax
    addq    $48, %rsp
    popq    %rbp
    retq
                                        ## -- End function
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "Stack memory addresses: s1=0x%x\ts2=0x%x\ts3=0x%x\n"

L_.str.1:                               ## @.str.1
    .asciz  "Heap memory addresses: h1=0x%x\th2=0x%x\th3=0x%x\n"


I don't pretend to really understand asm, but from my limited understanding there are three calls to new, each requesting 4 bytes, and each followed by the returned 8-byte address being stored in a stack slot addressed relative to the base pointer. When I assemble and link from here with g++ test_O0.s -o test_O0, I once again get either 16 or 32 bytes between values. It appears that the new call is doing this. Why does this happen?

Stack memory addresses: s1=0xb573a72c   s2=0xb573a728   s3=0xb573a724
Heap memory addresses: h1=0xa68030  h2=0xa68050 h3=0xa68060

