Juraj

Reputation: 103

Debugging inline ASM with LLDB - treat instructions as separate statements for the step command?

In LLDB, the step command steps over a whole asm{} block as a single "statement". Is there a way to make it treat each instruction separately so you don't have to use si to step instructions, and switch to disassembly mode to see your current position?


As part of a course I am attending, we are writing inline asm in C. The course uses Visual Studio on Windows to write and debug MSVC-style inline assembly. I wanted to get this up and running on my M2 MacBook (I could run a VM, but that is boring).

It was pretty easy to get the code to compile. I use clang and compile like this:

clang -fasm-blocks -target x86_64-apple-macos main.c -g -O0

Here is an example code snippet:

#include <inttypes.h>
#include <stdio.h>

uint8_t count1(uint32_t x) {
    uint8_t out;
    __asm {
        mov edx, x
        mov al, 0
next:   cmp edx, 0
        je done
        mov cl, dl
        and cl, 1
        add al, cl
        shr edx, 1
        jmp next
done:
        mov out, al
    }
    return out;
}

int main() {
    uint32_t x = 0x5789ABCD;
    uint8_t cnt = count1(x);
    printf("Number of 1s in 0x%X: %hhu\n", x, cnt);
}

The only thing I have to do differently is the return value: I have to move it to a C variable manually and return it. If you know how to make this work implicitly, that would be awesome, but it isn't a big deal for me.

The problem came with debugging. I use neovim and have a nice debugging experience with nvim-dap and codelldb. However, when I step through the code snippet above, the asm block runs in one step. I have tried debugging with raw lldb on the CLI and it does the same thing. I can step through the asm with si and di -F intel, but that is a little cumbersome.

I suppose I am only missing a flag in the compilation step to generate debug symbols for the asm block. Does anyone know how to step through the __asm block within LLDB and consequently neovim? Is my understanding of the problem correct?

Upvotes: 3

Views: 179

Answers (1)

Juraj

Reputation: 103

So, I found a solution. As it turns out, you can specify the source locations manually with the .file and .loc assembler directives. This is how that would look:

uint8_t count1(uint32_t x) {
    uint8_t out;
    __asm {
        .file 1 "main.c"
        .loc 1 9
        mov edx, [x]
        .loc 1 11
        mov al, 0
        .loc 1 13
next:   cmp edx, 0
        .loc 1 15
        je done
        .loc 1 17
        mov cl, dl
        .loc 1 19
        and cl, 1
        .loc 1 21
        add al, cl
        .loc 1 23
        shr edx, 1
        .loc 1 25
        jmp next
done:
        .loc 1 28
        mov out, al
    }
    return out;
}

Doing this manually would be very annoying, and modifying the source code would mess up how the source looks in the debugger (you would see the .loc ... declarations).

What we can do is use clang to compile the C file into intermediate LLVM IR, modify that either with a custom compiler pass or with a simple script (I chose the script), and then compile the IR into an executable:

clang -S -emit-llvm -o main.ll main.c -fasm-blocks -target x86_64-apple-macos -g -O0
python3 preprocess.py main.ll
clang main.ll -o a.out -fasm-blocks -target x86_64-apple-macos -g -O0

This creates an executable with all the debug symbols and the original source code, which is what I was after in the first place.
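The three commands can also be wrapped in a small driver so they always run in the same order. This is only a sketch under assumptions: `build_commands` and `build` are hypothetical helper names, the default file names are the ones used above, and it uses the running Python interpreter in place of `python3`.

```python
import subprocess
import sys

# Flags copied from the commands above; shared by both clang invocations.
COMMON_FLAGS = ["-fasm-blocks", "-target", "x86_64-apple-macos", "-g", "-O0"]


def build_commands(source="main.c", ir="main.ll", output="a.out",
                   script="preprocess.py"):
    """Return the three build steps as argv lists, in execution order."""
    return [
        # 1. C source -> textual LLVM IR
        ["clang", "-S", "-emit-llvm", "-o", ir, source, *COMMON_FLAGS],
        # 2. rewrite the IR in place, adding .file/.loc to the asm strings
        [sys.executable, script, ir],
        # 3. patched IR -> executable
        ["clang", ir, "-o", output, *COMMON_FLAGS],
    ]


def build(**kwargs):
    """Run each step, stopping at the first one that fails."""
    for argv in build_commands(**kwargs):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    build()
```

Keeping the command list separate from the code that runs it makes it easy to print the steps or point them at a different source file.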

Here is the Python script for anyone interested:

import sys

current_file_no = 0


def get_file_no():
    global current_file_no
    current_file_no += 1
    return current_file_no


def process_asm(asm_string: str, source_filename: str, first_line: int) -> str:
    instructions = asm_string.split("\\0A\\09")

    file_no = get_file_no()
    result = [f".file {file_no} \\22{source_filename}\\22"]

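    # One .loc per instruction, assuming one instruction per source line.
    # The .loc must reference the same file number as the .file directive above.
    loc_directives = [
        f".loc {file_no} {line}"
        for line in range(first_line, first_line + len(instructions))
    ]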
    for loc, inst in zip(loc_directives, instructions):
        result.append(loc)
        result.append(inst)

    return "\\0A\\09".join(result)


asm_prefix = "call void asm sideeffect inteldialect "


def main():
    filename = sys.argv[1]

    with open(filename, "r") as f:
        lines = f.readlines()

    source_filename = None

    result = []
    for line in lines:
        stripped_line = line.strip()

        if stripped_line.startswith("source_filename"):
            source_filename = stripped_line.split('"')[1]
            result.append(line)
            continue

        if stripped_line.startswith(asm_prefix):
            start = line.find(asm_prefix) + len(asm_prefix) + 1
            end = start
            while line[end] != '"':
                end += 1
            asm_string = line[start:end]

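            # "!dbg !17, ..." -> "!17"; strip() covers !dbg being last on the line
            dbg_entry = line.split("!dbg ")[1].split(",")[0].strip()
            # match the definition "!17 = ..." exactly, not "!170 = ..."
            di_location = [ln for ln in lines if ln.startswith(dbg_entry + " = ")][0]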
            line_number = int(di_location.split("line: ")[1].split(",")[0])

            assert source_filename is not None
            new_asm = process_asm(asm_string, source_filename, line_number + 1)
            result.append(line[:start] + new_asm + line[end:])
            continue

        result.append(line)

    with open(filename, "w") as f:
        f.write("".join(result))


if __name__ == "__main__":
    main()
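To make the two parsing steps in the script easier to follow, here is a small standalone sketch of them run on made-up input. The IR lines in `SAMPLE_IR` are illustrative only (the metadata numbers, operands and constraints are invented, and real clang output will differ), and `asm_start_line` and `split_instructions` are hypothetical helper names. It uses regular expressions instead of the chained `split` calls, which is a variation on the script rather than what it does.

```python
import re

# Instruction separator as it appears literally in the .ll text:
# a backslash-escaped newline (\0A) followed by a tab (\09).
SEPARATOR = r"\0A\09"

DBG_REF = re.compile(r"!dbg (!\d+)")
DI_LINE = re.compile(r"line: (\d+)")

# Invented sample, shaped like the lines the script looks for.
SAMPLE_IR = [
    r'  call void asm sideeffect inteldialect "mov edx, $1\0A\09mov al, 0", ""() #2, !dbg !17',
    "!170 = !DILocation(line: 99, column: 1, scope: !10)",
    "!17 = !DILocation(line: 6, column: 5, scope: !10)",
]


def asm_start_line(call_line, ir_lines):
    """Return the source line recorded for an inline-asm call, or None."""
    ref = DBG_REF.search(call_line)
    if ref is None:
        return None
    prefix = ref.group(1) + " = "  # "!17 = " does not match "!170 = ..."
    for ln in ir_lines:
        if ln.startswith(prefix):
            found = DI_LINE.search(ln)
            return int(found.group(1)) if found else None
    return None


def split_instructions(asm_string):
    """Split an IR asm string into one entry per instruction."""
    return asm_string.split(SEPARATOR)


if __name__ == "__main__":
    print(asm_start_line(SAMPLE_IR[0], SAMPLE_IR))
    print(split_instructions(r"mov edx, $1\0A\09mov al, 0"))
```

The line number found this way is the line of the `__asm {` statement itself, which is why the script passes `line_number + 1` as the line of the first instruction.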

Upvotes: 0
