assembly compiler-construction compiler-optimization

Reputation: 751

Why do we even need assembler when we have compiler?

If a compiler converts a high-level language to machine code, why do we even need an assembler? Is there any assembly-level language for which we can't use a compiler?

Upvotes: 6

Answers (5)

artless-noise-bye-due2AI

Reputation: 22395

TL;DR - if compiler and debugger writers were perfect, you probably wouldn't need assembler for application programming. However, your fundamental understanding of computing would not be complete. You would lose the ability to step outside the box.

An assembler tries to map one-to-one the machine mnemonics to the underlying binary op-codes. As such it is the most expressive language to a particular machine. Some languages attempt to hide 'pointers' or memory addresses. All languages hide register allocation and mapping variables to either stack slots or physical registers. The job of the optimizing compiler is to map the high level language to the underlying machine language. The algorithms used can be quite exhaustive as a computer can search a large number of solutions faster than a human and find an optimal solution.

The compiler 'fails' when it does not realize that a machine concept maps a problem to the most effective solution. For instance, there is no notion of a 'carry bit' in 'C' and 'C++'. There are several solutions for arbitrarily large number types. For problems involving large integers, it is useful to use the 'carry bit' to chain smaller integers into a larger integer (more bits). Compiler developers have realized this issue and implemented various solutions. The most trivial is just to add more and more types (unsigned long long, etc). Some compilers will detect idioms in 'C' where a programmer is trying to chain the carry out of a low word into a high word. For example,

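#include <limits.h>

typedef unsigned int uint;
/* First bit above the width of uint; assumes unsigned long long is wider. */
#define CARRY_BIT (1ULL << (sizeof(uint) * CHAR_BIT))

/* a,b are the high and low parts of one number.
   c,d are the high and low parts of another to be added.
 */
void add_big(uint *a, uint *b, const uint c, const uint d) {
  unsigned long long tmp;
  /* Widen before adding; otherwise the sum wraps and the carry is lost. */
  tmp = (unsigned long long)*b + d;
  if (tmp & CARRY_BIT)
    *a += c + 1;
  else
    *a += c;
  *b = (uint)tmp;
}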

The complexity demonstrates how hard it can be to do this task, which you want to be simple and efficient. In fact, most machines allow this to map to only a few assembler instructions. The compiler writer needs to recognize that the pattern the user is using can collapse several high-level constructs into a few assembler instructions. Or, they provide a language escape to the lower-level assembly.
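One form such an escape can take is a compiler builtin rather than inline assembly. The following is a minimal sketch, assuming GCC or Clang, whose `__builtin_add_overflow` reports whether an addition wrapped. The function name `add_big_builtin` and the use of `uint32_t` words are illustrative assumptions, not part of the original example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical variant of add_big: adds the two-word value (c:d) into
   (*hi:*lo). Relies on the GCC/Clang builtin __builtin_add_overflow,
   which is a compiler extension and not standard C. */
void add_big_builtin(uint32_t *hi, uint32_t *lo, uint32_t c, uint32_t d) {
  uint32_t sum_lo;
  /* Returns true when the low-word addition wrapped, i.e. a carry out. */
  bool carry = __builtin_add_overflow(*lo, d, &sum_lo);
  *hi += c + (carry ? 1u : 0u);
  *lo = sum_lo;
}
```

The builtin states the intent (add and tell me about the carry) directly, which gives the compiler a better chance of using the machine's carry flag than a pattern it has to recognize. Whether it actually does so depends on the compiler and target.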

Many debugging issues can be solved more efficiently with knowledge of assembler and machine concepts. If you program in a higher level language such as Python, this will not be pertinent. But then you ultimately rely on other developers to write the containers (lists, sets, dictionaries, numpy, etc) in some lower level language. Many efficient data structures cannot be coded without memory addresses.

Even if you never use assembly language, the concepts will help you understand why code is slow. A high level language may mask many details about why things are/are not efficient. Often, if you understand how the tool is mapping things to assembler language, your search towards an efficient solution is much faster.

For a security researcher, knowledge of assembler opcodes can be pretty fundamental to understanding exploits. For an OS/systems programmer, there are many opcodes that will not map to a higher level language. Compiler and language authors, who must find the best mapping for a problem set and ways to express it, need to understand assembler, or better still machine architecture, which includes the nuances of memory access.

Ultimately a professional programmer will be confronted with proprietary code which has limitations. This code will not come with source. Often the most effective way to diagnose and overcome the issue is to examine the binary. If you cannot understand assembly language, you are stuck.

Upvotes: 2

Peter Cordes

Reputation: 363980

Related: Does a compiler always produce an assembly code? - more about why some compilers do compile only to asm, instead of straight to machine code in some object file format. There are several reasons that compiling to asm instead of machine code makes a compiler's job easier and a compiler more easily portable. But compilers aren't the only reason for asm existing.

why do we even need assembler?

Many people don't need to know assembly language.

It exists so we can talk about / analyze machine code, and write/debug compilers more easily.

Compilers have to be written by humans. As @old_timer points out, when designing a new CPU architecture, you always give names to the opcodes and registers so you can talk about the design with other humans, and publish readable manuals.

Or for OS development, some special privileged instructions can't be generated by compilers¹. And you can't write a context-switch function that saves registers in pure C.

CPUs run machine-code, not high-level languages directly, so computer security / exploits, and any serious low-level performance analysis / tuning of single loops require looking at the instructions the CPU is running. Mnemonic names for the opcodes are very helpful in thinking and writing about them. mov r32, imm32 is much easier to remember and more expressive than B8+rd imm32 (the range of opcodes for that mnemonic).
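To make the `B8+rd imm32` notation concrete, here is a small sketch of what an assembler does for that one instruction form: the opcode byte is `0xB8` plus the register number, followed by the 32-bit immediate in little-endian order. The function name `encode_mov_r32_imm32` is a hypothetical helper for illustration, not part of any real assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes x86 `mov r32, imm32` into out[0..4].
   reg is the 3-bit register number (0=eax, 1=ecx, 2=edx, 3=ebx, ...).
   Returns the number of bytes written. */
size_t encode_mov_r32_imm32(uint8_t out[5], unsigned reg, uint32_t imm) {
  out[0] = (uint8_t)(0xB8 + (reg & 7u));  /* opcode B8+rd */
  out[1] = (uint8_t)(imm & 0xFFu);        /* imm32, least significant byte first */
  out[2] = (uint8_t)((imm >> 8) & 0xFFu);
  out[3] = (uint8_t)((imm >> 16) & 0xFFu);
  out[4] = (uint8_t)((imm >> 24) & 0xFFu);
  return 5;
}
```

For example, `mov eax, 1` comes out as the bytes `B8 01 00 00 00`. Writing the mnemonic and letting a tool produce those bytes is the whole point of an assembler.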

Footnote 1: Unless like MSVC you create intrinsics for all the special instructions like __invlpg() that OSes need to use, so you can write an OS without inline asm. (They still need some stand-alone asm for stuff like entry points, and probably for a context-switch function.) But then those intrinsics still need names in C so you might as well name them in asm.

I regularly use asm for easily creating the machine code I want to test for microbenchmarks. A compiler has to create efficient machine code, not just correct machine code, so it's common for humans to play around with asm to see exactly what's fast and what's not on various CPUs.

See http://agner.org/optimize/, and other performance links in the x86 tag wiki.

e.g. see Can x86's MOV really be "free"? Why can't I reproduce this at all? and Micro fusion and addressing modes for examples of micro-benchmarking to learn something about what's fast.

See C++ code for testing the Collatz conjecture faster than hand-written assembly - why? for more about writing asm by hand that's faster than what I could hand-hold gcc or clang into emitting, even by adjusting the C source to look more like the asm I came up with.

(And obviously I had to know asm to be able to look at the compiler's asm output and see how to do better. Compilers are far from perfect. Sometimes very far. Missed-optimization bugs are common. To think of new optimizations and suggest that compilers look for them, it's a lot easier to think in terms of asm instructions than machine code.)

Wrong-code compiler bugs also sometimes happen, and verifying them basically requires looking at the compiler output.

Stack Overflow has several questions like "what's faster: a++ or ++a?", and the answer completely depends on exactly how it compiles into asm, not on source-level syntax differences. To understand why some kinds of source differences affect performance, you have to understand how code compiles to asm.

e.g. Adding a redundant assignment speeds up code when compiled without optimization. (People often fail to realize that compiling with/without optimization isn't just a linear speedup, and that it's basically pointless to benchmark un-optimized code. Un-optimized code has different bottlenecks... This is obvious if you look at the asm.)
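Returning to the `a++` versus `++a` question, a minimal sketch of how to check it yourself: the two functions below differ only in the form of the loop increment. The names `sum_post` and `sum_pre` are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: sums an array using a post-increment loop counter. */
long sum_post(const int *v, size_t n) {
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Same loop, but with a pre-increment loop counter. */
long sum_pre(const int *v, size_t n) {
  long total = 0;
  for (size_t i = 0; i < n; ++i)
    total += v[i];
  return total;
}
```

Because the value of the increment expression is unused, an optimizing compiler will typically emit the same instructions for both. Compiling with something like `gcc -O2 -S` and comparing the two functions in the asm output settles the question for your compiler and target, which is exactly where knowing how to read asm pays off.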

Upvotes: 10

well...

Reputation: 31

The compiler can translate code written in a high level language to machine code, but that's not the only language it can translate to. It can also translate the code into assembly language and more. See https://www.quora.com/Does-a-compiler-convert-code-to-assembly

However, as mentioned in the other answers, we can see why we generally use an assembler after the compiler.

Upvotes: 0

user8795381

Reputation:

Quoting from @TylerAndFriends's answer on Why do we need assembly language? on cs.SE (a duplicate of this):

Assembly language was created as an exact shorthand for machine level coding, so that you wouldn't have to count 0s and 1s all day. It works the same as machine level code: with instructions and operands.

Though it's true, you probably won't find yourself writing your next customer's app in assembly, there is still much to gain from learning assembly.

Today, assembly language is used primarily for direct hardware manipulation, access to specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems.

Assembly language is as close to the processor as you can get as a programmer so a well designed algorithm is blazing -- assembly is great for speed optimization. It's all about performance and efficiency. Assembly language gives you complete control over the system's resources. Much like an assembly line, you write code to push single values into registers, deal with memory addresses directly to retrieve values or pointers. (source: codeproject.com)

Upvotes: 4

Timothy Baldwin

Reputation: 3675

Some more examples:

Interacting with the interrupt handler to implement atomic operations, such as Linux's atomic operations on ARMv5 and earlier.
Calling a system call only if the signal handler has not been called, as in QEMU linux-user.
Initialising a computer to a state in which a compiler can be used, for example configuring the memory controller.
Entry and exit to/from interrupt handlers and system calls.

Upvotes: 3
