swaechter

Reputation: 1439

Inner workings of the C standard library

I am interested in the inner workings of the C standard library. I found a good book about one possible implementation, but I am looking for a deeper explanation of the whole standard library and of the standards (like POSIX) that define what it has to provide.

The C drafts are very helpful but not very nice to read. Is there other literature about this topic?

Albertus

Upvotes: 7

Views: 716

Answers (2)

R.. GitHub STOP HELPING ICE

Reputation: 215193

A good starting point would be POSIX. The POSIX 2008 specification is available online here:

http://pubs.opengroup.org/onlinepubs/9699919799/

It's more accessible (but sometimes less rigorous) than the C standard, and covers a lot more than just the C standard, i.e. most of the standardized parts of Unix-like systems' standard libraries.

If you're interested in implementations, the first thing to be aware of is that the POSIX-described behavior is usually split (out of both necessity and pragmatism) between the kernel implementation and the userspace libc implementation. A large number of the functions in POSIX (and a few from the C standard) are merely wrappers for "system calls", i.e. transitions into kernelspace to service the request. On some libc implementations, even finding these wrappers is difficult, since they are often automatically generated by the build scripts and/or unified into a single assembly-language file.
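To give a feel for how thin such a wrapper can be, here is a minimal sketch for Linux. The name my_write is made up for illustration, and it leans on the generic syscall() function from glibc/musl rather than the per-architecture assembly a real libc would normally use, so treat it as an approximation and not as any particular libc's code.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for libc's write() wrapper (Linux only). */
ssize_t my_write(int fd, const void *buf, size_t count)
{
    /* syscall() loads the arguments, traps into the kernel, and
       returns -1 with errno set if the kernel reports an error. */
    return (ssize_t)syscall(SYS_write, fd, buf, count);
}
```

Calling my_write(1, "hi\n", 3) should behave like write(1, "hi\n", 3). All of the real work happens on the kernel side of the call.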

The major (significant amount of non-kernel code) subsystems of the standard library are generally:

  • stdio: On glibc, this is implemented by the GNU libio library, which is a unified implementation of C stdio and C++ iostream, optimized so that neither has to be slowed down by being a wrapper for the other. It's a big hack, and the code is difficult to find and follow. Other implementations (especially the BSDs, but also other libcs on Linux) are much simpler and clearer to read. Ultimately they're based on the underlying file-descriptor IO functions like open, read, etc.
  • POSIX threads: On glibc and modern uClibc, this is NPTL. I'm not familiar with the BSDs' thread implementations. Other Linux libcs either lack threads or provide their own implementations based mainly on Linux clone and futex syscalls.
  • Math library: ultimately, almost all of these are based on the old Sun math code from the early 90s, but they've diverged a lot. Fdlibm is a pretty good base approximation of the code used in modern libcs.
  • User, group, hostname (DNS), etc. lookups: This is handled through libnss in glibc, and directly in most other libcs.
  • Regular expression and glob matching
  • Time and timezone handling
  • Locale and charset conversion
  • Malloc
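
As a rough illustration of the stdio point above (buffering layered on top of file-descriptor IO), here is a minimal write-only sketch. The names mini_file, mini_write and mini_flush are invented for this example. A real FILE also handles reading, seeking, locking, buffering modes and error flags, so this is only the core idea under those simplifying assumptions.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MINI_BUFSIZ 4096

/* Hypothetical type: a stripped-down stand-in for FILE, write side only. */
struct mini_file {
    int fd;                 /* underlying file descriptor */
    size_t used;            /* number of bytes currently buffered */
    char buf[MINI_BUFSIZ];
};

/* Push buffered bytes to the descriptor; returns 0 on success, -1 on error. */
int mini_flush(struct mini_file *f)
{
    size_t off = 0;
    while (off < f->used) {
        ssize_t n = write(f->fd, f->buf + off, f->used - off);
        if (n < 0) {
            /* Keep only the unwritten tail so a retry does not duplicate data. */
            memmove(f->buf, f->buf + off, f->used - off);
            f->used -= off;
            return -1;
        }
        off += (size_t)n;
    }
    f->used = 0;
    return 0;
}

/* Copy len bytes into the buffer, flushing whenever it fills up.
   Returns the number of bytes accepted. */
size_t mini_write(struct mini_file *f, const void *src, size_t len)
{
    const char *p = src;
    size_t done = 0;
    while (done < len) {
        size_t room = MINI_BUFSIZ - f->used;
        size_t chunk = len - done < room ? len - done : room;
        memcpy(f->buf + f->used, p + done, chunk);
        f->used += chunk;
        done += chunk;
        if (f->used == MINI_BUFSIZ && mini_flush(f) < 0)
            break;
    }
    return done;
}
```

Small writes just land in the buffer. The write() system call only happens when the buffer fills or mini_flush is called, which is essentially why stdio is cheaper than calling write() for every few bytes.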

If you want to get started reading sources, I would recommend not starting with glibc. It's very large and unwieldy. If you do want to read glibc, be aware that lots of the code is hiding under the sysdeps trees and is organized based on the diversity of systems it's applicable to.

Dietlibc is quite readable, but if you read its source, be aware that it's full of common C programming mistakes (e.g. using int where size_t is needed, not checking for overflows, etc.). If you keep this in mind, it might not be a bad choice, since ignoring lots of possible errors/failures tends to make the code very simple.
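
To show the kind of check meant by "not checking for overflows", here is a minimal sketch of an array allocation that refuses sizes that would wrap around. The helper name checked_array_alloc is hypothetical and is not taken from any libc. It only illustrates the check a careful implementation performs before multiplying two size_t values.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate nmemb elements of size bytes each,
   refusing requests whose total does not fit in size_t. */
void *checked_array_alloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;            /* nmemb * size would wrap around */
    return malloc(nmemb * size);
}
```

Without the check, a huge nmemb could make the product wrap to a small number, so malloc would succeed with a buffer far smaller than the caller expects. Using int instead of size_t for such sizes invites the same class of bug.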

With that said, for reading libc source, I would most recommend either one of the BSDs or musl (disclaimer: I am the primary author of musl so I am a bit biased here). BSDs also have the advantage that the kernelspace code is also extremely simple and readable, so if you want to read the kernel code on the other side of a system call, you can do that too.

Upvotes: 6

ouah

Reputation: 145829

In "C: A Reference Manual, Fifth Edition" by Harbison & Steele, the second part of the book is dedicated to the C Standard library (Part 2: chapters 10-24).

http://careferencemanual.com

The Rationale document for C99 didn't cover the C library, but the ANSI C89 Rationale covers it in chapter 4. There is a copy of the document here:

http://www.lysator.liu.se/c/rat/title.html

Upvotes: 5
