kobame

Reputation: 5856

How is getcwd implemented in the kernel (or library)?

One process could call

chdir("/to/some/where");

while, from another shell, someone runs

mv /to/some/where /now/different/path/

and then the first process runs

print getcwd(); 
#prints /now/different/path/
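
The scenario can be reproduced inside a single process, with os.rename standing in for the mv from the other shell. This is only a minimal sketch, assuming Linux (other systems may behave differently); the directory names and the helper name cwd_after_rename are made up for illustration.

```python
import os
import shutil
import tempfile


def cwd_after_rename():
    """Return (cwd before, cwd after) when the cwd is renamed under us."""
    saved = os.getcwd()
    # realpath() so the paths are not confused by a symlinked temp dir.
    base = os.path.realpath(tempfile.mkdtemp())
    old = os.path.join(base, "where")
    new = os.path.join(base, "different")
    os.mkdir(old)
    try:
        os.chdir(old)
        before = os.getcwd()
        os.rename(old, new)  # stands in for `mv` run from another shell
        after = os.getcwd()  # asked again, without any new chdir()
        return before, after
    finally:
        os.chdir(saved)
        shutil.rmtree(base)


if __name__ == "__main__":
    print(cwd_after_rename())
```

On Linux the second path should end in the new directory name, even though the process never called chdir again.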

How is getcwd implemented at the lowest level, e.g. at the level of the kernel and inodes?

I know how a common (inode-based) filesystem works, e.g. what a directory contains (the names of the entries and the corresponding inode numbers).

EDIT

The question was probably too vague, so I am trying to refine it. One possible scenario (from what I know):

  1. the kernel knows the inode of the CWD for the given process (and its threads), e.g. inode number 1000
  2. it reads the inode (to find out which blocks it needs to read)
  3. it reads the corresponding blocks (i.e. opens the directory)
  4. it reads the directory entries (the names of the entries and the inode numbers)
  5. it gets the inode number of the .. parent directory (for example 900) and the inode number of . (the current directory)
  6. it reads the content of the parent directory, where it finds the name of the entry whose inode number matches the current directory
  7. it repeats from step 5 until the root inode is reached.
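
The steps above can be tried from user space. The following is a sketch of that walk using only stat and directory listings; it is not how any particular kernel does it, and the function name getcwd_by_walking is made up for illustration.

```python
import os


def getcwd_by_walking():
    """Rebuild the cwd path by following ".." up to the root."""
    names = []
    path = "."
    cur = os.stat(path)
    while True:
        parent_path = os.path.join(path, "..")
        parent = os.stat(parent_path)
        # At the root, ".." refers to the directory itself.
        if (parent.st_dev, parent.st_ino) == (cur.st_dev, cur.st_ino):
            break
        # Scan the parent for the entry that is the directory we came from.
        for name in os.listdir(parent_path):
            try:
                st = os.lstat(os.path.join(parent_path, name))
            except OSError:
                continue  # entry vanished or is not accessible
            if (st.st_dev, st.st_ino) == (cur.st_dev, cur.st_ino):
                names.append(name)
                break
        else:
            raise OSError("current directory not found in its parent")
        path, cur = parent_path, parent
    return "/" + "/".join(reversed(names))


if __name__ == "__main__":
    print(getcwd_by_walking())
```

Each path component costs one directory scan of its parent, which is the extra I/O for deep paths that the question asks about.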

That means that getcwd for

/some/very/very/very/deep/directory/level

takes more raw I/O operations (more directory entries have to be read) than for the short

/tmp

where the whole getcwd is done in two reads?

Is this correct, or is it done in a totally different way?

Upvotes: 2

Views: 1805

Answers (3)

Gem Taylor

Reputation: 5635

Key point: chdir() only affects the current process and any child processes launched after that; it is not global state.

Upvotes: 0

clt60

Reputation: 63972

First, you are asking in the wrong place. This question is more about the operating system, so unix.stackexchange is the better place.

Anyway, your proposed solution is true for some ancient UNIX implementations (for example BSD 2.8) and the like. There, pathname resolution could be done as you described.

However, many problems arise. A few of them:

  • as you said, the pathname resolution is too complicated (and yes, deeper directories need more I/O)
  • it depends on the premise that only ONE ROOT directory exists. This has not been true since BSD 4.2, which introduced the per-process root directory. That is what makes the chroot system call possible: it sets the root to any directory without showing the real path to the process. (One of the coolest FreeBSD features, jails, depends on this.) (Ancient Linux versions also had only one root; the VFS, the virtual filesystem layer, was introduced only in 0.96c.)
  • and permission problems, e.g. what happens in this case:

first shell

$ mkdir -p /tmp/some
$ cd /tmp/some

second shell

$ su
# mkdir -p /tmp/my
# chmod 700 /tmp/my
# mv /tmp/some /tmp/my/

The /tmp/my directory isn't readable by the first process, so it can't determine the path. So how should it work with the files? In the first shell again:

$ pwd
/tmp/some #the original
$ echo $PWD
/tmp/some
$ /bin/pwd
pwd: .: Permission denied

But you can still do, for example,

$ touch bob #works

i.e. the system lets you work in the "current" directory without letting you know where you are (in both scenarios, i.e. in the chroot one and in the second one) ;)
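
The touch bob part can be checked without a second user. The sketch below does not reproduce the permission problem (that needs root and another account); it only shows that a relative name still resolves after the current directory has been moved. The helper name touch_after_move and the temporary layout are made up for illustration.

```python
import os
import shutil
import tempfile


def touch_after_move():
    """Create a file by relative name after the cwd has been moved."""
    saved = os.getcwd()
    base = os.path.realpath(tempfile.mkdtemp())
    os.mkdir(os.path.join(base, "some"))
    os.mkdir(os.path.join(base, "my"))
    moved = os.path.join(base, "my", "some")
    try:
        os.chdir(os.path.join(base, "some"))
        os.rename(os.path.join(base, "some"), moved)  # the `mv` from shell 2
        # A relative name is resolved from the process's own cwd reference,
        # so the process does not need to know the absolute path.
        with open("bob", "w"):
            pass
        return os.path.exists(os.path.join(moved, "bob"))
    finally:
        os.chdir(saved)
        shutil.rmtree(base)


if __name__ == "__main__":
    print(touch_after_move())
```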

That means that every process stores, in its table, the current working directory as:

  • device number (e.g. hdd1 or hdd2)
  • inode number on the device

and

  • the kernel maintains other global table(s) (in Linux called dentries, for directory entries), where it keeps the "inode" -> "path" mapping for every process and every open file descriptor, and also inode caches (in Linux maintained by the kernel itself; in BSD a job for the vnode driver) and the like.
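
The (device, inode) pair is visible from user space through stat. As a rough illustration (not a view into the kernel's process table), the sketch below checks that this pair stays the same for "." while the directory is renamed; the helper name cwd_identity_across_rename is made up.

```python
import os
import shutil
import tempfile


def cwd_identity_across_rename():
    """Return the (st_dev, st_ino) pair of "." before and after a rename."""
    saved = os.getcwd()
    base = tempfile.mkdtemp()
    old = os.path.join(base, "old")
    os.mkdir(old)
    try:
        os.chdir(old)
        st = os.stat(".")
        before = (st.st_dev, st.st_ino)
        os.rename(old, os.path.join(base, "new"))
        st = os.stat(".")
        after = (st.st_dev, st.st_ino)
        return before, after
    finally:
        os.chdir(saved)
        shutil.rmtree(base)


if __name__ == "__main__":
    print(cwd_identity_across_rename())
```

The path changes, the identity does not, which is why storing the identity rather than a path string survives a rename.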

E.g. when some process asks for the pathname of inode X, the kernel searches the dentry table. If the entry is found, it returns immediately; if not, it calls the lookup procedure, which does the pathname resolution.

When, for example, a rename occurs, the kernel searches the dentry table and, if it finds the entry, changes it as needed.
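
To make the idea concrete, here is a toy model, not the kernel's real data structures: every entry knows its name and its parent, a process's cwd is a reference to one entry, and a rename only touches the moved entry. The class Dentry and the function rename are made-up names for this sketch.

```python
class Dentry:
    """Toy directory-entry node: a name plus a link to the parent entry."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def path(self):
        # Walk the parent links up to the root and join the names.
        names = []
        node = self
        while node.parent is not None:
            names.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(names))


def rename(entry, new_parent, new_name):
    """Move an entry: only its own name and parent link change."""
    entry.parent = new_parent
    entry.name = new_name


root = Dentry("")
to = Dentry("to", root)
some = Dentry("some", to)
where = Dentry("where", some)
now = Dentry("now", root)
different = Dentry("different", now)

cwd = where                        # what chdir("/to/some/where") would store
path_before = cwd.path()           # "/to/some/where"
rename(where, different, "path")   # mv /to/some/where /now/different/path
path_after = cwd.path()            # "/now/different/path"
```

Because the process holds a reference to the entry rather than a string, the path it reports follows the rename without the process doing anything.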

All of the above is extremely simplified and, as you can see yourself, highly OS dependent. The common base is defined by POSIX, but for what happens behind it (i.e. the implementation) you really need to read the kernel sources and/or google for:

  • linux dentry
  • linux vfs
  • freebsd vnode
  • pathname resolution

and such.

PS: for the nitpickers :) As I said, everything is over-simplified, so if you want to correct it or add more details, edit the answer. I converted it to a "community wiki answer".

Upvotes: 1

In current POSIX kernels like Linux (or the *BSDs), the current working directory (as a kernel inode) is part of the process state. So the in-kernel process descriptor (probably some struct task_struct on Linux) contains or refers to that cwd. Then getcwd is "simply" a syscall querying that.
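
On Linux this per-process state is also exposed through procfs as the symbolic link /proc/self/cwd. A small sketch, assuming a mounted /proc (the helper name cwd_from_proc is made up, and it returns None where the link is not available):

```python
import os


def cwd_from_proc():
    """Read the kernel's view of our cwd via procfs, or None if unavailable."""
    try:
        return os.readlink("/proc/self/cwd")
    except OSError:
        return None  # no procfs here (e.g. not Linux)


if __name__ == "__main__":
    print(cwd_from_proc(), os.getcwd())
```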

The kernel inodes (for opened file descriptors, including working directories) are related to filesystems and are not the same as disk inodes.

Of course, the devil is in the details!

Upvotes: 0
