So that Linux can manage the processes in the system, each process is represented by a
task_struct data structure
(task and process are terms which Linux uses interchangeably).
The task vector is an array of pointers to every task_struct data
structure in the system.
This means that the maximum number of processes in the system is limited by the
size of the task vector; by default it has 512 entries.
As processes are created, a new task_struct is allocated from system memory
and added into the task vector.
To make it easy to find, the currently running process is pointed to by the current pointer.
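The relationship between the task vector and the current pointer can be sketched in user-space C. This is only an illustration under stated assumptions: NR_TASKS, task_vector, add_task and set_current are hypothetical names, and the task_struct here is cut down to a single field, whereas the kernel's structure is far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_TASKS 512   /* default size of the task vector given in the text */

/* Hypothetical, heavily reduced stand-in for the kernel's task_struct. */
struct task_struct {
    int pid;
};

static struct task_struct *task_vector[NR_TASKS]; /* one slot per process */
static struct task_struct *current;               /* the running process  */

/* Allocate a task_struct and store it in the first free slot.
 * Returns the slot index, or -1 if the vector is full or memory runs out. */
int add_task(int pid)
{
    for (int i = 0; i < NR_TASKS; i++) {
        if (task_vector[i] == NULL) {
            struct task_struct *t = malloc(sizeof *t);
            if (t == NULL)
                return -1;
            t->pid = pid;
            task_vector[i] = t;
            return i;
        }
    }
    return -1; /* all NR_TASKS slots are in use */
}

/* Record which occupied slot holds the process that is now running. */
void set_current(int slot)
{
    assert(slot >= 0 && slot < NR_TASKS && task_vector[slot] != NULL);
    current = task_vector[slot];
}
```

Once every slot is occupied, add_task fails, which mirrors the limit on the number of processes described above.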
As well as the normal type of process, Linux supports real time processes.
These processes have to react very quickly to external events (hence the term "real
time") and they are treated differently from normal user processes by the scheduler.
Although the task_struct data structure is quite large and complex, its fields
can be divided into a number of functional areas:
- State
- As a process executes it changes state according to its circumstances.
Linux processes have the following states:
- Running
- The process is either running (it is the current process in the
system) or it is ready to run (it is waiting to be assigned to one of
the system's CPUs).
- Waiting
- The process is waiting for an event or for a resource. Linux
differentiates between two types of waiting process: interruptible
and uninterruptible. Interruptible waiting processes can be
interrupted by signals whereas uninterruptible waiting processes are waiting
directly on hardware conditions and cannot be interrupted under any
circumstances.
- Stopped
- The process has been stopped, usually by receiving a signal. A process
that is being debugged can be in a stopped state.
- Zombie
- This is a halted process which, for some reason, still has a
task_struct data structure in the task vector. It is what it
sounds like, a dead process.
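The states listed above could be modelled roughly as follows. The enumerator names are hypothetical and the kernel's own constants and values may differ. The deliver_signal helper models only the difference between the two kinds of waiting; it ignores how signals affect stopped processes.

```c
#include <stdbool.h>

/* Hypothetical names for the states described in the text. */
enum task_state {
    STATE_RUNNING,          /* running, or ready to run                */
    STATE_INTERRUPTIBLE,    /* waiting; a signal can wake the process  */
    STATE_UNINTERRUPTIBLE,  /* waiting on hardware; signals do not wake it */
    STATE_STOPPED,          /* stopped, e.g. by a signal or a debugger */
    STATE_ZOMBIE            /* dead, but its task_struct still exists  */
};

/* Only a running or ready-to-run process is a candidate for a CPU. */
bool can_be_scheduled(enum task_state s)
{
    return s == STATE_RUNNING;
}

/* Return the state after a signal arrives: an interruptible waiter
 * becomes ready to run; every other state is left unchanged here. */
enum task_state deliver_signal(enum task_state s)
{
    return s == STATE_INTERRUPTIBLE ? STATE_RUNNING : s;
}
```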
- Scheduling Information
- The scheduler needs this information in order
to fairly decide which process in the system most deserves to run.
- Identifiers
- Every process in the system has a process identifier.
The process identifier is not an index into the
task vector; it is simply a number. Each process also has
user and group identifiers, which are used to control the process's access to the files and
devices in the system.
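From user space, a process can read its own identifiers with the standard POSIX calls getpid, getuid and getgid. The short sketch below simply prints them; print_identifiers is a hypothetical helper name, and the values printed depend on the system it runs on.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print the calling process's process, user and group identifiers. */
void print_identifiers(void)
{
    printf("pid=%ld uid=%ld gid=%ld\n",
           (long)getpid(), (long)getuid(), (long)getgid());
}
```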
- Inter-Process Communication
- Linux supports the classic Unix IPC mechanisms of
signals and pipes, and also the System V IPC mechanisms of shared memory,
semaphores and message queues.
- Links
- In a Linux system no process is independent of any other process.
Every process in the system, except the initial process, has a parent process.
New processes are not created from scratch; they are copied, or rather cloned, from previous processes.
Every task_struct representing a process keeps pointers to its parent process
and to its siblings (those processes with the same parent process) as well as to its own child
processes.
You can see the family relationship between the running processes in a Linux system
using the pstree command:
init(1)-+-crond(98)
        |-emacs(387)
        |-gpm(146)
        |-inetd(110)
        |-kerneld(18)
        |-kflushd(2)
        |-klogd(87)
        |-kswapd(3)
        |-login(160)---bash(192)---emacs(225)
        |-lpd(121)
        |-mingetty(161)
        |-mingetty(162)
        |-mingetty(163)
        |-mingetty(164)
        |-login(403)---bash(404)---pstree(594)
        |-sendmail(134)
        |-syslogd(78)
        `-update(166)
Additionally, all of the processes in the system are held in a doubly linked list
whose root is the init process's task_struct data structure.
This list allows the Linux kernel to look at every process in the system.
It needs to do this to provide support for commands such as ps or kill.
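The list of all processes and the parent link can be sketched as below. The field names (parent, next_task, prev_task) and the helper functions are illustrative assumptions, as is the choice to make the list circular so that the walk can stop when it returns to the root. Sibling and child pointers are omitted to keep the sketch short.

```c
#include <stddef.h>

/* Hypothetical reduced task_struct; only the link fields are shown. */
struct task_struct {
    int pid;
    struct task_struct *parent;     /* NULL for the initial process   */
    struct task_struct *next_task;  /* circular list of all processes */
    struct task_struct *prev_task;
};

/* Make root (the init process) a one-element circular list. */
void init_task_list(struct task_struct *root)
{
    root->parent = NULL;
    root->next_task = root;
    root->prev_task = root;
}

/* Link child in just before root, i.e. at the tail of the list,
 * and record which process it was cloned from. */
void link_task(struct task_struct *root, struct task_struct *parent,
               struct task_struct *child)
{
    child->parent = parent;
    child->prev_task = root->prev_task;
    child->next_task = root;
    root->prev_task->next_task = child;
    root->prev_task = child;
}

/* Walk every process, as a command like ps or kill requires,
 * and return the one with the given pid, or NULL if none matches. */
struct task_struct *find_task(struct task_struct *root, int pid)
{
    struct task_struct *t = root;
    do {
        if (t->pid == pid)
            return t;
        t = t->next_task;
    } while (t != root);
    return NULL;
}
```

Because the process identifier is just a number, finding a process by pid in this sketch means visiting the list entries one by one.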
- Times and Timers
- The kernel keeps track of a process's creation time as well as the
CPU time that it consumes during its lifetime. Each clock tick, the
kernel updates the amount of time in jiffies that the current process
has spent in system and in user mode.
Linux also supports process-specific interval timers: processes can use system
calls to set up timers to send signals to themselves when the timers expire.
These timers can be single-shot or periodic timers.
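As an illustration of a process-specific interval timer, the sketch below uses the POSIX setitimer call with ITIMER_REAL, which delivers SIGALRM to the process each time the timer expires. The function name wait_for_expirations and the counter are hypothetical, and interval_usec is assumed to be less than one second.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t expirations; /* incremented by the handler */

static void on_alarm(int signo)
{
    (void)signo;
    expirations++;
}

/* Arm a periodic interval timer and sleep until it has fired at least
 * count times. Returns 0 on success, -1 on error.
 * interval_usec must be in the range 1..999999. */
int wait_for_expirations(int count, long interval_usec)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    /* it_value is the first expiry. A non-zero it_interval makes the
     * timer periodic; a zero it_interval would make it single-shot. */
    tv.it_value.tv_sec = 0;
    tv.it_value.tv_usec = interval_usec;
    tv.it_interval = tv.it_value;
    expirations = 0;
    if (setitimer(ITIMER_REAL, &tv, NULL) == -1)
        return -1;

    /* pause() returns whenever a signal handler has run. The timer is
     * periodic, so a signal missed between the check and pause() only
     * delays the loop by one interval. */
    while (expirations < count)
        pause();

    memset(&tv, 0, sizeof tv);   /* a zero it_value disarms the timer */
    return setitimer(ITIMER_REAL, &tv, NULL);
}
```

Calling wait_for_expirations(3, 10000), for example, would block for roughly 30 milliseconds while the timer fires three times.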
- File system
- Processes can open and close files as they wish and the
process's task_struct
contains pointers to descriptors for each open file as well as pointers to two
VFS inodes.
The first is to the root directory of the process and the second is to its
current or pwd directory. The name pwd is derived from the Unix command pwd,
print working directory.
These two VFS inodes have their count fields incremented to show that
one or more processes are referencing them. This is why you cannot delete
the directory that a process has set as its pwd directory or, for that
matter, one of its sub-directories.
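The effect of the count fields can be sketched with a simple reference count. Everything here is a simplified assumption made for illustration: the structure names, the set_pwd helper and the rule that a directory may be deleted only when its count is zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced VFS inode: only the usage count is modelled. */
struct vfs_inode {
    int count;   /* number of references held by processes */
};

/* The two directory references that a process keeps. */
struct fs_info {
    struct vfs_inode *root;
    struct vfs_inode *pwd;
};

/* Change the working directory: take a reference on the new
 * directory first, then drop the reference on the old one. */
void set_pwd(struct fs_info *fs, struct vfs_inode *dir)
{
    dir->count++;
    if (fs->pwd != NULL)
        fs->pwd->count--;
    fs->pwd = dir;
}

/* In this sketch a directory may be deleted only when unreferenced. */
bool may_delete(const struct vfs_inode *dir)
{
    return dir->count == 0;
}
```

Taking the new reference before dropping the old one keeps the count correct even when a process changes to the directory it is already in.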
- Virtual memory
- Most processes have some virtual memory (kernel threads and
daemons do not) and the Linux kernel must track how that virtual memory is
mapped onto the system's physical memory.
- Processor Specific Context
- A process could be thought of as the sum total of the
system's current state.
Whenever a process is running it is using the processor's registers, stacks and
so on.
This is the process's context and, when a process is suspended, all of that
CPU-specific context must be saved in the task_struct for the process.
When a process is restarted by the scheduler its context is restored from here.
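Saving and restoring context can be pictured as copying the processor's registers into and out of the task_struct. The sketch below is a user-space model only: the cpu variable stands in for the real hardware registers, and NR_REGS, cpu_context and switch_to are hypothetical names. A real kernel does this in processor-specific code.

```c
#define NR_REGS 8   /* hypothetical number of general registers */

/* The processor state that belongs to one process. */
struct cpu_context {
    unsigned long pc;             /* program counter */
    unsigned long sp;             /* stack pointer   */
    unsigned long regs[NR_REGS];  /* general registers */
};

/* Hypothetical reduced task_struct holding the saved context. */
struct task_struct {
    int pid;
    struct cpu_context context;
};

static struct cpu_context cpu;    /* stands in for the real CPU */

/* Suspend prev and resume next: save the live registers into prev's
 * task_struct, then load the registers that next saved earlier. */
void switch_to(struct task_struct *prev, struct task_struct *next)
{
    prev->context = cpu;
    cpu = next->context;
}
```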
David A. Rusling
david.rusling@reo.mts.dec.com