A quick look at processes on Linux

Emanuele Altieri (ealtieri@cs.smith.edu)
Prof. Nicholas Howe (nhowe@cs.smith.edu)
Smith College, June 2002

Processes on Linux
Sending signals to processes
Process creation

Processes on Linux

A process, also known as a task under Linux, is a running instance of a program. This means that if 10 users on a server are all using the same program, such as emacs, there are 10 emacs processes running on the server, although they all share the same executable code.

The processes on a UNIX system can be viewed using the ps command:

[ealtieri@italia labos]$ ps -e
  PID TTY          TIME CMD
    1 ?        00:00:04 init
    2 ?        00:00:00 keventd
    3 ?        00:00:00 ksoftirqd_CPU0
    4 ?        00:00:00 kswapd
    5 ?        00:00:00 bdflush
    6 ?        00:00:00 kupdated
    8 ?        00:00:00 khubd
    9 ?        00:00:00 kjournald
   14 ?        00:00:00 devfsd
  128 ?        00:00:00 kjournald
  ...          ...      ...
 2322 pts/1    00:00:05 emacs
 2421 pts/2    00:00:07 emacs
 2922 ?        00:00:00 aterm
 2923 pts/3    00:00:00 bash
 2979 pts/3    00:00:00 emacs
 2981 pts/3    00:00:00 ps

The -e option tells the command to show all of the processes on the system. Without this option, the command shows only the processes on the current terminal.

In the example above, the CMD column identifies the name of the running process, such as "emacs". The first column indicates the process identifier (PID) assigned to the process by the operating system. The second column shows the terminal associated with a process, or "?" if the process is not associated with any terminal (e.g. it's a terminal itself!). Finally, the third column shows the CPU time of the process.

We can see from the example above that there are three emacs processes running at the same time. Each of them corresponds to an emacs window on the screen. These processes have PIDs 2322, 2421 and 2979. Also notice the ps command at the end of the list. This is because the command itself is of course a process too.

The process ID (PID) is a unique identifier for a process. The operating system uses a 32-bit counter last_pid to keep track of the last PID assigned to a process. When a process is created, the counter is increased and its value becomes the PID of the new process. Because the counter may wrap around at some point, the kernel needs to check if the value of last_pid++ already belongs to a task, before it can assign it to a new process.

More information about the process list can be displayed using the -l option of the ps command.

[ealtieri@italia os]$ ps -l 
  F S   UID   PID  PPID  C PRI  NI ADDR    SZ  WCHAN TTY          TIME CMD
000 S   500  1518  1517  0  75   0    -   645 11cf00 pts/1    00:00:00 bash
000 S   500  1549  1518  0  71   0    -  2557 147d0d pts/1    00:00:02 emacs
000 R   500  1874  1518  0  79   0    -   660      - pts/1    00:00:00 ps

Recall that without the -e option, ps only shows the processes on the current terminal, in this case pts/1.

The first column (F) of the output above identifies the process flags (see manual page if you are interested). The "S" column indicates the state of a process. Possible state codes are the following:

D   uninterruptible sleep         (TASK_UNINTERRUPTIBLE)
R   runnable (on run queue)       (TASK_RUNNING)
S   sleeping                      (TASK_INTERRUPTIBLE)
T   traced or stopped             (TASK_STOPPED)
Z   a defunct ("zombie") process  (TASK_ZOMBIE)

You will notice that most of the processes on the system are sleeping, that is, waiting for some kind of event, such as a mouse click or a key press. In the example above, the only running command is ps.

The output also shows the user owning the process (UID), the process identifier (PID), and the parent PID (PPID). The PPID identifies the process from which a given process originated. For example, you can see above that both emacs and ps have originated from the same bash shell (PID=1518), because their PPIDs are equal to the PID of bash. Conversely, a process that originates from another process is called a child process.

Thanks to the PPID field, the process list can also be viewed as a tree, at the top of which lies the father of all of the processes: the init process (PID=1). This tree can be viewed with the pstree command.

[ealtieri@italia os]$ pstree 
init-+-atd
     |-bonobo-moniker-
     |-crond
     ...
     |-evolution-execu
     |-evolution-mail---evolution-mail---5*[evolution-mail]
     |-gconfd-1
     |-gdict
     |-gdm---gdm-+-X
     |           `-gnome-session---ssh-agent
     |-gmc
     |-gnome-name-serv
     |-gnome-smproxy
     |-gpm
     |-gweather
     |-kbdd-+-aterm---bash-+-emacs
     |      |              `-pstree
     |      |-evolution
     |      `-opera---opera---opera
     |-keventd
     ...

The bash, emacs and pstree entries under aterm correspond to the three processes from the previous ps output, with pstree taking the place of ps.

Processes whose PPID equals 1 (init) and that do not have a controlling terminal are called daemons. These processes run in the background and are not normally visible to the user.

Another useful command is top. This command provides an ongoing look at the processor activity in real time. It displays a listing of the most CPU-intensive tasks on the system, and can provide an interactive interface for manipulating processes. The GUI counterpart of this command, called gnome-system-monitor, is shown below.

[Screenshot: gtop]

Q1. Give some examples of daemons running on your computer

Q2. Draw the process relationship tree for the following processes. Next to each node of the tree, write the PID of the corresponding process.

  F S   UID   PID  PPID  C PRI  NI ADDR    SZ  WCHAN TTY          TIME CMD
000 S   500  1492  1491  0  75   0    -   641 11cf00 pts/0    00:00:00 bash
000 S   500  2281  1492  0  69   0    -  2790 147d0d pts/0    00:00:00 emacs
000 S   500  2284  1492  0  69   0    -   767 147d0d pts/0    00:00:00 aterm
000 S   500  2336  1492  0  69   0    -  4282 148496 pts/0    00:00:00 xmms
040 S   500  2337  2336  0  69   0    -  4282 148496 pts/0    00:00:00 xmms
040 S   500  2338  2337  0  69   0    -  4282 147d0d pts/0    00:00:00 xmms
040 S   500  2342  2337  0  69   0    -  4282 121f03 pts/0    00:00:00 xmms
000 S   500  2352  1492  0  69   0    -   554 11cf00 pts/0    00:00:00 opera
000 S   500  2353  2352  1  70   0    -  4597 147d0d pts/0    00:00:01 opera
040 S   500  2354  2353  0  68   0    -  4597 148496 pts/0    00:00:00 opera
000 R   500  2359  1492  0  79   0    -   660      - pts/0    00:00:00 ps

Sending signals to processes

Signals are sometimes referred to as software interrupts. Like hardware interrupts, signals are interruptions in the execution of a program, often at unpredictable moments, that notify the program of an event. Examples of events could be a segmentation fault caused by the program itself (SIGSEGV), or a CTRL-C key combination sent by the user (SIGINT). A complete list of signals can be viewed by typing "man 7 signal". Signals are defined in include/asm/signal.h (line 31).

Signals are sent to a program using the kill command. For example:

[ealtieri@italia os]$ sleep 20 &
[1] 2611
[ealtieri@italia os]$ ps
  PID TTY          TIME CMD
 2535 pts/4    00:00:00 bash
 2611 pts/4    00:00:00 sleep
 2612 pts/4    00:00:00 ps
[ealtieri@italia os]$ kill -s SIGINT 2611
[ealtieri@italia os]$ 
[1]+  Interrupt               sleep 20
[ealtieri@italia os]$

The sleep command creates a process that sleeps for 20 seconds and then terminates (the "&" symbol tells bash to run the command in the background). We can see this process in the output of the ps command that follows. We then send a CTRL-C signal (SIGINT) to the sleep process using kill and the PID of sleep. The signal terminates the program. This is confirmed by the message that appears on the terminal if you press enter once after the kill command has been issued.

One of the most useful applications of signals is to request the termination of a program. There are two termination signals that can be sent to a program: SIGTERM and SIGKILL.

SIGTERM terminates a process nicely. The process has a chance to catch this signal and do some cleanup work before terminating. However, a process can also choose to ignore this signal.
SIGKILL terminates a process immediately. The process does not have a chance to catch this signal. SIGKILL should only be used to kill a process that crashed and does not respond to the SIGTERM signal.

The following example shows how to kill emacs nicely:

[ealtieri@italia os]$ emacs &
[1] 2695
[ealtieri@italia os]$ ps
  PID TTY          TIME CMD
 2535 pts/4    00:00:00 bash
 2695 pts/4    00:00:00 emacs
 2696 pts/4    00:00:00 ps
[ealtieri@italia os]$ kill -s SIGTERM 2695
[ealtieri@italia os]$ 
[1]+  Terminated              emacs
[ealtieri@italia os]$

Without the -s <sig> option, kill sends a SIGTERM signal by default, so the kill command above could be rewritten simply as:

kill 2695

From the point of view of a process, a signal can be either ignored, sent to a default handler, or caught by a function handler provided by the process. To make your program catch a signal you simply need to add the following code to the beginning of main():

#include <signal.h>

...

int main(void)
{
	signal(SIGINT, &on_sigint);   /* catch CTRL^C */
	...

where on_sigint is the function in charge of handling the signal, and could be implemented as shown below:

/* CTRL^C handler */
void on_sigint(int signo)
{
	printf("CTRL^C pressed!\n");
	
	/* do cleanup work... */
	
	exit(0);
}

sigint.c is a minimal example of how to catch the SIGINT (CTRL^C) signal.

Q3. Modify sigint.c so that it also catches the SIGTERM signal. Make the program display different messages for the SIGINT and SIGTERM signals.

Q4. The counterpart of the kill bash command in C is the kill(pid_t pid, int sig) function (see the kill(2) manual page). Use this function to create a program that sends a SIGINT signal to the PID specified on the command line.

Q5. Some signals cannot be ignored or caught by processes. One example is the SIGKILL signal, described earlier. Read the signal(7) manual page and find out which other signal cannot be ignored or caught by a process. Also find out the default action for the SIGINT signal.

Process Creation

Process creation under Linux is made possible by a few system calls. The most important is the fork() function.

#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);

fork() works by splitting the caller process into a parent and child process. Both the parent and the child processes share the same executable code, but have different data and stack segments. An example of how to use this system call is shown below:

#include <stdio.h>      /* printf() */
#include <stdlib.h>     /* exit() */
#include <sys/types.h>
#include <unistd.h>

int main(void) {
	pid_t pid;

	pid = fork();  /* the process is split here !! */

	/*
	 * Now we have two processes (child and parent) running on the
	 * same executable code. The way we tell who we are is by
	 * looking at the return value of fork() - 0 for the child
	 * process and > 0 for the parent. (A negative value would
	 * mean that fork() failed; this example does not check for it.)
	 */

	if (pid == 0) {
		/* CHILD PROCESS */
		printf("Child process!\n");
		exit(0);
	}

	/* PARENT PROCESS */
	printf("Parent process!\n");
	exit(0);
}

A slightly modified version of the program above prints the process list before and after the fork() function is called. This is the resulting output:

[ealtieri@italia process]$ ./a.out 
Processes before forking:
UID        PID  PPID  C STIME TTY          TIME CMD
ealtieri  2242  2241  0 11:23 pts/2    00:00:00 bash
ealtieri  2980  2242  0 15:35 pts/2    00:00:00 ./a.out
ealtieri  2981  2980  0 15:35 pts/2    00:00:00 ps -f
Processes after forking:
UID        PID  PPID  C STIME TTY          TIME CMD
ealtieri  2242  2241  0 11:23 pts/2    00:00:00 bash
ealtieri  2980  2242  0 15:35 pts/2    00:00:00 ./a.out     // parent
ealtieri  2982  2980  0 15:35 pts/2    00:00:00 ./a.out     // child
ealtieri  2983  2980  0 15:35 pts/2    00:00:00 ps -f

As you can see, after fork() the original process is split into two processes, the child and the parent. By looking at the PID and PPID of the two a.out processes we can tell that the second a.out is the child, while the first is the parent.

fork.c uses fork() to create a child process. The program prints the list of processes before and after the forking.

Q6. Consider the following fragment of code taken from fork1.c:

int main()
{
	pid_t pid;
	int look_this;

	look_this = 99;

	if ((pid = fork()) < 0) {
		perror("fork()");
		exit(1);
	}
	else if (pid == 0) {
		/* CHILD PROCESS */
		look_this = 33;
		exit(0);
	}

	/* PARENT PROCESS */

	wait(NULL);  /* wait for child to terminate */
	printf("The value of look_this is %d\n", look_this);

	exit(0);
}

The wait(2) function puts the parent process to sleep until the child process terminates. According to the definition of fork(), which value of look_this will be displayed on the screen? Explain your answer.

So far we have talked about how to fork a new process from an existing one, but what if we want to execute a different program? For this purpose Linux provides the exec() family of functions (see the exec(3) manual page). Unlike fork(), the exec() functions do not create a process. Instead, they replace the image of the calling process with the program specified as the first argument. This means that the calling program stops running and the program specified in exec() takes its place inside the same process.

For example:

#include <stdio.h>      /* perror() */
#include <stdlib.h>     /* exit() */
#include <unistd.h>

int main(void)
{
	/*    FILE TO EXECUTE          ARG0     END OF ARGUMENTS */
	execl("/usr/local/bin/emacs", "emacs", (char *)NULL);

	/* should never get here, unless execl() returned with an error */
	perror("execl()");
	exit(1);
}

The program above tries to execute emacs. If the execl() function succeeds, the process will be replaced by emacs and we will never reach the following lines. However, execl() may fail, for example because it cannot find the specified program. In that case it returns an error and the current process continues to run, so we always need to provide some error handling after the call.

A slightly modified version of the program above prints the process list just before calling execl(). We can then manually check the list again with the ps command after the exec() function has been called.

[ealtieri@italia process]$ ./a.out &
[2] 3106
Processes before the exec() call:
UID        PID  PPID  C STIME TTY          TIME CMD
ealtieri  2242  2241  0 11:23 pts/2    00:00:00 bash
ealtieri  3106  2242  0 16:14 pts/2    00:00:00 ./a.out
ealtieri  3107  3106  0 16:14 pts/2    00:00:00 ps -f
[ealtieri@italia process]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
ealtieri  2242  2241  0 11:23 pts/2    00:00:00 bash
ealtieri  3106  2242  0 16:14 pts/2    00:00:00 emacs
ealtieri  3108  2242  0 16:15 pts/2    00:00:00 ps -f

After the call to execl(), the original process disappeared and in its place we have emacs. Notice how emacs maintains the PID and PPID of the original process.

exec.c shows the use of the execl() function.

Q7. The execution of a program involves two steps. First, a new process has to be forked by a parent task, such as a shell. Second, the child process executes the desired program using one of the exec(3) functions. This way, the child is replaced with the image of the new program, while the parent is untouched.

Implement a small shell that repeatedly asks for the path of a program and then executes that program using fork() and execlp(). The shell must terminate when the user types "exit". The shell shouldn't prompt for a new program to execute until the previous one has terminated. You can use the wait(2) function to make the parent process wait for the child to terminate.

Resources

Advanced Programming in the UNIX Environment, by W. Richard Stevens. Addison-Wesley
