
Saturday, February 26, 2011

Signals In Kernel Programming

Signals (kernel implementation and user programming)

What are they?

Signals are an asynchronous IPC mechanism supported in UNIX and its variants. "Signal" is a general term: a process can signal another process, the keyboard can generate an interrupt that signals the kernel that data is available, the kernel can signal a process about an execution violation (eg: NULL pointer dereference, divide by zero etc), and a shell can use signals to control the execution of its commands.

To get the currently supported signals in the system, execute “kill -l”

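Linux supports 31 standard (non-realtime) signals, numbered 1 to 31, and a range of realtime signals above them (numbered 32 to 64 in the kernel, a few of which the C library reserves for its own use).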

When a signal is delivered, the process can choose to ignore it, handle it with a suitable handler, or allow the kernel to handle it. Every signal has a default action defined, and the kernel performs that action when it handles the signal.

Signals are sent from a terminal to a process, from one process to another, or from the kernel to a process. The signals sent from the kernel to a process are typically caused by hardware traps which occurred while executing some instruction of the process. Eg: SIGILL is sent when an illegal instruction is executed. SIGABRT, in contrast, is raised when the process calls abort().

Eg:
Signal Name   Default Action   Comment
----------------------------------------------------------------
SIGHUP        Abort            Hangup terminal or process
SIGINT        Abort            Keyboard interrupt (usually Ctrl-C)
SIGKILL       Abort            Forced process termination
SIGUSR1       Abort            Process specific
SIGSEGV       Dump             Invalid memory reference

Understanding some signals:

SIGHUP: Received when the process which started the executing process has terminated, for example when the shell which launched it exits or its controlling terminal is closed. At that time SIGHUP is sent to all the processes which were started by it. We can bypass this by running the command under nohup, which makes the process ignore SIGHUP so that it is not killed even if the terminal exits.

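SIGINT: Generated when ^C is pressed in the terminal; the default action is to kill that process.

SIGQUIT: Generated with ^\, where the controlling terminal tells the process to quit; the default action terminates the process and dumps core. Note that nohup on current Linux systems only ignores SIGHUP, so it cannot be relied on to ignore this signal.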

SIGILL: The executed machine code (aka opcode) was not recognised by the processor, which traps into the kernel.

SIGTRAP: Used by debugging utilities to get control back during program execution (eg: at a breakpoint).

SIGBUS: A bus error such as an address alignment issue happened. Eg: an improper address is given on the bus.

SIGKILL: This signal cannot be caught, blocked or ignored, so unless the system is unstable it will terminate the receiver.

SIGSEGV: Raised when a page fault raised by the processor couldn't be serviced by the kernel, because an invalid region of memory is being accessed.

SIGUSR1, SIGUSR2: Signals left for user processes to define their own actions.

SIGALRM: Sent to the process when a timer it set (eg: with alarm()) fires.

SIGCHLD: Sent to the parent when a forked child process terminates (eg: by calling exit) or stops.

SIGCONT: Tells a stopped process to continue; used by debuggers and by shell job control.

SIGTSTP: Generated with ^Z in the terminal; the default action is to stop (suspend) the process.

SIGURG: Urgent data is now available on a socket.


Where is it implemented?
------------------------------
Every task has signal support in its task_struct data structure.

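struct task_struct {
        ...
        struct signal_struct *signal;    /* signal state shared by the thread group */
        struct sighand_struct *sighand;  /* table of signal handlers */
        sigset_t blocked, real_blocked;  /* masks of blocked signals */
        sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
        struct sigpending pending;       /* signals raised but not yet delivered */
        ...
};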
• The first field (signal) points to the signal state shared by all the tasks of a thread group, such as signals pending for the whole group. This matters because some fatal signals are sent to an entire group: one task in that group processes the signal, stopping the others. Separately, the kernel checks permissions before a signal is sent: in general a task can signal another task with the same uid, and only a task with super-user privilege can send a signal to any other task.
• The second field (sighand) defines the handlers. We can have up to 64 handlers, one for each signal.
• The next field (blocked) is the mask of currently blocked signals, with one bit per signal, covering both the standard signals and the real-time signals. real_blocked is a temporary copy of that mask used internally by the kernel.
• Some system calls replace the signal mask temporarily. The original mask is kept in the next field (saved_sigmask) so that it can be restored afterwards.
• Any signals pending for this task are accounted for in the last field (pending).

Operation states of signals:
--------------------------------
1. Signal Receiving/Handling
Every time the kernel returns to user space (from an interrupt/exception or system call), it first checks whether there are any non-blocked pending signals for this task. If so, it calls do_signal(). Inside this function we check whether we have any signal to be processed by repeatedly calling dequeue_signal(). Signals whose action is the default are handled by the kernel right there; any signal which has a user handler installed will be caught now.
handle_signal() is invoked for each dequeued signal which has a handler. We have a complication here. We are executing in kernel space, but signal handlers live in user space. Signal handlers themselves can make system calls, which bring us back to kernel space. How should we handle this? We set up a frame for the signal on the user stack, using put_user() to write it to user space. On an ia32 architecture, we make SP point to this frame, IP point to the handler, and AX contain the signal number. Then we load the user code and data segments into CS and DS, which means that execution resumes in user space, inside the handler.
If the task state is TASK_INTERRUPTIBLE, its state is changed to TASK_RUNNING and it is put back on the run-queue. If it was executing a slow system call like read() or write(), the call is interrupted: if the handler was registered with the SA_RESTART flag, the system call is restarted after the signal is handled, otherwise it fails with EINTR.

2. Signal Sending
A signal is sent by calling one of the functions like send_sig_info() or kill_proc_info() inside the kernel. The former is used in case of exceptions, and the latter for terminal events etc. These functions take an info argument, which differentiates whether the kernel sent this signal or user space sent it.

3. Signal Pending
Because the task might be blocking the signal, or might not be able to handle it yet (eg: it is in TASK_UNINTERRUPTIBLE state), the sent signal is not consumed immediately. It is saved in the pending field discussed above. The catch is that if you send multiple standard signals of the same type while one is already pending, the signal handler will be executed only once (real-time signals, in contrast, are queued). The reasoning is that the action performed on the task for a given signal type has the same effect however many times it was requested. Eg: SIGKILL kills the task, so if 100 SIGKILLs were delivered, the task is gone after the very first one is processed. So what would be the advantage of storing all 100 signals of the same type?

============================================================

User Space programming :

Signals are used in user space with the calls like signal(), sigaction(), sigaddset(), sigemptyset(), sigdelset(), kill() etc.

Let's see a simple example of signals:

Let's see another example on how to block some signals:

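To register/receive a signal (SIGNO stands for the signal number):
/* needs <signal.h>, <stdio.h>, <stdlib.h> and <errno.h> */
struct sigaction mysig_act;
sigemptyset(&mysig_act.sa_mask);        /* block no extra signals in the handler */
mysig_act.sa_flags = SA_SIGINFO;        /* use the three-argument handler */
mysig_act.sa_sigaction = mysig_handler; /* void mysig_handler(int, siginfo_t *, void *) */
if (sigaction(SIGNO, &mysig_act, NULL)) {
        printf("Sigaction returned error = %d\n", errno);
        exit(EXIT_FAILURE);
}
The sigaction structure is defined as:
struct sigaction {
        void (*sa_handler)(int); /* func pointer */
        void (*sa_sigaction)(int, siginfo_t *, void *); /* func pointer */
        sigset_t sa_mask;
        int sa_flags;
        void (*sa_restorer)(void);
};
Or we can use signal(SIGNO, handler);
Sending signals:
int kill(pid_t process_id, int signal_number);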

Saturday, February 5, 2011

Posix Threads Programming

Designing Threaded Programs

Parallel Programming:
  • On modern, multi-cpu machines, pthreads are ideally suited for parallel programming, and whatever applies to parallel programming in general, applies to parallel pthreads programs.

  • There are many considerations for designing parallel programs, such as:
    • What type of parallel programming model to use?
    • Problem partitioning
    • Load balancing
    • Communications
    • Data dependencies
    • Synchronization and race conditions
    • Memory issues
    • I/O issues
    • Program complexity
    • Programmer effort/costs/time
    • ...

  • Programs having the following characteristics may be well suited for pthreads:
    • Work that can be executed, or data that can be operated on, by multiple tasks simultaneously
    • Block for potentially long I/O waits
    • Use many CPU cycles in some places but not others
    • Must respond to asynchronous events
    • Some work is more important than other work (priority interrupts)

  • Pthreads can also be used for serial applications, to emulate parallel execution. A perfect example is the typical web browser, which for most people, runs on a single cpu desktop/laptop machine. Many things can "appear" to be happening at the same time.

  • Several common models for threaded programs exist:

    • Manager/worker: a single thread, the manager assigns work to other threads, the workers. Typically, the manager handles all input and parcels out work to the other tasks. At least two forms of the manager/worker model are common: static worker pool and dynamic worker pool.

    • Pipeline: a task is broken into a series of suboperations, each of which is handled in series, but concurrently, by a different thread. An automobile assembly line best describes this model.

    • Peer: similar to the manager/worker model, but after the main thread creates other threads, it participates in the work.
Shared Memory Model:
  • All threads have access to the same global, shared memory

  • Threads also have their own private data

  • Programmers are responsible for synchronizing access (protecting) globally shared data.
Thread-safeness:
  • Thread-safeness: in a nutshell, refers to an application's ability to execute multiple threads simultaneously without "clobbering" shared data or creating "race" conditions.

  • For example, suppose that your application creates several threads, each of which makes a call to the same library routine:
    • This library routine accesses/modifies a global structure or location in memory.
    • As each thread calls this routine it is possible that they may try to modify this global structure/memory location at the same time.
    • If the routine does not employ some sort of synchronization constructs to prevent data corruption, then it is not thread-safe.

  • The implication to users of external library routines is that if you aren't 100% certain the routine is thread-safe, then you take your chances with problems that could arise.

  • Recommendation: Be careful if your application uses libraries or other objects that don't explicitly guarantee thread-safeness. When in doubt, assume that they are not thread-safe until proven otherwise. This can be done by "serializing" the calls to the uncertain routine, etc.


What are Pthreads?

  • Historically, hardware vendors have implemented their own proprietary versions of threads. These implementations differed substantially from each other making it difficult for programmers to develop portable threaded applications.

  • In order to take full advantage of the capabilities provided by threads, a standardized programming interface was required. For UNIX systems, this interface has been specified by the IEEE POSIX 1003.1c standard (1995). Implementations which adhere to this standard are referred to as POSIX threads, or Pthreads. Most hardware vendors now offer Pthreads in addition to their proprietary APIs.

  • Pthreads are defined as a set of C language programming types and procedure calls, implemented with a pthread.h header/include file and a thread library - though this library may be part of another library, such as libc.

  • There are several drafts of the POSIX threads standard. It is important to be aware of the draft number of a given implementation, because there are differences between drafts that can cause problems.

Why Pthreads?

  • The primary motivation for using Pthreads is to realize potential program performance gains.

  • When compared to the cost of creating and managing a process, a thread can be created with much less operating system overhead. Managing threads requires fewer system resources than managing processes.

    For example, the following table compares timing results for the fork() subroutine and the pthread_create() subroutine. Timings reflect 50,000 process/thread creations, were performed with the time utility, and units are in seconds, no optimization flags.

    Note: don't expect the system and user times to add up to real time, because these are SMP systems with multiple CPUs working on the problem at the same time.

    • All threads within a process share the same address space. Inter-thread communication is more efficient and in many cases, easier to use than inter-process communication.

    • Threaded applications offer potential performance gains and practical advantages over non-threaded applications in several other ways:
      • Overlapping CPU work with I/O: For example, a program may have sections where it is performing a long I/O operation. While one thread is waiting for an I/O system call to complete, CPU intensive work can be performed by other threads.
      • Priority/real-time scheduling: tasks which are more important can be scheduled to supersede or interrupt lower priority tasks.
      • Asynchronous event handling: tasks which service events of indeterminate frequency and duration can be interleaved. For example, a web server can both transfer data from previous requests and manage the arrival of new requests.

    • The primary motivation for considering the use of Pthreads on an SMP architecture is to achieve optimum performance. In particular, if an application is using MPI for on-node communications, there is a potential that performance could be greatly improved by using Pthreads for on-node data transfer instead.

    • For example:
      • MPI libraries usually implement on-node task communication via shared memory, which involves at least one memory copy operation (process to process).
      • For Pthreads there is no intermediate memory copy required because threads share the same address space within a single process. There is no data transfer, per se. It becomes more of a cache-to-CPU or memory-to-CPU bandwidth (worst case) situation. These speeds are much higher.


What is a Thread?

  • Technically, a thread is defined as an independent stream of instructions that can be scheduled to run as such by the operating system. But what does this mean?

  • To the software developer, the concept of a "procedure" that runs independently from its main program may best describe a thread.

  • To go one step further, imagine a main program (a.out) that contains a number of procedures. Then imagine all of these procedures being able to be scheduled to run simultaneously and/or independently by the operating system. That would describe a "multi-threaded" program.

  • How is this accomplished?
Before understanding a thread, one first needs to understand a UNIX process. A process is created by the operating system, and requires a fair amount of "overhead". Processes contain information about program resources and program execution state, including:
  • Process ID, process group ID, user ID, and group ID
  • Environment
  • Working directory.
  • Program instructions
  • Registers
  • Stack
  • Heap
  • File descriptors
  • Signal actions
  • Shared libraries
  • Inter-process communication tools (such as message queues, pipes, semaphores, or shared memory).

  • Threads use and exist within these process resources, yet are able to be scheduled by the operating system and run as independent entities largely because they duplicate only the bare essential resources that enable them to exist as executable code.

  • This independent flow of control is accomplished because a thread maintains its own:
    • Stack pointer
    • Registers
    • Scheduling properties (such as policy or priority)
    • Set of pending and blocked signals
    • Thread specific data.

  • So, in summary, in the UNIX environment a thread:
    • Exists within a process and uses the process resources
    • Has its own independent flow of control as long as its parent process exists and the OS supports it
    • Duplicates only the essential resources it needs to be independently schedulable
    • May share the process resources with other threads that act equally independently (and dependently)
    • Dies if the parent process dies - or something similar
    • Is "lightweight" because most of the overhead has already been accomplished through the creation of its process.

  • Because threads within the same process share resources:
    • Changes made by one thread to shared system resources (such as closing a file) will be seen by all other threads.
    • Two pointers having the same value point to the same data.
    • Reading and writing to the same memory locations is possible, and therefore requires explicit synchronization by the programmer.

 