Skip to main content

Threads

A thread of execution is often regarded as the smallest unit of processing that a scheduler works on.
A process can have multiple threads of execution which are executed asynchronously.
This asynchronous execution brings in the capability of each thread handling a particular work or service independently. Hence multiple threads running in a process handle their services which overall constitutes the complete capability of the process.
Why Threads are Required?
Suppose there is a process, that receiving real time inputs and corresponding to each input it has to produce a certain output. Now, if the process is not multi-threaded, then the whole processing in the process becomes synchronous. This means that the process takes an input, processes it and produces an output.
The limitation in the above design is that the process cannot accept an input until its done processing the earlier one and in case processing an input takes longer than expected then accepting further inputs goes on hold.
To consider the impact of the above limitation, if we map the generic example above with a  socket server process that can accept input connection, process them and provide the socket client with output. Now, if in processing any input if the server process takes more than expected time and in the meantime another input (connection request) comes to the socket server then the server process would not be able to accept the new input connection as its already stuck in processing the old input connection. This may lead to a connection time out at the socket client which is not at all desired.
This shows that synchronous model of execution cannot be applied everywhere and hence was the requirement of asynchronous model of execution felt which is implemented by using threads.
Difference Between threads and processes
·         Processes do not share their address space while threads executing under same process share the address space.
·         From the above point its clear that processes execute independent of each other and the synchronization between processes is taken care by kernel only while on the other hand the thread synchronization has to be taken care by the process under which the threads are executing.
·         Context switching between threads is fast as compared to context switching between processes.
·         The interaction between two processes is achieved only through the standard inter process communication while threads executing under the same process can communicate easily as they share most of the resources like memory, text segment etc
Thread Identification
Just as a process is identified through a process ID, a thread is identified by a thread ID.
·         A process ID is unique across the system where as a thread ID is unique only in context of a single process.
·         A process ID is an integer value but the thread ID is not necessarily an integer value. It could well be a structure
Thread ID is represented by the type ‘pthread_t’. The function ‘pthread_self()’ is used by a thread for printing its own thread ID.
Thread Creation
Normally when a program starts up and becomes a process, it starts with a default thread. So we can say that every process has at least one thread of control.  A process can create extra threads using the following function :
#include <pthread.h>
int pthread_create(pthread_t *restrict tidp, const pthread_attr_t *restrict attr,
                                        void *(*start_rtn)(void), void *restrict arg)
·         The first argument is a pthread_t type address. Once the function is called successfully, the variable whose address is passed as first argument will hold the thread ID of the newly created thread.
·         The second argument may contain certain attributes which we want the new thread to contain.  It could be priority etc.
·         The third argument is a function pointer. This is something to keep in mind that each thread starts with a function and that functions address is passed here as the third argument so that the kernel knows which function to start the thread from.
·         As the function (whose address is passed in the third argument above) may accept some arguments also so we can pass these arguments in form of a pointer to a void type. Now, why a void type was chosen? This was because if a function accepts more than one argument then this pointer could be a pointer to a structure that may contain these arguments.
Example- C program using threads
#include<stdio.h>
#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<unistd.h>
pthread_t tid[2];
void* doSomeThing(void *arg)
{
    unsigned long i = 0;
    pthread_t id = pthread_self();
    if(pthread_equal(id,tid[0]))
    {
        printf("\n First thread processing\n");
    }
    else
    {
        printf("\n Second thread processing\n");
    }
    for(i=0; i<(0xFFFFFFFF);i++);
    return NULL;
}
int main(void)
{
    int i = 0;
    int err;
    while(i < 2)
    {
        err = pthread_create(&(tid[i]), NULL, &doSomeThing, NULL);
        if (err != 0)
            printf("\ncan't create thread :[%s]", strerror(err));
        else
            printf("\n Thread created successfully\n");
        i++;
    }
    sleep(5);
    return 0;
}
So what this code does is :
·         It uses the pthread_create() function to create two threads
·         The starting function for both the threads is kept same.
·         Inside the function ‘doSomeThing()’, the thread uses pthread_self() and pthread_equal() functions to identify whether the executing thread is the first one or the second one as created.
·         Also, Inside the same function ‘doSomeThing()’ a for loop is run so as to simulate some time consuming work.
Now, when the above code is run, following was the output :
$ ./threads
Thread created successfully
First thread processing
Thread created successfully
Second thread processing
As seen in the output, first thread is created and it starts processing, then the second thread is created and then it starts processing. Well one point to be noted here is that the order of execution of threads is not always fixed. It depends on the OS scheduling algorithm.


Comments

Popular posts from this blog

Server/Client Communication-python

The basic mechanisms of client-server setup are: A client app send a request to a server app.  The server app returns a reply.  Some of the basic data communications between client and server are: File transfer - sends name and gets a file.  Web page - sends url and gets a page.  Echo - sends a message and gets it back.  Client server communication uses socket.              To connect to another machine, we need a socket connection. What's a connection?  A relationship between two machines, where two pieces of software know about each other. Those two pieces of software know how to communicate with each other. In other words, they know how to send bits to each other. A socket connection means the two machines have information about each other, including network location (IP address) and TCP port. (If we can use anology, IP address is the phone number and the TCP port is the extension).  A so...

Banker's Algorithm

Banker's algorithm is a deadlock avoidance algorithm. It is named so because this algorithm is used in banking systems to determine whether a loan can be granted or not. Consider there are n account holders in a bank and the sum of the money in all of their accounts is S. Everytime a loan has to be granted by the bank, it subtracts the loan amount from the total money the bank has. Then it checks if that difference is greater than S. It is done because, only then, the bank would have enough money even if all the n account holders draw all their money at once. Banker's algorithm works in a similar way in computers. Whenever a new process is created, it must exactly specify the maximum instances of each resource type that it needs. Let us assume that there are n processes and m resource types. Some data structures are used to implement the banker's algorithm. They are: Available: It is an array of length m . It represents the number of available resourc...

Inter Process Communication-Message Queue

Interprocess communication (IPC) is a set of programming interfaces that allow a programmer to coordinate activities among different program processes that can run concurrently in an operating system. This allows a program to handle many user requests at the same time. Since even a single user request may result in multiple processes running in the operating system on the user's behalf, the processes need to communicate with each other. The IPC interfaces make this possible. Each IPC method has its own advantages and limitations so it is not unusual for a single program to use all of the IPC methods . Message Based Communication Messages are a very general form of communication. Messages can be used to send and receive formatted data streams between arbitrary processes. Messages may have types. This helps in message interpretation. The type may specify appropriate permissions for processes. Usually at the receiver end, messages are put in a queue. Messages may also be fo...