The Tech Platform

Multitasking In Python (Multithreading and Multiprocessing)


Modern operating systems allow a user to run several applications simultaneously, so that some tasks can be run in the background while the user continues with other work in the foreground. The user can also run multiple copies of the same program at the same time.

A task can run as a process or as a thread. A process has its own private resources, including its memory mapping, open files and other OS objects. Multiple threads can run inside the same process and share all of its resources, but if one thread crashes the process, every other thread in that process is killed with it.

Python has many packages for multitasking; this post covers some of them.


os.fork

Unix/Linux/OS X only (that is, everything but Windows). The function creates a child process that starts running when fork returns. We call fork once, but it returns twice: once in the parent and once in the child. Consider the following code:

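import os

pid = os.fork()
if pid:
    # fork returned the child's process id: we are in the parent
    print('hello parent')
else:
    # fork returned 0: we are in the child
    print('hello child')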

If you run it, you will see both messages (the order may vary):

hello parent
hello child

On the fork call, the OS creates a child process and fork returns twice: in the parent the return value is the child's process id, and in the child it is 0. The point of this odd behaviour is the fork-join parallel pattern: your code is running and you want to use another processor for a complex task, so you fork a child, divide the work, and join again:

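import os

# we need to do something complex, so let's create a child
pid = os.fork()
if pid:
    print('do half work by parent')
    # wait() blocks until the child exits and returns (child pid, wait status)
    child_pid, status = os.wait()
else:
    print('do half work by child')
    # _exit() ends the child immediately, so it never reaches the code below
    os._exit(10)

# WEXITSTATUS extracts the child's exit code (10) from the wait status
print("only parent here res=" + str(os.WEXITSTATUS(status)))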

Only one process exists before the fork call and after the if statement. The child does its task and finishes, returning a value to the parent through its exit status. If the parent finishes first, it waits for the child.

It is important to call wait; otherwise the finished child remains a zombie until the parent exits.
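An exit status can only carry a small integer (0 to 255), so a forked child usually needs another channel to hand back a real result. The following is a minimal sketch, not part of the original example, that assumes a Unix system and uses os.pipe for that purpose. The function name parallel_sum is made up for illustration.

```python
import os

def parallel_sum(numbers):
    """Sum `numbers`, giving the second half to a forked child (Unix only)."""
    mid = len(numbers) // 2
    read_fd, write_fd = os.pipe()      # one-way channel: child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: compute its half, write the result, and exit immediately.
        os.close(read_fd)
        partial = sum(numbers[mid:])
        os.write(write_fd, str(partial).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: compute the other half while the child runs.
    os.close(write_fd)
    partial = sum(numbers[:mid])
    with os.fdopen(read_fd) as reader:
        # read() returns at end of file, once the child has closed its end
        child_partial = int(reader.read())
    os.waitpid(pid, 0)                 # reap the child so it is not left a zombie
    return partial + child_partial

if hasattr(os, "fork"):
    print(parallel_sum(list(range(1, 101))))
```

The parent closes the write end before reading; otherwise read() would never see end of file.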


os.system

Runs a command in the shell (bash, cmd, etc.).

Sometimes you only need to run another program or command, for example to do something that is easily done with an existing command such as format:

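import os

# Runs the command in a subshell; script2.py must be executable and on the
# shell's search path (use './script2.py' for a file in the current directory).
# On Unix the return value is an encoded wait status, not the plain exit code.
status = os.system('script2.py')
print("exit status", status)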
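On Unix the value returned by os.system is a wait status rather than the command's exit code. As a small sketch, assuming a Unix shell and Python 3.9 or newer (for os.waitstatus_to_exitcode), the exit code can be recovered like this. The command 'exit 3' is only a stand-in for a real program.

```python
import os

# Assumption: a Unix shell, where 'exit 3' ends the subshell with exit code 3.
status = os.system('exit 3')

# Decode the wait status into the exit code (Python 3.9+).
exit_code = os.waitstatus_to_exitcode(status)
print("raw status:", status, "exit code:", exit_code)
```

On Windows, os.system already returns the exit code directly, so no decoding is needed there.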


os.popen

Runs a process and lets you read its output like a file, for example:

import os

# 'ps aux' is a Unix command; each line already ends with a newline
with os.popen('ps aux') as out:
    for line in out:
        print(line, end='')

The subprocess package

An attempt to replace the OS-specific functions with a platform-independent interface. You can run a process and open a pipe to it on any OS, for example:

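import subprocess
import sys

# run a script with the current interpreter and wait for it to finish
subprocess.run([sys.executable, 'script2.py'])
# start a command without blocking, then wait for it ('ls' is Unix-only)
pr = subprocess.Popen('ls')
pr.wait()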

To capture the process's output, use:


import subprocess
import sys

proc = subprocess.Popen([sys.executable, 'myapp.py', 'data'],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
# communicate() waits for the process to end and returns both streams as bytes
(output, error) = proc.communicate()
if error:  # stderr is b'' (never None) when nothing was written to it
    print("error:", error.decode())
print("output:", output.decode())
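For the common case of running a command to completion and collecting its output, subprocess.run can do the same work in one call. The sketch below is illustrative only. The child program is a made-up one-liner passed with -c so that the example does not depend on myapp.py existing.

```python
import subprocess
import sys

# Hypothetical child program: writes to both streams and exits with code 2.
child_code = (
    "import sys; print('to stdout'); "
    "print('to stderr', file=sys.stderr); sys.exit(2)"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,  # collect stdout and stderr instead of inheriting them
    text=True,            # decode both streams to str
)

print("return code:", result.returncode)
print("stdout:", result.stdout.strip())
print("stderr:", result.stderr.strip())
```

Passing check=True as well would raise subprocess.CalledProcessError on a non-zero exit code instead of returning normally.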

The multiprocessing package

Another useful package for creating processes and working with them.

The multiprocessing module is suitable for dividing data or tasks between processor cores. It uses processes rather than threads. Processes are inherently more “expensive” than threads, so they are not worth using for trivial data sets or tasks.


import multiprocessing

def fn(a, b):
    print("vals=", a, b)

# the guard is required where child processes are spawned (Windows, macOS)
if __name__ == '__main__':
    proc1 = multiprocessing.Process(target=fn, args=(10, 20))
    proc2 = multiprocessing.Process(target=fn, args=(10, 20))
    proc1.start()
    proc2.start()
    proc1.join()
    proc2.join()
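When the same function has to be applied to many inputs, a pool of worker processes is often more convenient than creating each Process by hand. The following is a rough sketch of that idea using multiprocessing.Pool. The function sum_of_squares is an invented stand-in for a CPU-bound task.

```python
import multiprocessing

def sum_of_squares(n):
    """CPU-bound stand-in task: 0*0 + 1*1 + ... + (n-1)*(n-1)."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [10, 100, 1000]
    # Pool() starts one worker process per CPU core by default.
    with multiprocessing.Pool() as pool:
        # map() divides the inputs among the workers and keeps the result order.
        results = pool.map(sum_of_squares, inputs)
    print(dict(zip(inputs, results)))
```

For inputs this small, the cost of starting the workers outweighs the computation, which is the trade-off described above; a pool pays off only when each task is substantial.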


Working with Threads

Threads run in the same process address space, so it is easy to share data between them, but if one thread crashes the process, all other threads in that process are killed with it. To create a thread we use the threading package. We write a class derived from Thread and define its run method. To start the thread, we create an object of the class and call its start method:

import threading
import time

class MyThread(threading.Thread):
    def run(self):
        for i in range(5):
            print("In thread", self.name)
            time.sleep(2)

# the name is passed to the constructor (setName is deprecated)
th1 = MyThread(name="t1")
th2 = MyThread(name="t2")
th1.start()
th2.start()
print("From main")
th1.join()
th2.join()

Output (the exact interleaving may vary between runs):

In thread t1
In thread t2
From main
In thread t1
In thread t2
In thread t1
In thread t2
....

You can also write the thread body as a plain function (without a class):


from threading import Thread
import time

def myfunc(*args):
    print("From thread", args)
    time.sleep(5)

# args must be a tuple; a bare string such as 'T1' would be unpacked into 'T' and '1'
tid1 = Thread(target=myfunc, args=('T1',))
tid2 = Thread(target=myfunc, args=('T2',))
tid1.start()
tid2.start()
print("From main")
tid1.join()
tid2.join()
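A thread's target function cannot return a value to the caller: join() always returns None. One common workaround, sketched here with made-up names (worker, results), is to let each thread store its result in a shared container. The example also shows that arguments can be passed by keyword through kwargs.

```python
import threading

results = {}                      # shared between all threads
results_lock = threading.Lock()   # protects the dictionary

def worker(name, count):
    total = sum(range(count))
    with results_lock:
        results[name] = total

threads = [
    threading.Thread(target=worker, args=("T1", 10)),
    threading.Thread(target=worker, kwargs={"name": "T2", "count": 100}),
]
for t in threads:
    t.start()
for t in threads:
    t.join()   # once join() returns, that thread's result has been stored

print(results)
```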


Synchronization Objects

Access to shared resources must be synchronized. The threading package provides the following objects:

  • Condition variables – similar to Linux POSIX condition variables

  • Events – similar to Windows events

  • Lock – similar to a mutex

  • Semaphore
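As a rough illustration of two of these objects, the sketch below uses an Event to hold a group of workers back until the main thread gives a signal, and a Semaphore to let at most two of them run a section at the same time. All names (ready, slots, worker) are invented for the example, and a Lock guards the small amount of shared bookkeeping.

```python
import threading

ready = threading.Event()         # starts in the "not set" state
slots = threading.Semaphore(2)    # at most two threads inside the guarded block
state_lock = threading.Lock()
active = 0
max_active = 0
done = []

def worker(n):
    global active, max_active
    ready.wait()                  # block until the main thread calls set()
    with slots:                   # take a slot; it is released on leaving the block
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        # ... the real work would go here ...
        with state_lock:
            active -= 1
            done.append(n)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(5)]
for t in threads:
    t.start()
ready.set()                       # release all waiting workers at once
for t in threads:
    t.join()

print("finished:", sorted(done), "max concurrent:", max_active)
```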


Lock Example:

import threading

lc = threading.Lock()


def myfunc1(*args):
    # 'with' acquires the lock and releases it even if the body raises
    with lc:
        print("From thread", args)

def myfunc2(*args):
    with lc:
        print("From thread", args)


# args must be a tuple; a bare string such as 'T1' would be unpacked into 'T' and '1'
tid1 = threading.Thread(target=myfunc1, args=('T1',))
tid2 = threading.Thread(target=myfunc2, args=('T2',))
tid1.start()
tid2.start()
print("From main")
tid1.join()
tid2.join()
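A lock matters most when threads modify shared data rather than just print. As a small sketch with invented names (counter, add), four threads increment one counter; the lock makes each read-modify-write step atomic with respect to the other threads.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add(times):
    global counter
    for _ in range(times):
        # Without the lock, the read-modify-write below could interleave
        # between threads and lose updates.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4 threads x 10,000 increments = 40000
```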


Inter Process Communication (IPC)

Threads share the same address space, so it is easy to send data from one thread to another, but processes live in different address spaces, which is why we need an IPC object. You can use a pipe, a FIFO, a message queue and more.
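For a simple two-party channel, multiprocessing.Pipe is one option. The sketch below is only an illustration, with a made-up function child_task: the child process sends one picklable object through its end of the pipe and the parent receives it.

```python
import multiprocessing

def child_task(conn):
    """Runs in the child: send one picklable object and close this end."""
    conn.send({"msg": "hello", "values": [1, 2, 3]})
    conn.close()

if __name__ == "__main__":
    # Pipe() returns two connected endpoints, one for each side.
    parent_conn, child_conn = multiprocessing.Pipe()
    proc = multiprocessing.Process(target=child_task, args=(child_conn,))
    proc.start()
    print(parent_conn.recv())   # blocks until the child has sent its message
    proc.join()
```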


Message Queue Example:

In the following example we create two processes and use a queue for communication. One process sends messages to the queue and the other receives them:


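import multiprocessing
import time

def fn1(q):
    # consumer: get() blocks until a message is available
    while True:
        x = q.get()
        print("val=", x)

def fn2(q):
    # producer: sends a message every two seconds
    while True:
        time.sleep(2)
        q.put("hello")

# both loops run forever; stop the program with Ctrl+C
if __name__ == '__main__':
    queue = multiprocessing.Queue()
    proc1 = multiprocessing.Process(target=fn1, args=(queue,))
    proc2 = multiprocessing.Process(target=fn2, args=(queue,))
    proc1.start()
    proc2.start()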
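The loops in the example above never end. One common way to shut such a pipeline down cleanly is to send a sentinel value that means "no more messages". The following sketch assumes None is never a real message; the names STOP, producer and consumer are invented for illustration.

```python
import multiprocessing

STOP = None   # sentinel: tells the receiver that no more messages will come

def producer(q, messages):
    for m in messages:
        q.put(m)
    q.put(STOP)

def consumer(q, out):
    while True:
        item = q.get()
        if item is STOP:
            break
        out.put(item.upper())
    out.put(STOP)   # pass the shutdown signal on to whoever reads `out`

if __name__ == "__main__":
    q = multiprocessing.Queue()
    out = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=producer, args=(q, ["hello", "world"])),
        multiprocessing.Process(target=consumer, args=(q, out)),
    ]
    for p in procs:
        p.start()
    # iter(callable, sentinel) keeps calling out.get() until it returns STOP
    for item in iter(out.get, STOP):
        print("val=", item)
    for p in procs:
        p.join()
```

The parent drains the output queue before calling join(), because a process that still has queued data waiting to be flushed may not exit.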

There are more ways to create threads and processes, and more synchronization and IPC objects, than the ones covered here.


Source: devarea

