Python provides two main avenues for parallel processing. One is multithreading, where a single program runs several tasks at once; the other is multiprocessing, where a program launches copies of itself as separate processes. Neither approach is inherently better than the other; they are tools for different use cases. Threads have low overhead and share the program’s memory space, which makes communication between threads easy. Processes behave as if we had launched a new copy of the program from the operating system, and they allow a program to spread its work across a machine or even a network.
However, processes do not share a global memory space, which means they need a way to communicate with one another. One approach to interprocess communication (IPC) is to use pipes. This post shows an example of IPC using pipes taken from Programming Python: Powerful Object-Oriented Programming. I have added my own comments to the code for clarity.
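To make the point about memory concrete before looking at pipes, here is a minimal sketch of my own (it is not from the book). The function names and the `state` dictionary are hypothetical, chosen only for illustration. A thread changes the same object the caller sees, while a forked child changes only its private copy. Note that os.fork is available only on Unix-like systems.

```python
import os
import threading


def thread_sees_shared_state():
    """A thread modifies the very same dictionary the caller holds."""
    state = {'count': 0}

    def bump():
        state['count'] += 100

    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return state['count']


def child_process_has_its_own_copy():
    """A forked child modifies a copy; the parent's dictionary is untouched."""
    state = {'count': 0}
    pid = os.fork()
    if pid == 0:
        # Child process: this change is invisible to the parent.
        state['count'] += 100
        os._exit(0)  # leave the child immediately
    # Parent process: wait for the child, then inspect our own copy.
    os.waitpid(pid, 0)
    return state['count']


if __name__ == '__main__':
    print(thread_sees_shared_state())        # 100
    print(child_process_has_its_own_copy())  # 0
```

The second result is the reason a forked child needs something like a pipe to report back to its parent.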
Code
import os, time

# Function called by child processes
def child(pipeout):
    zzz = 0

    while True:
        time.sleep(zzz)

        # We have to encode our string to binary to use
        # with pipes
        msg = ('Spam {}'.format(zzz)).encode()

        # Send the data back to the parent process
        os.write(pipeout, msg)
        zzz = (zzz + 1) % 5


def parent():
    # Creates our pipes. The pipeout gets passed to the child
    # process while parent keeps pipein
    pipein, pipeout = os.pipe()

    if os.fork() == 0:
        # We are now in the child process so call child and supply
        # it with pipeout so that it can send information back to
        # the parent.
        child(pipeout)
    else:
        # This is the parent process
        while True:
            # Read data from the child process
            # This call blocks until there is data
            line = os.read(pipein, 32)

            # Print to the console
            print('Parent {} got [{}] at {}'.format(os.getpid(), line, time.time()))

if __name__ == '__main__':
    parent()
Explanation
The program has two functions, child() and parent(). The child() function runs in the child process, while parent() contains the main program. parent() is defined on lines 19-37. It begins by calling os.pipe() on line 22, which returns a tuple of two file descriptors, one for each end of a single pipe. Pipes are unidirectional: the parent uses pipein to read data that comes from the child process, and the child process uses pipeout to send data to the parent.
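The two ends of a pipe can be tried out without forking at all. The following is a small sketch of my own; the helper name pipe_round_trip is hypothetical. It writes a short payload into the write end and reads it back from the read end inside one process. It assumes the payload is small enough to fit in the pipe's buffer, since a large write with no reader would block.

```python
import os


def pipe_round_trip(payload: bytes) -> bytes:
    """Write payload into a new pipe and read it back in the same process."""
    read_fd, write_fd = os.pipe()  # (read end, write end), both plain integers
    try:
        os.write(write_fd, payload)            # data goes in at the write end
        return os.read(read_fd, len(payload))  # and comes out at the read end
    finally:
        # Release both file descriptors once we are done with the pipe.
        os.close(read_fd)
        os.close(write_fd)


if __name__ == '__main__':
    print(pipe_round_trip(b'Spam 0'))  # b'Spam 0'
```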
The program forks into two processes on line 24. os.fork() returns zero in the child process and the process ID of the child in the parent. In the child, line 28 calls child() and passes it pipeout so that the child process can send data back to the parent. The child process enters an infinite loop on line 7. On line 12, it builds a string and stores it in msg. Pipes carry binary data, so we call encode() on the string to convert it to a bytes object. Then, on line 15, we send msg to the parent by passing pipeout and msg to os.write.
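The program above loops forever, which makes it awkward to experiment with. Below is a finite variation that I sketched myself; it is not the book's code, and the function name receive_one_message is hypothetical. The child sends a single encoded message and exits. Each process also closes the end of the pipe it does not use, so the parent sees end-of-file (an empty bytes object) once the child is done. Like the original, it relies on os.fork and so runs only on Unix-like systems.

```python
import os


def receive_one_message(text='Spam 0'):
    """Fork a child that sends `text` through a pipe; return it decoded."""
    pipein, pipeout = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it only writes, so close the read end first.
        os.close(pipein)
        os.write(pipeout, text.encode())  # str -> bytes before writing
        os.close(pipeout)
        os._exit(0)

    # Parent process: it only reads, so close the write end. If the parent
    # kept it open, os.read would never report end-of-file.
    os.close(pipeout)
    chunks = []
    while True:
        data = os.read(pipein, 32)
        if not data:  # b'' means every write end has been closed
            break
        chunks.append(data)
    os.close(pipein)
    os.waitpid(pid, 0)  # collect the child's exit status
    return b''.join(chunks).decode()  # bytes -> str


if __name__ == '__main__':
    print(receive_one_message())  # Spam 0
```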
The parent process continues on line 31. It reads data from the child process on line 34 using os.read. Notice that os.read takes the pipein descriptor and the maximum number of bytes to read (32 in this program). If the pipe contains data, os.read returns immediately with up to 32 bytes, which are stored in the line variable. Otherwise, os.read blocks until the pipe has data. The parent process prints the data on line 37.
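One consequence of reading "up to 32 bytes" is that a pipe is a stream of bytes with no message boundaries. If two messages are waiting in the pipe, a single os.read may return both at once. The sketch below is my own illustration with hypothetical function names. It shows that behavior, and then one possible remedy that the program above does not use: ending each message with a newline and wrapping the read end in a file object with os.fdopen so it can be read line by line.

```python
import os


def read_merged():
    """Two separate writes can come back from a single os.read call."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b'Spam 0')
    os.write(write_fd, b'Spam 1')
    data = os.read(read_fd, 32)  # returns what is buffered, up to 32 bytes
    os.close(read_fd)
    os.close(write_fd)
    return data


def read_lines():
    """Newline-terminated messages can be separated again by the reader."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b'Spam 0\n')
    os.write(write_fd, b'Spam 1\n')
    os.close(write_fd)  # closing the write end signals end-of-file
    # os.fdopen wraps the descriptor in a text-mode file object and
    # closes the descriptor when the with block exits.
    with os.fdopen(read_fd) as reader:
        return [line.rstrip('\n') for line in reader]


if __name__ == '__main__':
    print(read_merged())  # b'Spam 0Spam 1'
    print(read_lines())   # ['Spam 0', 'Spam 1']
```

The newline delimiter is an assumption of this sketch; any framing scheme both sides agree on would work.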
References
Lutz, Mark. Programming Python. Beijing, O’Reilly, 2013.