Python Basic Pipes

Python provides two main avenues of parallel processing. One avenue is to use multithreading where a program itself multitasks, while the other approach is to have a program relaunch itself as a separate program in a new process. One approach is not necessarily better than the other approach but instead, should be throught of as tools for different use cases. Threads have low overhead and share a program’s memory space, which allows for easy communication between threads. Processes operate as if we launched a new copy of the program from our operating system and allow programs to spread themselves out over an operating system or even a network.

However, processes do not share a global memory space, which means they need a way to communicate with one another. One approach to interprocess communication (IPC) is to use pipes. This post shows an example of IPC using pipes taken from Programming Python: Powerful Object-Oriented Programming. I have added my own comments to the code for clarity.

Code

import os, time


# Function called by child processes
def child(pipeout):
    zzz = 0
    while True:
        time.sleep(zzz)

        # We have to encode our string to binary to use
        # with pipes
        msg = ('Spam {}'.format(zzz)).encode()

        # Send the data back to the parent process
        os.write(pipeout, msg)
        zzz = (zzz + 1) % 5


def parent():
    # Creates our pipes. The pipeout gets passed to the child
    # process while parent keeps pipein
    pipein, pipeout = os.pipe()

    if os.fork() == 0:
        # We are now in the child process so call child and supply
        # it with pipeout so that it can send information back to
        # the parent.
        child(pipeout)
    else:
        # This is the parent process
        while True:
            # Read data from the child process
            # This call blocks until there is data
            line = os.read(pipein, 32)

            # Print to the console
            print('Parent {} got [{}] as {}'.format(os.getpid(), line, time.time()))


if __name__ == '__main__':
    parent()

Explanation

We have two functions in the program named child() and parent(). The child() function is intended to run in child processes while parent() contains the main program. Parent() is defined on lines 19-37. The function begins by calling os.pipe() on line 22 which returns a tuple containing two ends of a single pipe. Pipes are unidirectional and thus pipein is used by the parent to read data that comes from the child process. The child process uses pipeout to send data to the parent.

The program forks into two different processes on line 24. The program is in the child process when os.fork() returns zero. Line 28 calls the child() function and passes pipeout to the child function so that the child process can send data back to the parent. The child process enters an infinite loop on line 7. On line 12, a msg variable is created that contains a String variable. Pipe send binary data, so we have to call encode() on the String to convert it to a binary string. Then on line 15, we send the msg varaiable back to the parent using os.write and supplying pipeout and msg to that function.

The parent process continues on line 31. It attempts to read data from the child process on line 34 using os.read. Notice that os.read requires a pipein variable and the size of binary data to read (32 bytes in this program). If the pipe contains data, os.read returns immedialy and stores the value in the line variable. Otherwise, os.read blocks the program until the pipe has data. The parent process prints the data on line 37.

References

Lutz, Mark. Programming Python. Beijing, OReilly, 2013.

Advertisements

Python Pipe Operations

Python programs can link their input and output streams together using pipe (‘|’) operations. Piping isn’t a feature of Python, but rather comes from operating system. Nevertheless, a Python program can leverage such a feature to create powerful operating system scripts that are highly portable across platforms. For example, developer toolchains can be scripted together using Python. I personally have used Python to feed input into unit testing for my Java/Kotlin programs.

This post is a modified example of a demonstration found in Programming Python: Powerful Object-Oriented Programming. It uses a producer and consumer script to demonstrate one program feeding input into another Python program.

writer.py

Here is the code for writer.py

family = [
    'Bob Belcher',
    'Linda Belcher',
    'Tina Belcher',
    'Gene Belcher',
    'Louise Belcher'
]
for f in family:
    print(f)

Nothing special here. We are just building up a list that prints out the names of our favorite TV family, the Belchers.

reader.py

This code receives the output from writer.

while True:
    try:
        print('Entering {}'.format(input()))
    except EOFError:
        break

Once again, this is a simple script. Without a pipe operation, the input() statement on line 3 would normally collect the input from the keyboard. That’s not what is going to happen here.

Demonstration

We are going to execute these scripts by running the command below in the terminal.

python writer.py | python reader.py

The pipe '|' character does the job of connecting writer.py's output stream to reader.py's input stream. Thus, print statements in writer.py connect to input statements in reader.py. Here is the output of this operation.

Entering Bob Belcher
Entering Linda Belcher
Entering Tina Belcher
Entering Gene Belcher
Entering Louise Belcher