Python Producer Consumer with Queue

The producer / consumer pattern is a common programming construct used in multithreaded applications where one thread acts as a producer of data while other threads consume the data. A web crawler application is a use case of the producer / consumer pattern. For example, the application may have a thread dedicated to crawling the web that gathers data (producer) while other threads index and store the data (consumers).

Producer and consumer threads need a way to share data. Python’s queue module provides one of many solutions. The Queue object is a FIFO object that lets the produce thread place data on the queue. Consumer threads are blocked by the Queue until the Queue has data for the consumer thread to read. When data becomes available, the consumer thread removes data from the Queue and does its work.

Below is an example program borrowed from Programming Python: Powerful Object-Oriented Programming that shows how to use a Queue to synchronize data between producer and consumer threads. I added my own comments to the code to help explain what is happening in the program.

Code

# Specify the number of consumer and producer threads
numconsumers = 2
numproducers = 4
nummessages = 4

import _thread as thread, queue, time

# Create a lock so that only one thread writes to the console at a time
safeprint = thread.allocate_lock()

# Create a queue object
dataQueue = queue.Queue()


# Function called by the producer thread
def producer(idnum):
    # Produce 4 messages to place on the queue
    for msgnum in range(nummessages):
        # Simulate a delay
        time.sleep(idnum)

        # Put a String on the queue
        dataQueue.put('[producer id={}, count={}]'.format(idnum, msgnum))


# Function called by the consumer threads
def consumer(idnum):
    # Create an infinite loop
    while True:
        # Simulate a delay
        time.sleep(0.1)
        try:
            # Attempt to get data from the queue. Note that
            # dataQueue.get() will block this thread's execution
            # until data is available
            data = dataQueue.get()
        except queue.Empty:
            pass
        else:
            # Acquire a lock on the console
            with safeprint:
                # Print the data created by the producer thread
                print('consumer ', idnum, ' got => ', data)


if __name__ == '__main__':
    # Create consumers
    for i in range(numconsumers):
        thread.start_new_thread(consumer, (i,))
        
    # Create producers
    for i in range(numproducers):
        thread.start_new_thread(producer, (i,))
        
    # Simulate a delay
    time.sleep(((numproducers - 1) * nummessages) + 1)
    
    # Exit the program
    print('Main thread exit')

Detailed Explanation

This program shows the producer / consumer pattern in action. We begin by defining variables that specify the number of consumer threads (line 2), the number of produce threads (line 3), and the number of messages the producer threads make (line 4). The program creates a lock on line 9 so that only one thread can use the console at the same time. Then on line 12, the queue is created as a global variable.

Our first function, producer, is defined on lines 16-23. There isn’t anything fancy going on in this function. The function simply enters a for-each loop and creates 4 strings that are placed on dataQueue (line 23). Since dataQueue is a FIFO structure, worker threads will remove these Strings from dataQueue in the order they are recieved.

Lines 27-43 define the consumer thread function, consumer. This code enters an infinite loop and removes data from dataQueue and prints the String to the console. Line 36 is the critical piece of code in the consumer function. The call to get() on dataQueue removes the item at the front of the queue and stores it in the variable data. If dataQueue is empty, the consumer thread is blocked until data becomes available.

Alternatively, we could pass false to the optional block parameter on get(). That would cause the thread to continue to execute even if the queue is empty. However, we need to be prepared for situations where the queue is empty and catch the queue.Empty exception that is thrown. Our program calls pass to skip over the exception should this happen (it shouldn’t be the way, because we are using the blocking version of get()).

Lines 48-49 create our producer threads and start them. Lines 52-53 create and start the consumer threads. The producer threads call the produce function while the consumer threads call the consume function. The dataQueue object does the job of synchronizing data between threads. The produce threads write to dataQueue and consumer threads read from it. Thus, our program has created the consumer / producer pattern.

References

Lutz, Mark. Programming Python. Beijing, OReilly, 2013.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s