Python _thread Mutex

When Python programs create threads, each thread shares the same global memory as every other thread. Usually, but not always, multiple threads can safely read from shared resources without issue. Threads writing to shared resources are a different story because one thread could potentially overwrite the work of another thread.

This post demonstrates an example program shown in Programming Python: Powerful Object-Oriented Programming where threads acquire and release locks in the program. The locking mechanism ensures that only one thread has access to a shared resource at a time.

Code

Here is an example program with my own comments added.

import _thread as thread, time

# This mutex object is created by calling
# thread.allocate_lock()
# The mutex is responsible for synchronizing threads
mutex = thread.allocate_lock()


def counter(tid, count):
    for i in range(count):
        time.sleep(1)
        
        # The standard out is a shared resource
        # Unless the program controls access to the standard out
        # multiple threads can print to standard out at the same time
        # which results in garbage output
        
        # Acquire a lock
        mutex.acquire()
        
        # Now only the current thread can print to the console
        print('[{}] => {}'.format(tid, i))
        
        # Make sure to release the lock for other threads when finished
        mutex.release()


if __name__ == '__main__':
    for i in range(5):
        thread.start_new_thread(counter, (i, 5))

    time.sleep(6)
    print('Main thread exiting...')

Explanation

The program creates five threads, each of which needs access to the standard output stream. The standard output stream is a global object that all of the threads share, which means that each thread can call print at the same time. That isn’t ideal because we can get garbage output printed to the console if two threads call the print() statement at the same time.

The solution is to lock access to the standard output stream so that only one thread may use it at a time. We do this by creating a mutex object on line 6 in the program by using thread.allocate_lock(). When a thread needs a lock, it calls acquire() on the mutex. At that point, all other threads that need protected resources have to sit and wait for mutex.release().

It’s important to keep the operations between mutex.acquire() and mutex.release() as brief as possible. Only one thread can hold a lock at a time, so the longer one thread holds a lock, the longer other threads need to wait for their turn to use the lock. That naturally impacts the performance of the overall program.

References

Lutz, Mark. Programming Python. Beijing, OReilly, 2013.

Advertisements

Python _thread Basic

Python 3 has the newer thread package, but the _thread package still exists for developers who are more comfortable with the 2.x API. This is a basic example derived from Programming Python: Powerful Object-Oriented Programming that demonstrates how to create threads using the _thread module.

Code

import _thread as thread, time


# This function will run in a new thread
def counter(tid, count):
    for i in range(count):
        # Simulate a delay
        time.sleep(1)
        # Print out the thread id (tid) and the current iteration
        # of our for loop
        print('[{}] => {}'.format(tid, i))


if __name__ == '__main__':
    # Enter a loop that creates 5 threads
    for i in range(5):
        # Start a new thread passing a callable and it's arguments
        # in the form of a tuple
        thread.start_new_thread(counter, (i, 5))

    time.sleep(6)
    print('Main thread exiting')

Explanation

This script creates five new threads using _thread.start_new_thread. Each time a new thread is created, the counter function is called and is passed a tuple of (i, 5). That tuple corresponds to the tid and count parameters of the counter function. Counter enters a loop that runs 5 times since 5 was passed to the second parameter of counter on line 19. It will print the thread id and current iteration of the loop.

Meanwhile, the for loop in the parent thread continues to iterate because thread.start_new_thread does not block the for loop in the main thread. By calling start_new_thread, the program’s execution runs both the for loop in the main thread and the counter function in parallel. Allowing programs to run multiple portions of code at the same time is what gives threads their power. For example, you may wish to use a thread to handle a long running database query while the user continues to interact with the program in the main thread.

One final note about threads in Python. Threads give the appearance of allowing programs to multitask and for all intents and purposes, that is what is happening in the program. Nevertheless, what is really happening is that the Python Virtual Machine is time slicing the computer instructions and allowing a few lines of code to run before switching to another set of instructions.

In other words, if a program has three threads, A, B, and C, then Thread A runs for a few moments, then Thread B, and finally Thread C. Note that there is no guarantee to the order in which threads run. It is possible that one thread may run more often than other threads or that the order of running threads is different each time.

References

Lutz, Mark. Programming Python. Beijing, OReilly, 2013.