Recursion Example — Walking a file tree

Many developers use Python as a platform independent scripting language to perform file system operations. Sometimes it’s necessary to walk through a file system. Here is one way to navigate a file system recusively. (Of course, Python has libaries that do this!)

import os

def walk_fs(start_dir):
    # Get a list of everything in start_dir
    contents = os.listdir(start_dir)

    # This stores the output
    output = []

    # Loop through every item in contents
    for f in contents:
        # Use os.path.join to reassmble the path
        f_path = os.path.join(start_dir, f)

    # check if f_path is directory (or folder)
    if os.path.isdir(f_path):
        # Make recusive call to walk_fs
        output = output + walk_fs(f_path)
    else:
        # Add the file to output
        output.append(f_path)

    # Return a list of files in the directory
    return output

if __name__ == '__main__':
    try:
        result = walk_fs(input('Enter starting folder => '))
        for r in result:
            print(r)
    except FileNotFoundError:
    print('Not a valid folder! Try again!')

The key to this is to begin by using os.listdir, which returns a list of every item in a directory. Then we can loop through each item in contents. As we loop through contents, we need to reassemble the full path because f is only the name of the file or directory. We use os.path.join because it will insert either / (unix-like systems) or \ (windows) between each part of the path.

The next statement checks if f_path is a file or directory. The os.path.isdir function is True if the item is a directory, false otherwise. If f_path is a folder, we can make a recursive call to walk_fs starting with f_path. It will return a list of files that we can concat to output.

If f_path is a file, we just add it to output. When we have finished iterating through contents, we can return output. The output file will hold all of the files in start_dir and it’s subdirectorys.

Recursive Binary Search—Python

It’s possible to implement a binary search using recursion. Just like the loop version, we slice our sorted collection into halfs and check one half of the collection and discard the other half.

Here is the code

def binary_search(search_list, data, low, high):
    # This is the base case that
    # terminates the recursion
    if low <= high:
        # Find the mid point
        mid = int(low + (high - low) / 2)

        if search_list[mid] == data:
            # We found our item. Return the index
            return mid
        elif search_list[mid] < data:
            # Search the top half of search_list
            return binary_search(search_list, data, mid + 1, high)
        else:
            # Search the bottom half of the search_list
            return binary_search(search_list, data, low, mid - 1)
    return None


if __name__ == '__main__':
    names = ['Bob Belcher',
             'Linda Belcher',
             'Tina Belcher',
             'Gene Belcher',
             'Louise Belcher']

    print('Sorting names...')
    names.sort()

    linda_index = binary_search(names, 'Linda Belcher', 0, len(names) - 1)
    if linda_index:
        print('Linda Belcher found at ', str(linda_index))
    else:
        print('Linda Belcher was not found')

    teddy_index = binary_search(names, 'Teddy', 0, len(names) - 1)
    if teddy_index:
        print('Teddy was found at ', str(teddy_index))
    else:
        print('Teddy was not found')

When run, we get the following output

Sorting names...
Linda Belcher found at  2
Teddy was not found

Binary Search—Python

Binary searches work by splitting a collection in half and only searching on half of the collection while discarding the other half. This algorithm requires the collection to be sorted first.

Here’s the code

def binary_search(search_list, data):
    low = 0
    high = len(search_list) - 1

    while low <= high:
        mid = int(low + (high - low) / 2)

        if search_list[mid] == data:
            # Return the index because we
            # found our item
            return mid

        elif search_list[mid] < data:
            # Continue search at the top
            # half of search list
            low = mid + 1
        else:
            # Continue search at the
            # bottom of search list
            high = mid - 1

    # The item wasn't found
    return None


if __name__ == '__main__':
    names = ['Bob Belcher',
             'Linda Belcher',
             'Tina Belcher',
             'Gene Belcher',
             'Louise Belcher']

    print('Sorting names...')
    names.sort()

    linda_index = binary_search(names, 'Linda Belcher')
    if linda_index:
        print('Linda Belcher found at ', str(linda_index))
    else:
        print('Linda Belcher was not found')

    teddy_index = binary_search(names, 'Teddy')
    if teddy_index:
        print('Teddy was found at ', str(teddy_index))
    else:
        print('Teddy was not found')

When run, we will get the following output.

Sorting names...
Linda Belcher found at  2
Teddy was not found

Ordered Linear Search—Python

Ordered Linear Searche’s are a varient of Unordered Linear Search. The main difference is that this algorithm expects the search list to be sorted prior to searching. For this reason, the Ordered Linear Search can be a little more efficient than it’s Unordered varient.

Here’s the code

from random import shuffle


def ordered_linear_search(search_list, data):
    for i in range(len(search_list) - 1):
        if search_list[i] == data:
            # Return the index of the item
            return i
        elif search_list[i] > data:
            # The idea here is to fail at this
            # point because search_list isn't sorted
            return None
    # We didn't find our item, so return None
    return None


if __name__ == '__main__':
    names = ['Bob Belcher',
             'Linda Belcher',
             'Tina Belcher',
             'Gene Belcher',
             'Louise Belcher']

    print('Shuffling names')
    shuffle(names)

    linda_index = ordered_linear_search(names, 'Linda Belcher')
    if linda_index:
        print('Linda Belcher found at ', str(linda_index))
    else:
        print('Linda Belcher was not found')

    teddy_index = ordered_linear_search(names, 'Teddy')
    if teddy_index:
        print('Teddy was found at ', str(teddy_index))
    else:
        print('Teddy was not found')

    print('Now sorting...')
    names.sort()

    linda_index = ordered_linear_search(names, 'Linda Belcher')
    if linda_index:
        print('Linda Belcher found at ', str(linda_index))
    else:
        print('Linda Belcher was not found')

When run, you will notice that first we do not find the name ‘Linda Belcher’ because the list is shuffled. After sorting, the search will identifiy ‘Linda Belcher’ at index 2.

Shuffling names
Linda Belcher was not found
Teddy was not found
Now sorting...
Linda Belcher found at  2

Unordered Linear Search—Python

There are many cases in computer programming we need to search for an element in a collection of data. The Unordered Linear Search is one of the simplest searching algorithms.

def unordered_linear_search(search_list, data):
    for i in range(len(search_list) - 1):
        if search_list[i] == data:
            # We have found the item at index i
            return i
        
    # If we got here, we didn't find the item
    return None

if __name__ == '__main__':
    names = ['Bob Belcher',
             'Linda Belcher',
             'Tina Belcher',
             'Gene Belcher',
             'Louise Belcher']

    linda_index = unordered_linear_search(names, 'Linda Belcher')
    if linda_index:
        print('Linda Belcher found at ', str(linda_index))
    else:
        print('Linda Belcher was not found')

    teddy_index = unordered_linear_search(names, 'Teddy')
    if teddy_index:
        print('Teddy was found at ', str(teddy_index))
    else:
        print('Teddy was not found')

In this code, we are basically looping through the search_list object until we found a match. If we found the match, we return the index of the item’s location. Otherwise, we return Python’s None object to indicate the item wasn’t found. When run, the program will produce this output.

Linda Belcher found at  1
Teddy was not found