Python os.walk

It’s very typical for a program to have to walk a file tree. In Recursion Example — Walking a file tree, I demonstrated how to use recursion to traverse a file system. Although it’s totally possible to walk through a file system in that fashion, it’s less than ideal because Python provides os.walk for this purpose.

The following script is a modified example borrowed from Programming Python: Powerful Object-Oriented Programming that demonstrates how to traverse a file system using os.walk.

import os
import sys


def lister(root):
    # os.walk returns a tuple with the current_folder, a list of sub_folders,
    # and a list of files in the current_folder
    for (current_folder, sub_folders, files) in os.walk(root):
        print('[' + current_folder + ']')
        for sub_folder in sub_folders:

            # Unix uses / as path separators, while Windows uses \
            # If we use os.path.join, we don't need to worry about which
            # path separator to use since os.path.join tracks that for us.
            path = os.path.join(current_folder, sub_folder)
            print('\t' + path)

        for file in files:
            path = os.path.join(current_folder, file)
            print('\t' + path)


if __name__ == '__main__':
    lister(sys.argv[1])

When run, this code prints out all of the files and directories starting at the specified root folder.

Explanation

os.walk

The os.walk function does the work of traversing a file system. The function generates a tuple with three fields. The first field is the current directory that os.walk is processing. The second field is a list of sub folders found in the current folder and the last field is a list of files found in the current folder.

Combining os.walk with a for loop is a very common technique (shown on line 8). The loop continues to iterate until os.walk finishes walking through the file system. The tuple declared in the for loop is updated on each iteration of the loop, providing developers with all of the information needed to process the contents of the directory.

os.path.join

Line 15 shows an example of using os.path.join to assemble a full path to a target folder or file. It’s import to use os.path.join to assemble file paths because Unix-like system use ‘/’ to separate file paths, while Windows systems use ‘\’. Tracking the path separator could be tedious work since it requires making a determination about which operating system is running the script. That’s not very ideal so Python provides os.path.join to take care of such work. As long as os.path.join is used, the assembled file paths will use the proper path separator for the os.

References

Lutz, Mark. Programming Python. Beijing, OReilly, 2013.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: