Programs often need to walk a file tree. In Recursion Example — Walking a file tree, I demonstrated how to use recursion to traverse a file system. Although that approach works, it’s unnecessary in practice because Python’s standard library provides os.walk for exactly this purpose.
The following script is a modified example borrowed from Programming Python: Powerful Object-Oriented Programming that demonstrates how to traverse a file system using os.walk.
```python
import os
import sys


def lister(root):
    # os.walk returns a tuple with the current_folder, a list of sub_folders,
    # and a list of files in the current_folder
    for (current_folder, sub_folders, files) in os.walk(root):
        print('[' + current_folder + ']')

        for sub_folder in sub_folders:
            # Unix uses / as path separators, while Windows uses \
            # If we use os.path.join, we don't need to worry about which
            # path separator to use since os.path.join tracks that for us.
            path = os.path.join(current_folder, sub_folder)
            print('\t' + path)

        for file in files:
            path = os.path.join(current_folder, file)
            print('\t' + path)


if __name__ == '__main__':
    lister(sys.argv[1])
```
When run with a root folder as its first command-line argument, this script prints every directory under that root, each followed by the subfolders and files it contains.
Explanation
os.walk
The os.walk function does the work of traversing a file system. It is a generator: for each directory it visits, starting with the root, it yields a tuple with three fields. The first field is the path of the directory that os.walk is currently processing. The second field is a list of the names of the subfolders found in that directory, and the last field is a list of the names of the files found in it.
Combining os.walk with a for loop is a very common technique (shown in the outer for loop of lister). The loop continues to iterate until os.walk has visited every directory under the root. On each iteration, the three names unpacked in the for statement are bound to the next tuple, providing developers with all of the information needed to process the contents of that directory.
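To make the shape of those tuples concrete, here is a minimal sketch that counts folders and files instead of printing them. The count_entries function and the file names in the demo are my own illustrative choices, not part of the original script.

```python
import os
import tempfile


def count_entries(root):
    """Return (folder_count, file_count) for everything below root.

    count_entries is a hypothetical helper used only for illustration.
    """
    folder_count = 0
    file_count = 0
    # Each iteration receives one tuple: the directory being visited,
    # the names of its subfolders, and the names of its files.
    for current_folder, sub_folders, files in os.walk(root):
        folder_count += len(sub_folders)
        file_count += len(files)
    return folder_count, file_count


if __name__ == '__main__':
    # Build a small throwaway tree so the example does not depend on
    # any particular folder existing on the machine.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, 'docs', 'drafts'))
        for relative in ('readme.txt', os.path.join('docs', 'notes.txt')):
            with open(os.path.join(root, relative), 'w') as handle:
                handle.write('example')
        print(count_entries(root))
```

Because every subfolder appears exactly once in its parent's sub_folders list, summing the list lengths counts each folder once without any explicit recursion.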
os.path.join
The body of the sub_folders loop shows an example of using os.path.join to assemble a full path to a target folder or file. It’s important to use os.path.join to assemble file paths because Unix-like systems use ‘/’ to separate path components, while Windows systems use ‘\’. Tracking the path separator by hand would be tedious, since it requires determining which operating system is running the script. That’s far from ideal, so Python provides os.path.join to take care of such work. As long as os.path.join is used, the assembled file paths will use the proper path separator for the operating system.
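As a small sketch of the same idea outside of os.walk, the snippet below joins a folder with a list of names. The child_paths helper and the sample names are assumptions made up for this illustration.

```python
import os


def child_paths(folder, names):
    """Join folder with each name using the platform's path separator.

    child_paths is a hypothetical helper used only for illustration.
    """
    return [os.path.join(folder, name) for name in names]


if __name__ == '__main__':
    for path in child_paths('projects', ['demo', 'notes.txt']):
        # The separator in the output depends on the operating system
        # running the script: '/' on Unix-like systems, '\' on Windows.
        print(path)
```

The calling code never mentions a separator, so the same script produces valid paths on either kind of system.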
References
Lutz, Mark. Programming Python. Beijing, O’Reilly, 2013.