Python Line Scanner

This post borrows a code example from Programming Python: Powerful Object-Oriented Programming that demonstrates collecting command line arguments, opening and reading a file, and passing a function as a callback to another function.

Code

Here is the entire script. It accepts a file name as a command line argument and prints the contents of that file to the console.


def scanner(name, func):

    # Open the file (with statement ensures closure even if there is an exception)
    with open(name, 'r') as f:
        # Iterate through the file
        for line in f:
            # Call our callback function
            func(line)

if __name__ == '__main__':
    import sys
    name = sys.argv[1]

    # This is the function we pass to scanner.
    # Python has first-class functions, which can be
    # passed as arguments to other functions.
    # (The parameter is named line to avoid shadowing the built-in str.)
    def print_line(line):
        print(line, end='')

    # Call the scanner function, which in turn
    # calls the print_line function for each line
    # in the file
    scanner(name, print_line)

Command Line Arguments

The first concept covered in this script is processing command line arguments. Python requires us to import the sys module (the import sys statement inside the __main__ block), which provides an argv attribute. argv is a list of strings holding the command line arguments. The first element, argv[0], is the name of the script, followed by any other arguments supplied to the program.

With name = sys.argv[1], we grab the target file name and keep it in the name variable. At this point, our program knows which file to open later on, when we call the scanner function.
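The script assumes an argument is always supplied; run without one, sys.argv[1] raises an IndexError. Here is a minimal sketch of a guard. The helper name get_target and the file names scanner.py and notes.txt are made up for illustration and do not appear in the original script.

```python
def get_target(argv):
    """Return the file name from an argv-style list, or None if it is missing.

    get_target is a hypothetical helper, not part of the original script.
    """
    # argv[0] is the script name, so the first real argument is argv[1]
    if len(argv) < 2:
        return None
    return argv[1]


# Explicit lists stand in for sys.argv so the behavior is easy to see
print(get_target(['scanner.py', 'notes.txt']))  # notes.txt
print(get_target(['scanner.py']))               # None
```

In a real script we would pass sys.argv to the helper and print a usage message when it returns None.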

First Class Functions

Python treats functions as objects. As such, we can define a function anywhere in a Python program and store it in a variable just like any other value. The script defines a print_line function that accepts a string parameter and, on its last line, passes print_line as the second argument to the scanner function.

Once inside the scanner function, the print_line function is referenced by the parameter func. The loop body calls it with func(line) rather than print_line(line). This works because func and print_line both refer to the same function object in memory. Passing functions in this fashion is powerful because it lets the scanner function apply a different behavior to each line it processes, depending on the callback it is given.
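A quick way to convince ourselves that both names point at one object is the is operator. This is only an illustrative sketch; scanner_sees is a hypothetical stand-in for scanner that reports what it received instead of reading a file.

```python
def print_line(line):
    print(line, end='')


def scanner_sees(func):
    # Inside this function the argument is known only as func
    return func is print_line


# Assigning a function to another name does not copy it
alias = print_line
print(alias is print_line)       # True
print(scanner_sees(print_line))  # True
```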

For example, we could define a function that writes each line processed by scanner to a file rather than printing it to the console. Later on, we may choose to write another function that sends each line over the network via sockets. The beauty of the scanner function as defined is that it works the same regardless of the callback function passed as the func argument. This programming technique is sometimes known as programming to a behavior.
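To make this concrete, here is a rough sketch of two alternative callbacks used with the same scanner function. The make_writer helper and the temporary file names are assumptions made for this example and are not from the book.

```python
import os
import tempfile


def scanner(name, func):
    with open(name, 'r') as f:
        for line in f:
            func(line)


def make_writer(out):
    """Return a callback that writes each line to the open file object out."""
    def write_line(line):
        out.write(line)
    return write_line


# Build a small input file so the example is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\nsecond\n')
    source = tmp.name

# Callback 1: collect the lines in a list (a bound method works as a callback)
collected = []
scanner(source, collected.append)

# Callback 2: copy each line into another file
copy_path = source + '.copy'
with open(copy_path, 'w') as out:
    scanner(source, make_writer(out))

with open(copy_path, 'r') as f:
    copied = f.read()

# Clean up the temporary files
os.remove(source)
os.remove(copy_path)

print(collected)  # ['first\n', 'second\n']
```

Notice that scanner itself did not change between the two calls; only the callback did.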

Opening and Reading Files

The final topic covered is opening and reading a file. The scanner function uses the with statement combined with the open function to open the file in read mode. The as f clause assigns the result of the open function to the variable f, which holds a Python file object.

Since Python file objects support the iterator protocol, they can be used in for loops. The statement for line in f: reads through the file one line at a time. On each pass through the loop, the line variable is bound to the next line in the file, including its trailing newline, which is why print_line passes end='' to print.
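The for loop is driving the iterator protocol for us behind the scenes. As a small sketch of what that looks like when done by hand, using a throwaway temporary file created just for this example:

```python
import os
import tempfile

# Create a two-line file to read back
with tempfile.NamedTemporaryFile('w', delete=False) as tmp:
    tmp.write('alpha\nbeta\n')
    path = tmp.name

with open(path, 'r') as f:
    # A file object is its own iterator
    same = iter(f) is f
    # Each call to next() returns one line, newline included
    first = next(f)
    second = next(f)
    # When the lines run out, next() raises StopIteration,
    # which is the signal a for loop uses to stop
    try:
        next(f)
        exhausted = False
    except StopIteration:
        exhausted = True

os.remove(path)
print(same, repr(first), repr(second), exhausted)
# True 'alpha\n' 'beta\n' True
```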

When the loop is complete, the with statement calls the file’s close() method automatically, even if there is an exception. In CPython, an unreferenced file object is also closed when it is garbage collected, but the with statement does not depend on that timing. This extra level of safety matters because other Python interpreters may collect objects later than CPython does.
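We can check this behavior with the file object's closed attribute. The sketch below simulates a failure inside the with block and then shows a roughly equivalent try/finally written by hand. The temporary file and the ValueError are made up for the demonstration.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile('w', delete=False) as tmp:
    tmp.write('one\n')
    path = tmp.name

# The with statement closes the file even when the body raises
try:
    with open(path, 'r') as f:
        raise ValueError('simulated failure while processing')
except ValueError:
    pass
closed_by_with = f.closed

# Roughly the same guarantee, written by hand
g = open(path, 'r')
try:
    g.read()
finally:
    g.close()
closed_by_finally = g.closed

os.remove(path)
print(closed_by_with, closed_by_finally)  # True True
```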

Conclusion

The most powerful takeaway from this example is first class functions. Python treats functions like any other data type. This allows functions to be stored and passed around the program as required. Using first class functions helps keep code loosely coupled and maintainable!

Sources

Lutz, Mark. Programming Python. Beijing, O'Reilly, 2013.
