Python Command Line Arguments

The sys module provides developers with an access point to command line arguments. Here is an example program that prints command line arguments to the console.

import sys

# The sys module has an argv attribute that is a
# list of the command line arguments passed to the program
# The first entry is the name of the script
print(sys.argv)

Here is the program’s output when run from the command line.

Patricks-MacBook-Pro:system stonesoup$ python testargv.py
['testargv.py']

The sys.argv object is a list of all command line arguments supplied to the script when it is run as a Python program. Generally speaking, the first entry in the list is the name of the script. For more information on Python, see Programming Python: Powerful Object-Oriented Programming.
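
As a minimal sketch of how a script might separate the script name from the rest of the arguments, consider the helper below. The name split_args and the sample argument values are assumptions made up for illustration; they are not part of the original example.

```python
import sys


def split_args(argv):
    """Return (script_name, remaining_arguments) from an argv-style list."""
    # argv[0] is the script name; everything after it was typed by the user
    return argv[0], argv[1:]


# Hypothetical command line: python testargv.py input.txt 10
script, args = split_args(['testargv.py', 'input.txt', '10'])
print(script)  # testargv.py
print(args)    # ['input.txt', '10']

# In a real script you would call: split_args(sys.argv)
```

Note that every entry in the list is a string, so a value such as '10' has to be converted with int() before it can be used as a number.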

Python Page Through A File

Many operating systems have command line tools that allow a user to page through a file in chunks. As a demonstration of how to read text files in Python, I used an example from Programming Python: Powerful Object-Oriented Programming.

Code

def more(text, numlines=15):
    # This splits the text into a list object based on line
    # endings
    lines = text.splitlines()

    # Now continue to loop until we are out of lines
    while lines:
        # Slice off numlines lines into chunk
        chunk = lines[:numlines]
        
        # Remove numlines lines from the beginning of lines
        lines = lines[numlines:]

        # Now loop through each line in chunk
        for line in chunk:
            # and then print a line
            print(line)
            
        # Now ask the user if we want to keep going
        if lines and input('More?') not in ['y', 'Y']:
            break

if __name__ == '__main__':
    # Import sys so that we can read command line arguments
    import sys
    
    # Next, we are grabbing the first argument from the
    # command line, and passing it the open function
    # which returns a file object. Calling read on this
    # object will dump the contents of the file into a string
    # which gets passed to our more function above
    more(open(sys.argv[1]).read(), 10)

Detailed Explanation

The comments in the code above are mine and explain what is going on in the program. The program starts by testing if this script is getting called as a standalone program or if we are importing this code as a module.

Assuming this is a standalone program, we import the sys module so that we can examine the command line arguments. The first argument after the script name (sys.argv[1]) needs to be the path to a text file or this program will crash. We pass the name of the file to the open function, which returns a file object. Calling read() on the file object dumps the entire contents of the file into a string.

At this point, we pass the string into our more() function. It starts out by splitting the string by lines, which returns a list object. We start to loop through this list object, which continues until the list is empty.

Inside of the while loop, we slice off numlines lines from lines and store them in chunk. Then we remove those lines from the lines list. The next step is to print out each line in chunk. Once that is complete, we test if we still have more lines to print and if we do, we ask the user if they want to keep going or exit.

Here is the program output when run on my screen.

Patricks-MacBook-Pro:System stonesoup$ python more.py more.py
def more(text, numlines=15):
    lines = text.splitlines()

    while lines:
        chunk = lines[:numlines]
        lines = lines[numlines:]

        for line in chunk:
            print(line)
        if lines and input('More?') not in ['y', 'Y']:
More?y
            break

if __name__ == '__main__':
    import sys
    more(open(sys.argv[1]).read(), 10)
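
The chunking step inside more() can also be sketched on its own, without any user input. The chunked helper below is a hypothetical name introduced for illustration; it relies on the same slicing technique as the while loop in more().

```python
def chunked(lines, numlines=15):
    """Yield successive groups of at most numlines items from lines."""
    while lines:
        # Same slicing as in more(): take a chunk, then drop it from the front
        chunk, lines = lines[:numlines], lines[numlines:]
        yield chunk


text = 'one\ntwo\nthree\nfour\nfive'
for page in chunked(text.splitlines(), numlines=2):
    print(page)
```

With five lines and a page size of two, the loop produces three pages, and the last page holds the single leftover line.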

Python Current Working Directory

Many programs need to figure out the current working directory (CWD) at runtime. The Python os module has a getcwd() function that returns a program’s CWD. This is an example taken from Programming Python: Powerful Object-Oriented Programming.

Code

import os, sys

# This prints the current working directory
print('my os.getcwd =>', os.getcwd())

# This prints the system path
print('my sys.path =>', sys.path[:6])

input('Press any key to exit')

Explanation

There isn’t much going on in this program. The first line imports the os and sys modules. The next line calls the print function and passes it the value returned from os.getcwd(). That prints the current working directory.

The next line prints the first six entries of sys.path, the list of directories Python searches for modules. Finally, there is an input call that makes the program wait until the user presses Enter (despite the prompt text, input() only returns once Enter is pressed).

Output

Here is the output when run on my system.

my os.getcwd => /Users/stonesoup/IdeaProjects/ProgrammingPython/PP4E/System
my sys.path => ['/Users/stonesoup/IdeaProjects/ProgrammingPython/PP4E/System', '/Users/stonesoup/Library/Application Support/IntelliJIdea2017.2/python/helpers/pydev', '/Users/stonesoup/IdeaProjects/ProgrammingPython', '/Users/stonesoup/IdeaProjects/ProgrammingPython/PP4E', '/Users/stonesoup/IdeaProjects/ProgrammingPython/PP4E/System', '/Users/stonesoup/Library/Application Support/IntelliJIdea2017.2/python/helpers/pydev']
Press any key to exit
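
The CWD is not fixed for the life of a program. As a rough sketch, assuming a throwaway temporary directory, the hypothetical helper cwd_inside below switches directories with os.chdir and shows that os.getcwd() reports the new location.

```python
import os
import tempfile


def cwd_inside(directory):
    """Temporarily switch to directory and return the CWD reported there."""
    original = os.getcwd()
    try:
        os.chdir(directory)
        return os.getcwd()
    finally:
        # Always restore the original working directory
        os.chdir(original)


with tempfile.TemporaryDirectory() as tmp:
    reported = cwd_inside(tmp)
    # samefile compares the actual directories, ignoring symlinked paths
    print(os.path.samefile(reported, tmp))
```

Restoring the original directory in a finally block keeps the rest of the program from being surprised by a changed CWD.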

Walk a Filetree in Python

Python has a powerful os.walk function that lets a script walk through a file system in an efficient fashion. In this example, taken from Programming Python: Powerful Object-Oriented Programming, we will walk a file tree and remove any compiled bytecode (.pyc) files that are present in it.

Code

Here is the code, with my comments added.

import os, sys

# Do we want to find files only, without removing them?
findonly = False

# Either use the CWD or a directory specified by command line arguments
rootdir = os.getcwd() if len(sys.argv) == 1 else sys.argv[1]

# Keep track of the found and removed files
found = removed = 0

# Walk through the file tree
for (thisDirLevel, subsHere, filesHere) in os.walk(rootdir):

    # Go through each file in the directory
    for filename in filesHere:

        # Check if it ends with .pyc
        if filename.endswith('.pyc'):

            # Assemble the full file name
            fullname = os.path.join(thisDirLevel, filename)
            print('=>', fullname)

            # Attempt to remove the file if asked to do so
            if not findonly:
                try:
                    # Attempt to delete the file
                    os.remove(fullname)

                    # Increment the removed count
                    removed += 1
                except Exception:
                    # Capture the exception type and instance
                    exc_type, inst = sys.exc_info()[:2]

                    # Report that this file can't be removed
                    print('*'*4, 'Failed:', filename, exc_type, inst)
            found += 1

# Output the total number of files removed
print('Found', found, 'files removed:', removed)

Detailed Explanation

This script functions in a findonly or remove mode. So the first variable we create on line 4 is a flag that decides if we are only looking for bytecode (.pyc) files or if we are finding and removing such files. Next we create a rootdir variable that is either the current working directory or a directory supplied by a command line argument. We create two variables on line 10, found and removed, which track how many files we have found and removed.

We get into the meat of the program on line 13 when we enter a loop that iterates over os.walk. The os.walk function takes a starting directory and visits every subdirectory in that file tree. It’s the standard way to walk a file tree in Python. On each iteration it yields a tuple containing the path of the directory currently being examined, a list of the subdirectory names in it, and a list of the file names in it.

We create a nested loop on line 16 so that we can look at each file in the directory individually. On line 19, we check if the file ends with the .pyc extension. If it does, we use os.path.join to assemble a full file path in a platform agnostic fashion and then print out the full file path to the console.

If we are deleting files, we use os.remove on line 29 to attempt to delete a file. It’s critical that we wrap this in a try block because we may not have permission to delete the file. If deleting the file is successful, we increment the removed count. If it fails, the program execution will jump to line 35 and we report the error. The loop ends on line 39 and then repeats.

When the program is finished, we report how many files we found and removed.
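
To make the shape of the os.walk tuples concrete, here is a small sketch that only collects matching paths and never deletes anything. The function name find_pyc and the file names in the temporary tree are assumptions for illustration.

```python
import os
import tempfile


def find_pyc(rootdir):
    """Return the full paths of all .pyc files under rootdir."""
    matches = []
    # Each step yields (current directory, subdirectory names, file names)
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for filename in filenames:
            if filename.endswith('.pyc'):
                matches.append(os.path.join(dirpath, filename))
    return matches


# Build a small throwaway tree so the sketch does not touch real files
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'pkg'))
    for name in ('a.pyc', 'b.py', os.path.join('pkg', 'c.pyc')):
        open(os.path.join(root, name), 'w').close()
    found = find_pyc(root)
    print(len(found))  # 2
```

Separating the search from the deletion like this is one way to get the findonly behavior without a flag: the caller decides what to do with the returned paths.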

Functions—Python

All computer programming languages allow developers to separate code into reusable pieces called functions. Functions are critical because they allow us to generalize pieces of work into a block of code and reuse that code as many times as needed. When designed well, functions improve code readability by cutting down on the length of the code. We can also debug our code more easily because we only have to look in one place for a bug rather than several places.

Demonstration

Let’s begin with a function demonstration.

# A function
def nested_func():
    print('Inside of nested_func()')
    return 'value'


# A function
def func():
    print('Inside of func()')

    val = nested_func()
    print('Back in func(). nested_func() returned {}'.format(val))


# Not a function
if __name__ == '__main__':
    print('Outside of all functions. Calling func()')
    func()
    print('Back from our functions')

This is a block of code that creates two functions. When we run the code, we get this output.

Outside of all functions. Calling func()
Inside of func()
Inside of nested_func()
Back in func(). nested_func() returned value
Back from our functions

Computer programs normally run from top to bottom, one line at a time. Python does process the two def statements first, but that only creates the functions; the code inside nested_func() and func() does not execute until the functions are called. In this program, the first visible work therefore happens when we reach if __name__ == '__main__':.

When execution reaches line 17, it runs the print call. The next line, 18, is our first call to a function, which we named func(). Calling func() causes the program’s execution to jump up to line 9. Once we are at line 9, Python executes the print call and then moves on to the next line in the function, line 11.

Line 11 creates a variable called val and calls our next function, nested_func(), which returns a value. Program execution moves to line 3. Line 3 executes the print call, and then line 4 returns a string value. Execution then returns to line 11.

At this point, the variable val has a value stored in it. Execution continues to line 12, where the print call runs. Now the program exits func() and control returns to line 18, where func() was called. The program then executes the final print call on line 19 and exits.
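
The order of execution described above can be checked with a small sketch that records each step in a list instead of printing it. The names trace, inner, and outer are hypothetical stand-ins for the functions in the demonstration.

```python
trace = []


def inner():
    trace.append('inner')
    return 'value'


def outer():
    trace.append('outer start')
    result = inner()
    trace.append('outer got ' + result)


# Defining the functions above ran none of their bodies
trace.append('before call')
outer()
trace.append('after call')
print(trace)
```

The list ends up as ['before call', 'outer start', 'inner', 'outer got value', 'after call'], which mirrors the jump into each function and the return to the caller.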

Defining a function

You create functions in Python by using the def keyword followed by the name of the function. After the name comes an opening parenthesis ( and a closing parenthesis ), followed by a colon. You can place any number of parameters inside of the parentheses. Here is an example.

# Function with arguments
def func(val, val2):
    print(val)
    print(val2)

# Calling the function
func('Hello', 'World')

The name of this function is func. It has two arguments, val and val2. After the colon, you can include any number of statements inside of the function. The function is now a separate unit of code. This function gets called by func('Hello', 'World'). Any time the Python interpreter sees a call like this, it executes all of the statements inside of func. We do not have to use ‘Hello’ or ‘World’ as the arguments either. It’s perfectly OK to do something like func(47, 'Thunderbiscuit').
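
Arguments can be matched to parameters by position or by name. The sketch below uses a hypothetical describe function (it returns a string rather than printing, so the result is easy to inspect) to show both styles.

```python
def describe(val, val2):
    """Return both arguments joined in the order the parameters are declared."""
    return '{} {}'.format(val, val2)


# Positional: matched left to right
print(describe('Hello', 'World'))            # Hello World
# Keyword: matched by parameter name, so order at the call site is free
print(describe(val2='World', val='Hello'))   # Hello World
# Arguments do not have to be strings
print(describe(47, 'Thunderbiscuit'))        # 47 Thunderbiscuit
```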

Optional Arguments

We can specify default values for a function’s arguments, which makes those arguments optional.

def some_func(arg1='Mickey'):
    print(arg1)

# Prints 'Mouse'
some_func('Mouse')

# Prints 'Mickey'
some_func()

Since this function has optional arguments, we can either pass it our own argument, or we can just use the default. The first call passes ‘Mouse’ to some_func, in which case arg1 = ‘Mouse’. The second call does not specify a value, so arg1 gets the default ‘Mickey’ value.
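
Required and optional arguments can be mixed in one function, as long as the parameters with defaults come after the ones without. Here is a minimal sketch using a hypothetical greet function.

```python
def greet(name, greeting='Hello'):
    """Build a greeting; greeting is optional, name is required."""
    return '{}, {}!'.format(greeting, name)


print(greet('Mickey'))                 # Hello, Mickey!
print(greet('Mickey', 'Goodbye'))      # Goodbye, Mickey!
print(greet('Mickey', greeting='Hi'))  # Hi, Mickey!
```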

Lists—Python

Lists are a sequence type object that Python provides to us for grouping data together into a single variable. Let’s consider a common application of a list before we go into details.

names = ['Bob Belcher',
         'Linda Belcher',
         'Tina Belcher',
         'Gene Belcher',
         'Louise Belcher']
for name in names:
    print(name)

This code creates a list called names and populates it with five names. Then it prints the names to the console. We could accomplish the same output by using this code.

bob = 'Bob Belcher'
linda = 'Linda Belcher'
tina = 'Tina Belcher'
gene = 'Gene Belcher'
louise = 'Louise Belcher'

print(bob)
print(linda)
print(tina)
print(gene)
print(louise)

A quick comparison shows that the first example is not only much easier to read, but it is also more maintainable. If we want to print additional names, we only need to add them to the names list in the first example. However, in the second example, we need to add a new name variable and another print statement for each one. This may not seem like a big deal with five names, but 5,000 is a very different case.

It’s for this reason that almost every programming language provides some sort of a collection object. List is one such data type that Python provides out of the box. Let’s look at common list operations.

Check if item is in the list

We may want to check if a list has a certain item.

names = ['Bob Belcher',
         'Linda Belcher',
         'Tina Belcher',
         'Gene Belcher',
         'Louise Belcher']
if 'Bob Belcher' in names:
    print('Found Bob')

This code prints ‘Found Bob’ because 'Bob Belcher' in names returns True.

Check if item is not in list

We can also check if a list does not have an item.

names = ['Bob Belcher',
         'Linda Belcher',
         'Tina Belcher',
         'Gene Belcher',
         'Louise Belcher']
if 'Teddy' not in names:
    print('No Teddy here!')

This code would print ‘No Teddy here!’ because 'Teddy' not in names is True. Our names list does not have ‘Teddy’.

Combine lists

We can add two lists together (called concatenation) using the + operator.

belchers = ['Bob Belcher',
            'Linda Belcher',
            'Tina Belcher',
            'Gene Belcher',
            'Louise Belcher']

pestos = ['Jimmy Pesto',
          'Jimmy Pesto Jr.',
          'Andy Pesto',
          'Ollie Pesto']

family_frackus = belchers + pestos
print(family_frackus)

This code will print all of the names to standard out when run.
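
One detail worth seeing in code is that + builds a new list and leaves both originals alone, while the extend() method modifies a list in place. A short sketch with shortened lists:

```python
belchers = ['Bob Belcher', 'Linda Belcher']
pestos = ['Jimmy Pesto', 'Andy Pesto']

# + returns a brand new list
combined = belchers + pestos
print(len(combined))   # 4
print(len(belchers))   # 2, the original is untouched

# extend() adds the items to the existing list instead
belchers.extend(pestos)
print(len(belchers))   # 4
```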

Accessing items

Lists use the index operator to access items.

belchers = ['Bob Belcher',
            'Linda Belcher',
            'Tina Belcher',
            'Gene Belcher',
            'Louise Belcher']
print(belchers[0])
print(belchers[3])

Keep in mind that lists are 0 based, so belchers[0] is ‘Bob Belcher’ while belchers[3] is ‘Gene Belcher’.

Add item to list

We can add an item to a list using the append() method.

belchers = ['Bob Belcher',
            'Linda Belcher',
            'Tina Belcher',
            'Gene Belcher',
            'Louise Belcher']
belchers.append('Teddy')
print(belchers[5])
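
The append() method always adds to the end of the list. When the position matters, the insert() method takes an index as well. A brief sketch with a shortened list:

```python
belchers = ['Bob Belcher', 'Linda Belcher']

belchers.append('Teddy')      # goes to the end
belchers.insert(0, 'Mort')    # goes to index 0, shifting the rest right
print(belchers)
```

After both calls the list is ['Mort', 'Bob Belcher', 'Linda Belcher', 'Teddy'].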

Remove item from a list

We use the del statement to remove an item from a list by its index.

belchers = ['Bob Belcher',
            'Linda Belcher',
            'Tina Belcher',
            'Gene Belcher',
            'Louise Belcher']
del belchers[0]

Using del belchers[0] removes ‘Bob Belcher’ from the list.
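
The del statement removes by index. Lists also offer remove(), which deletes by value, and pop(), which deletes an item and hands it back. A short sketch:

```python
belchers = ['Bob Belcher', 'Linda Belcher', 'Tina Belcher', 'Gene Belcher']

belchers.remove('Tina Belcher')   # remove by value (first match)
last = belchers.pop()             # remove and return the last item
print(last)       # Gene Belcher
print(belchers)   # ['Bob Belcher', 'Linda Belcher']
```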

Replace item in a list

We can replace an item in a list by specifying the index and assigning a new value to it.

belchers = ['Bob Belcher',
            'Linda Belcher',
            'Tina Belcher',
            'Gene Belcher',
            'Louise Belcher']
belchers[0] = 'Mort'

The code belchers[0] = 'Mort' replaces ‘Bob Belcher’ with ‘Mort’.

Length of the list

We get the length of the list using len.

belchers = ['Bob Belcher',
            'Linda Belcher',
            'Tina Belcher',
            'Gene Belcher',
            'Louise Belcher']
print(len(belchers))

The call print(len(belchers)) prints 5 to the console.

Conclusion

We can do a lot more with lists than what was discussed in this post. Make sure you check out the Python documentation for a complete list of features!

Strings—Python

Python has native support for strings (which are basically text). Many computer programs need to process text data and Python’s string type has powerful features to make the lives of developers easy!

String Literals

Here are a few examples of how to create strings in Python using literals.

sq = 'String made with Single Quotes'
dq = "String made with double quotes"
tq = """String with triple quotes"""

Double quote strings let you embed an apostrophe character without escaping it. So you can write “I’m a cat” as a string literal in Python. Triple quote strings can span multiple lines and keep their white space and line breaks.
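
Here is a brief sketch of those quoting rules in action. The sample text is made up for illustration.

```python
cat = "I'm a cat"            # apostrophe needs no escape inside double quotes
same = 'I\'m a cat'          # inside single quotes it must be escaped
poem = """Roses are red,
Violets are blue"""          # triple quotes can span lines

print(cat == same)             # True
print(len(poem.splitlines()))  # 2
```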

Individual Characters

Python strings support the index operator, so you can access characters in a string using [n] where n is the position of the character you wish to access. Here are a few ways to process strings by individual characters.

kitties = 'I like kitties'

# Access by index
for i in range(0, len(kitties)):
    print(kitties[i])

# Access with iteration
for c in kitties:
    print(c)

Useful Methods

The Python string class has many useful methods. You can view them all in the Python documentation. Here are some of the ones I use the most.

islower()

This checks if a string is all lower case characters.

kitties = 'i like kitties'
if kitties.islower():
    print(kitties, 'is lower case')
else:
    print(kitties, 'is not lower case')

lower()

Python strings are immutable, so lower() does not change the original; it returns a new string converted to all lower case.

kitties = 'I like kitties'

if not kitties.islower():
    kittens = kitties.lower()
    print(kittens) 

isupper()

This method tests if a string is all upper case.

kitties = 'I LIKE KITTIES'

if kitties.isupper():
    print(kitties, 'is upper case')
else:
    print(kitties, 'is not upper case')

upper()

This method returns a copy of the string converted to upper case.

kitties = 'i like kitties'

if not kitties.isupper():
    cat = kitties.upper()
    print(cat)

isnumeric()

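Checks whether every character in the string is numeric. This is useful before converting a string to an int, though it returns False for strings that contain a sign or a decimal point, such as '-3' or '3.5'.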

number_str = '3'

if number_str.isnumeric():
    number = int(number_str)
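
An alternative to checking first is to attempt the conversion and handle the failure. This is a sketch of that approach, not a replacement for the example above; the helper name to_int is an assumption for illustration.

```python
def to_int(text):
    """Return text as an int, or None if it cannot be converted."""
    try:
        return int(text)
    except ValueError:
        return None


print(to_int('3'))       # 3
print(to_int('-3'))      # -3
print(to_int('kitty'))   # None
print('-3'.isnumeric())  # False
```

The try/except version accepts negative numbers, which isnumeric() alone would reject.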

format()

The format() method fills the {} placeholders in a template string with the values you pass in.

fm = 'I have {} kitties'
print(fm.format(3))
# Prints: I have 3 kitties
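
Templates can hold more than one placeholder, and placeholders can be named. The sketch below also shows an f-string, which is available from Python 3.6 onward. The variable names are made up for illustration.

```python
count = 3
name = 'Tina'

# Positional placeholders are filled left to right
print('{} has {} kitties'.format(name, count))
# Named placeholders make long templates easier to read
print('{who} has {n} kitties'.format(who=name, n=count))
# f-strings (Python 3.6+) embed expressions directly
print(f'{name} has {count + 1} kitties')
```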

Numbers—Python

Computer programs need to track numeric data every single day. You may want to track balances in bank accounts, scientific numbers, or just counters. Python gives us a wide variety of numeric types. Here are some of the more common ones:

  • Integers
  • Floats
  • Boolean
  • Decimals
  • Fractions

Integers

Integers are whole numbers that are positive, negative, or zero. Unlike languages such as Java or C++ (or Python 2.x), Python 3 does not distinguish between regular and long integers. Here are some examples of declaring integers.

positiveNumber = 123456
negativeNumber = -111
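
Because there is no separate long type, an integer simply grows as large as it needs to. A quick sketch:

```python
big = 2 ** 100
print(big)              # 1267650600228229401496703205376
print(type(big))        # <class 'int'>
print(7 // 2, 7 / 2)    # 3 3.5
```

The last line shows the two division operators: // keeps the result a whole number, while / always produces a float.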

Floating Point Numbers

Floating point numbers are numbers that have a decimal point or are written in scientific notation. They can be positive or negative.

dollar = 3.15
pi = 3.14159
sci = 8.2e10

Octal, hex, binary

Here are examples of integer literals written in octal, hexadecimal, and binary. All three represent the same value, 159.

octal = 0o237
hexadecimal = 0x9F
binary = 0b10011111
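
Going the other way, the built-in oct(), hex(), and bin() functions turn an integer into text in each base, and int() can parse text in a given base. A short sketch using the value 159:

```python
value = 159

print(oct(value))   # 0o237
print(hex(value))   # 0x9f
print(bin(value))   # 0b10011111

# int() accepts a base as its second argument when parsing text
print(int('9F', 16))        # 159
print(int('10011111', 2))   # 159
```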

Decimals and Fractions

Python provides us with Decimal and Fraction data types, which maintain precision where floats cannot. A loss of precision may not get noticed in trivial programs, but when working with big datasets, it can introduce major bugs in the program.

from decimal import Decimal
from fractions import Fraction
d = Decimal('1.59')
f = Fraction(1, 3) # Numerator / denominator
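
The precision difference is easy to see with a classic float comparison. This is only a small sketch of the effect:

```python
from decimal import Decimal
from fractions import Fraction

print(0.1 + 0.2 == 0.3)                                    # False
print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))   # True
print(Fraction(1, 3) * 3 == 1)                             # True
```

Note that the Decimal values are built from strings. Building them from float literals would carry the float's rounding error along.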

Boolean

Booleans hold one of two values, True or False.

b = True
f = False

For more information

You can learn more by visiting the Python documentation.

Python Unit Testing

Unit testing is a critical portion of any significant software project. Although adding unit tests increases the size of your project’s code base, well written unit tests let us maintain confidence in our code base.

Well designed code should work well as standalone or mostly standalone software components. This is true of both procedural code and OOP. Unit tests exercise these components to make sure they continue to work as expected. This helps development because if a software component breaks an expected interface or starts behaving in an unexpected fashion, we will know about the issue prior to building or deploying our application.

Many developers (including myself) prefer to know about bugs before users see them. Writing good unit tests is one of many tools that help us catch bugs before they make it out into production code. This post will walk us through Python’s unit testing framework.

Example Test Class and Unit Test

Let’s start by creating a class that we are going to unit test. We are going to make a Greeter class that takes a name and a Gender enumeration value.

from enum import Enum


class Gender(Enum):
    MALE = "m"
    FEMALE = "f"


class Greeter:
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender

    def greet(self):
        if self.gender == Gender.MALE:
            return 'Hello Mr. {}'.format(self.name)
        elif self.gender == Gender.FEMALE:
            return 'Hello Ms. {}'.format(self.name)

What we are expecting the Greeter.greet method to do is return a greeting that contains either Mr or Ms depending on the gender. Let’s make a test that makes sure we are getting the correct output.

import unittest


class TestGreeter(unittest.TestCase):
    def test_greet(self):
        # Create a Greeter Object to test
        greeter_male = Greeter('Jonny', Gender.MALE)

        # Now check it is working properly
        self.assertTrue('Mr' in greeter_male.greet(), 'Expected Mr')

        # Now check female
        greeter_female = Greeter('Jane', Gender.FEMALE)
        self.assertTrue('Ms.' in greeter_female.greet(), 'Expected Ms')

if __name__ == '__main__':
    # This invokes all unit tests
    unittest.main()

We get the following output in the console. It’s pretty boring.

Ran 1 test in 0.002s

OK

Catching Bugs

Our first test is what we see if we have a working class and unit test. It’s very boring and we want boring. In many software projects, we tend to change things as we add new features, fix bugs, or improve code. If our changes break things in our code base, we want our unit tests to tell us about the issue.

Let’s make a small change in our Greeter class.

class Greeter:
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender

    def greet(self):
        if self.gender == Gender.MALE:
            return 'Hello Mr. {}'.format(self.name)
        elif self.gender == Gender.FEMALE:
            # Changed Ms. to Mrs.
            return 'Hello Mrs. {}'.format(self.name)

Now let’s run our test and see what happens.

Failure
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/case.py", line 601, in run
    testMethod()
  File "/Users/stonesoup/PycharmProjects/stonesoupprogramming/unit_test_demo.py", line 35, in test_greet
    self.assertTrue('Ms.' in greeter_female.greet(), 'Expected Ms')
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/unittest/case.py", line 678, in assertTrue
    raise self.failureException(msg)
AssertionError: False is not true : Expected Ms


Ran 1 test in 0.013s

FAILED (failures=1)

If you are looking closely, we changed Ms. to Mrs. in our greeting. Given how small the change is, it’s really easy for our human eyes to overlook it, and anyone can imagine how easy it would be for this bug to make it into production. Since our unit test is well written, we know about this bug right away!

If we did want our message to use Mrs rather than Ms, we would need to update our unit test. That’s a good thing because it makes us think about how changes to our code impact the code base in general. Unit tests are so helpful that many developers have even adopted the “Test Driven Development” programming discipline.
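
As a variation on the test above, here is a sketch that checks the exact greeting with assertEqual and splits the two cases into separate test methods, so a failure points at one case. The class name TestGreeterExact is an assumption for illustration, and the tests are run through a loader and runner rather than unittest.main() so the script keeps going afterwards.

```python
import unittest
from enum import Enum


class Gender(Enum):
    MALE = "m"
    FEMALE = "f"


class Greeter:
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender

    def greet(self):
        if self.gender == Gender.MALE:
            return 'Hello Mr. {}'.format(self.name)
        elif self.gender == Gender.FEMALE:
            return 'Hello Ms. {}'.format(self.name)


class TestGreeterExact(unittest.TestCase):
    def test_male(self):
        self.assertEqual(Greeter('Jonny', Gender.MALE).greet(), 'Hello Mr. Jonny')

    def test_female(self):
        self.assertEqual(Greeter('Jane', Gender.FEMALE).greet(), 'Hello Ms. Jane')


# Run the tests programmatically instead of calling unittest.main()
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGreeterExact)
result = unittest.TextTestRunner().run(suite)
print(result.wasSuccessful())
```

When assertEqual fails, the report shows both the expected and the actual string, which tends to be easier to read than "False is not true".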

You can learn more about Python’s unit testing framework in the official Python documentation for the unittest module.

Enumerations—Python

Enumerations are a way to group constants together and improve code readability and type checking. Here is an example of an enumeration in Python.

from enum import Enum
from random import randint


class Color(Enum):
    RED = 1
    BLUE = 2
    GREEN = 3


def pick_color():
    pick = randint(1, 3)
    if pick == Color.RED.value:
        return Color.RED
    elif pick == Color.BLUE.value:
        return Color.BLUE
    elif pick == Color.GREEN.value:
        return Color.GREEN


def print_color(color):
    if color == Color.RED:
        print('Red')
    elif color == Color.GREEN:
        print('Green')
    elif color == Color.BLUE:
        print('Blue')


if __name__ == '__main__':
    color = pick_color()
    print_color(color)

Python enumerations extend the Enum class. After inheriting from Enum, we list the members of our enumeration and assign each one a constant value.

The pick_color() function returns a randomly picked enumeration. We then pass that value to print_color().

You’ll notice that print_color accepts a color object and does comparisons against the members of the Color enumeration. You can see that the code is much more readable (and also more robust) than using literals such as 1, 2, or 3 in our code. The other nice aspect of using an enumeration is that we can change the values of our constants without breaking code and we can add more constants if needed.
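
Enum members can also be looked up directly by value or by name, and the class itself can be iterated. The sketch below uses those features to pick a random member without comparing integers; it is one possible alternative to pick_color(), not the approach used in the example above.

```python
from enum import Enum
from random import choice


class Color(Enum):
    RED = 1
    BLUE = 2
    GREEN = 3


print(Color(2))         # Color.BLUE, lookup by value
print(Color['GREEN'])   # Color.GREEN, lookup by name
print([c.name for c in Color])  # ['RED', 'BLUE', 'GREEN']

# Picking a random member without comparing integers
color = choice(list(Color))
print(color in Color)   # True
```

With this approach, adding a new member to Color automatically makes it eligible to be picked.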