Kotlin Access Modifiers

Encapsulation is a huge part of OOP. Hiding data and behavior is an important part of achieving encapsulation, so Kotlin provides us with access modifiers to help in this effort.

Kotlin has four kinds of access modifiers:

  • private: Items marked as private are visible only inside the class that declares them (or, for top-level declarations, inside the file that declares them)
  • protected: Items marked as protected are visible to the class and its child classes
  • internal: Items marked as internal are accessible to all code in the same Kotlin module but are not available outside of the module
  • public: Items marked as public are available to the module and, through an import statement, to code outside of the module

Let’s walk through some examples.

Public

Public is the default visibility in Kotlin. If no access modifier is written, the member is public.

      fun printBurgerOfTheDay(burgerName : String = "Never been Feta")
              = println(burgerName)
      

The printBurgerOfTheDay function is accessible to all code within the module that contains it, plus any module that imports the printBurgerOfTheDay function.

Internal

Members marked as internal are visible to the module but are not accessible outside of the module. In other words, internal members act like public members, but they can’t be imported into other modules.

      internal var burgerName = "Mission A-Corn-Plished Burger"
      

The above variable can be accessed anywhere in the module. It may not be imported into another module.

Protected

While public and internal can be applied to both class members and top-level declarations, protected applies only to members of a class.

      open class Cook {
          protected val position = "Cook"
          private val name = "No Name"
      
          override fun toString(): String {
              return position + ", " + name
          }
      }
      
      class Bob : Cook() {
      
          fun printPosition() = println(position)
      }
      

In the above example, we have a Cook and a Bob class. The Cook class has a protected property called position. The position property is accessible to Cook, but it’s also accessible to Bob because Bob is a child class of Cook. Bob has a printPosition() function that uses the position property.

Private

Private access is the most restrictive. When a member is marked as private, it is only accessible to the class that declares it. In the above code snippet, Cook also has a name property, but the name property is not accessible to Bob because it is private.
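As a minimal sketch of why private access is useful, consider the hypothetical CashRegister class below. It is not part of the burger examples above; the class name, property, and methods are assumptions made for illustration. The total is private, so outside code can only change it through ringUp(), which gets a chance to reject bad input.

```kotlin
// Hypothetical class for illustration; it is not part of the examples above.
class CashRegister {
    // Only code inside CashRegister can read or change this property.
    private var totalCents = 0

    // Public by default: the only way for outside code to change the total.
    fun ringUp(priceCents: Int) {
        require(priceCents > 0) { "Price must be positive" }
        totalCents += priceCents
    }

    // Read-only view of the private state.
    fun total(): Int = totalCents
}
```

Writing register.totalCents from outside the class would be rejected by the compiler, while register.ringUp(595) followed by register.total() works as expected.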

Putting it all together

Below is an example program that shows all of the access modifiers discussed above.

      package ch1.accessmodifiers
      
      /**
       * When no access modifier is used, public is used by default.
       * This printBurgerOfTheDay function is visible to the entire program
       */
      fun printBurgerOfTheDay(burgerName : String = "Never been Feta")
              = println(burgerName)
      
      /**
       * This extension function is marked private. It can only be used inside this file
       */
      private fun String.makeBurgerOfTheDay(burgerName : String) : String = burgerName
      
      /**
       * This variable is marked as internal. It's visible throughout the module,
       * but it can't be accessed outside of the module
       */
      internal var burgerName = "Mission A-Corn-Plished Burger"
      
      /**
       * The position property on Cook is marked as protected. It is only accessible
       * to the Cook class and its child classes. The name property is private and may
       * only be used by Cook.
       */
      open class Cook {
          protected val position = "Cook"
          private val name = "No Name"
      
          override fun toString(): String {
              return position + ", " + name
          }
      }
      
      class Bob : Cook() {
      
          fun printPosition() = println(position)
      }
      
      fun main(args : Array<String>){
          //Using the public function printBurgerOfTheDay
          printBurgerOfTheDay()
      
          //Printing the internal burgerName variable
          printBurgerOfTheDay(burgerName)
      
          //Create an instance of Cook
          val cook = Cook()
          println(cook)
      
          //Create an instance of Bob
          val bob = Bob()
          bob.printPosition()
      
          //Use the private String extension function. This works because main
          //is in the same file; code in any other file could not call it
          burgerName = "".makeBurgerOfTheDay("Rest in Peas Burger")
      }
      

Here is the output when run:

      Never been Feta
      Mission A-Corn-Plished Burger
      Cook, No Name
      Cook
      

References

https://kotlinlang.org/docs/reference/visibility-modifiers.html

Kotlin Constructors

Many OOP languages have special code that is used to initialize a class to a valid state. Such code is referred to as a constructor method. The constructor runs when an instance of a class is created.

Like all methods, constructors can have zero or more parameters. Let’s consider the constructors found in the ArrayList class in the java.util package.

ArrayList<?> list = new ArrayList<>(); //Default constructor
ArrayList<?> list2 = new ArrayList<>(list); //Secondary constructor

The above Java code snippet demonstrates multiple constructors. The first constructor is called the default constructor and accepts no arguments. It creates an empty ArrayList object. The other constructor takes an existing collection object and adds every object contained in that collection to the new list. Let’s walk through the various forms of constructors we can define in Kotlin.

Default Constructors

When no constructor is specified, Kotlin will supply an empty, no-argument constructor.

class ChalkBoard {
    val message = "New Baconings"
}

val chalkBoard = ChalkBoard()
println(chalkBoard.message)

The ChalkBoard class declares no constructor. When we create a ChalkBoard object, we just use the () for the default constructor.

Constructor with Required Parameters

We can also define constructors that force us to use valid data.

class Bob(val position : String)

val bob = Bob("Cook")
println(bob.position)

Notice how the Bob class has a position parameter of type String. When we create an instance of Bob, we have to supply a String to the constructor. There is no default constructor for Bob. Since we used the val keyword in front of the position parameter, Kotlin created a position property on Bob. This lets us print Bob’s position in the println statement.

Constructor with Initialization Block

We aren’t limited to just setting properties in a constructor. Kotlin also lets us create an initialization block that runs immediately after our properties have been set.

class Linda(val position : String){
    //This is the initialization block that runs when an instance
    //of this class is created
    init {
        check(position.isNotBlank()) { "Linda needs a job!" }
        println("Linda was created with $position")
    }
}

val linda = Linda("Cook's wife")
println(linda.position)

The above code still sets Linda’s position property, but when the constructor runs, it will also print “Linda was created with [specified position]” because of the init block in the class body, as long as position is not blank. If position is blank, the check call in the init block throws an exception with the message “Linda needs a job!”. Init blocks can do other things besides validation, but validation is certainly a common use case of the init block.
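To see the failure path, here is a small sketch built around a hypothetical Mort class that copies Linda’s init block. The class and the describeHire helper are assumptions made for illustration. Kotlin’s check function throws an IllegalStateException when its condition is false, so the helper catches that exception and reports the message.

```kotlin
// Hypothetical class mirroring Linda's init block.
class Mort(val position: String) {
    init {
        // check throws IllegalStateException with this message when the condition is false.
        check(position.isNotBlank()) { "Mort needs a job!" }
    }
}

// Returns the position on success, or the exception message if construction fails.
fun describeHire(position: String): String =
    try {
        Mort(position).position
    } catch (e: IllegalStateException) {
        "Rejected: ${e.message}"
    }
```

Calling describeHire("Mortician") returns the position, while describeHire("  ") returns the rejection message because no Mort object is ever created.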

Property Initializers

Kotlin also provides property initializers. Let’s consider Gene.

class Gene(position : String){
    //This is the property initialization section
    val beefSquash = position.uppercase()
}

val gene = Gene("Beefsquash")
println(gene.beefSquash)

Gene has a constructor that takes a position, but notice that there is no val or var keyword prior to position. Since the var/val keyword is missing, Kotlin does not generate a position property for Gene. On the first line of the class body, we have val beefSquash = position.toUpperCase(). The code creates a beefSquash property on Gene and sets it to the upper case value of position. This lets us create properties outside of the constructor and initialize them to the arguments found in the constructor.
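Property initializers and init blocks run from top to bottom in the order they appear in the class body. The sketch below uses a hypothetical Kitchen class (an assumption for illustration) that records each step in a log so the order can be inspected.

```kotlin
// Hypothetical class that records the order in which its parts are initialized.
class Kitchen(name: String) {
    // Declared first so the later initializers can write to it.
    val log = mutableListOf<String>()

    // Property initializer that uses the constructor parameter.
    val sign = "Welcome to $name".also { log.add("sign initializer") }

    init {
        log.add("first init block")
    }

    // Declared after the init block above, so it runs after it.
    val menu = listOf("burger", "fries").also { log.add("menu initializer") }

    init {
        log.add("second init block")
    }
}
```

After creating a Kitchen, its log lists the sign initializer, the first init block, the menu initializer, and the second init block, in that order.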

Access modifiers on Constructors

A common OOP pattern is to create private constructors and use factory or builder objects to create an instance of a class. Our Tina class shows a stripped down example of the factory pattern.

class Tina private constructor(val position : String){

    //A companion object lets us call getInstance() without an existing Tina instance
    companion object {
        //We can only use the constructor from within Tina because it's private
        fun getInstance(position : String) = Tina(position)
    }
}

//Can't do this because Tina's constructor is private
//val tina = Tina("Cook");

//We can do this
val tina = Tina.getInstance("Itchy Grill Cook")
println(tina.position)

Whenever we need to add access modifiers or annotations to a constructor, the constructor keyword is required. In this case, we have private constructor in front of the parameters of Tina’s constructor. Only code found within the Tina class may use the constructor because private restricts the visibility of the constructor. Tina objects can only be created by invoking the getInstance() method which creates and returns a Tina object.

Multiple Constructors

Kotlin doesn’t limit us to using only one constructor. We are actually free to have as many constructors as we need.

class Louise (val position: String){

    //Calling this(...) after the colon but before the { will invoke the
    //first constructor
    constructor(position: String, age : Int): this(position){
        println("Inside secondary constructor with $position and $age")
    }
}
//Call the single argument constructor
val louiseSingle = Louise("Bunny Ears")

//Call the secondary constructor
val louise = Louise("Bunny Ears", 10)
println(louise.position)

Louise has a regular (primary) constructor, but inside of the class body, we can define additional constructors by using the constructor keyword and then specifying our parameters. We are free to reuse code between our constructors by using the this keyword followed by the arguments for the constructor we wish to call. Kotlin uses overload resolution to pick the correct constructor (or issues a compiler error if one can’t be found). In the example, we see two different ways to create a Louise object. We can use her primary constructor, or her secondary constructor, which also prints “Inside secondary constructor with $position and $age” to the console.
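A secondary constructor always delegates to the primary constructor first, so property initializers and init blocks finish before the secondary constructor’s own body runs. Here is a rough sketch using a hypothetical Order class (an assumption for illustration) that records each step.

```kotlin
// Hypothetical class that records which constructor code runs, and in what order.
class Order(val item: String) {
    val steps = mutableListOf("primary constructor set item=$item")

    init {
        steps.add("init block")
    }

    // Delegates to the primary constructor first, then runs its own body.
    constructor(item: String, quantity: Int) : this(item) {
        steps.add("secondary constructor with quantity=$quantity")
    }
}
```

Order("fries") records two steps, while Order("burger", 2) records the same two steps followed by the secondary constructor’s step.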

Optional Arguments

The final constructor technique involves using optional arguments to specify a default value if a parameter is omitted. Let’s take a look at Teddy.

class Teddy (val position : String = "customer", val favoriteFood : String = "burger")

val teddyDefault = Teddy()
println(teddyDefault.position + ", " + teddyDefault.favoriteFood)

val teddyArguments = Teddy(position = "Best Customer", favoriteFood = "Burger of the Day")
println(teddyArguments.position + ", " + teddyArguments.favoriteFood)

val teddyFood = Teddy(favoriteFood = "Burger of the day")
println(teddyFood.position + ", " + teddyFood.favoriteFood)

val teddyPosition = Teddy(position = "Best Customer")
println(teddyPosition.position + ", " + teddyPosition.favoriteFood)

Teddy declares a single constructor, but the default values let us call it in four different ways. The first call passes no arguments, which initializes Teddy’s position to customer and his favoriteFood to burger. The second call lets us specify both Teddy’s position and his favoriteFood. The third call lets us specify Teddy’s favoriteFood and use his default position. Finally, we can specify Teddy’s position but use the default for his favorite food.
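Default values also work with positional arguments, not only named ones. The sketch below uses a hypothetical Customer class with the same shape as Teddy; the class, the describe method, and the customerExamples helper are assumptions made for illustration.

```kotlin
// Hypothetical class with the same shape as Teddy: one constructor, two defaults.
class Customer(val position: String = "customer", val favoriteFood: String = "burger") {
    fun describe(): String = "$position, $favoriteFood"
}

// Each call below goes to the same constructor; omitted arguments use their defaults.
fun customerExamples(): List<String> = listOf(
    Customer().describe(),                       // both defaults
    Customer("regular").describe(),              // positional: fills position
    Customer(favoriteFood = "fries").describe()  // named: skips position
)
```

A positional argument always fills parameters from left to right, which is why skipping position and supplying only favoriteFood requires the named form.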

Putting it all Together

package ch1.constructors

class ChalkBoard {
    val message = "New Baconings"
}

/**
 * Kotlin class whose constructor requires a position
 */
class Bob(val position : String)

/**
 * Kotlin class with initialization block
 */
class Linda(val position : String){
    //This is the initialization block that runs when an instance
    //of this class is created
    init {
        check(position.isNotBlank()) { "Linda needs a job!" }
        println("Linda was created with $position")
    }
}

/**
 * Kotlin class with a property initializer
 */
class Gene(position : String){
    //This is the property initialization section
    val beefSquash = position.uppercase()
}

/**
 * Kotlin class with private constructor
 */
class Tina private constructor(val position : String){

    //We have to use a companion object to make getInstance() visible
    companion object {
        //We can only use the constructor from within Tina because it's private
        fun getInstance(position : String) = Tina(position)
    }
}

/**
 * Kotlin class with multiple constructors
 */
class Louise (val position: String){

    //Calling this(...) after the colon but before the { will invoke the
    //first constructor
    constructor(position: String, age : Int): this(position){
        println("Inside secondary constructor with $position and $age")
    }
}

/**
 * Kotlin class that has a constructor with optional arguments. We can use one,
 * both, or none of the arguments when we create an instance of Teddy
 */
class Teddy (val position : String = "customer", val favoriteFood : String = "burger")

fun main(args : Array<String>){
    val chalkBoard = ChalkBoard()
    println(chalkBoard.message)

    val bob = Bob("Cook")
    println(bob.position)

    val linda = Linda("Cook's wife")
    println(linda.position)

    val gene = Gene("Beefsquash")
    println(gene.beefSquash)

    val tina = Tina.getInstance("Itchy Grill Cook")
    println(tina.position)

    val louise = Louise("Bunny Ears", 10)
    println(louise.position)

    val teddyDefault = Teddy()
    println(teddyDefault.position + ", " + teddyDefault.favoriteFood)

    val teddyArguments = Teddy(position = "Best Customer", favoriteFood = "Burger of the Day")
    println(teddyArguments.position + ", " + teddyArguments.favoriteFood)

    val teddyFood = Teddy(favoriteFood = "Burger of the day")
    println(teddyFood.position + ", " + teddyFood.favoriteFood)

    val teddyPosition = Teddy(position = "Best Customer")
    println(teddyPosition.position + ", " + teddyPosition.favoriteFood)

}

Here’s the output

New Baconings
Cook
Linda was created with Cook's wife
Cook's wife
BEEFSQUASH
Itchy Grill Cook
Inside secondary constructor with Bunny Ears and 10
Bunny Ears
customer, burger
Best Customer, Burger of the Day
customer, Burger of the day
Best Customer, burger

References

https://kotlinlang.org/docs/reference/classes.html

Kotlin Polymorphism

Polymorphism allows computer code to become contextual. In other words, a computer instruction can take on different meanings depending on the situation in which the instruction is used. This is no different from how we speak. The same spoken sound can mean ‘there’, ‘they’re’, or ‘their’, and the listener relies on context to tell which one is meant.

Kotlin supports two forms of polymorphism because it is both strongly and statically typed. The first form of polymorphism happens when the code is compiled. The other form happens at runtime. Understanding both forms of polymorphism is critical when writing code in Kotlin.

Compile Time Polymorphism

Let’s consider the following code snippet.

fun printNumber(n : Number){
    println("Using printNumber(n : Number)")
    println(n.toString() + "\n")
}

fun printNumber(n : Int){
    println("Using printNumber(n : Int)")
    println(n.toString() + "\n")
}

fun printNumber(n : Double){
    println("Using printNumber(n : Double)")
    println(n.toString() + "\n")
}

We have three functions, all of which have the same name and return type (Unit, Kotlin’s counterpart to void). As a matter of fact, the only difference in the signature of these functions is the type of parameter that is used. The first printNumber accepts a Number variable. The second one accepts an Int variable, and the final one accepts a Double variable. This technique is called function overloading.

Since all functions have the same name, how do we know which one will be used when we have code such as the snippet below?

val a : Number = 99
val b = 1
val c = 3.1

printNumber(a) //Which version of printNumber is getting used?
printNumber(b) //Which version of printNumber is getting used?
printNumber(c) //Which version of printNumber is getting used?

The compiler sees three functions with the same name and knows it ultimately has to choose a version of printNumber to use on each statement. Since all three functions have the same name, the compiler turns to context clues to deduce which printNumber function should be used in each case. The variable ‘a’ is explicitly declared as type Number. Since there is a version of printNumber that accepts a Number parameter, that version of printNumber is chosen.

The next variable, ‘b’, is assigned the value one. The number one is an integer literal, and since we didn’t tell the compiler to make it a Number, its type is inferred as Int. When printNumber(b) is called, the compiler matches to the version of printNumber that accepts type Int. The final variable, ‘c’, is initialized to 3.1. The default type for that number is Double, so the compiler chooses printNumber(Double) as the correct version to call when ‘c’ is passed as an argument.

It should be noted that when the compiler can’t correctly pick a method, it will issue an error. We are free to overload functions as long as the method signatures are unique enough for the compiler to figure out which function should be used. So for example, functions may have the same return type provided that they take different parameters or a different number of parameters. What isn’t allowed is for two functions to have the same parameters but different return types. In that case, the compiler is unable to figure out which function to use and it will issue an error.
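One detail worth making concrete is that the compiler picks an overload from the declared (static) type of the argument, not from the value it holds at runtime. The following sketch uses hypothetical describe and pickOverloads functions (assumptions for illustration) that return the name of the overload that was chosen.

```kotlin
// Two hypothetical overloads that report which one was selected.
fun describe(n: Number): String = "Number"
fun describe(n: Int): String = "Int"

fun pickOverloads(): List<String> {
    val declaredAsNumber: Number = 99  // static type Number, even though the value is an Int
    val inferredAsInt = 99             // static type Int
    return listOf(describe(declaredAsNumber), describe(inferredAsInt))
}
```

Both variables hold the same value, yet the first call resolves to the Number overload and the second to the Int overload, because the choice is made at compile time.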

Function overloading is a powerful technique that allows us to write flexible code. The main use case is to write functions that have different behaviors based on the type of object. Imagine writing a function that converts objects to JSON (a form of data exchange). Since every object can be different, we would need different implementations of a function to correctly output JSON for each type of object. However, it doesn’t really make sense to write things like intToJson, arrayToJson, or carToJson. We can call all of these functions by the same name, toJson, and just use polymorphism to call the proper implementation.
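A rough sketch of the toJson idea might look like the following. These overloads are simplified assumptions made for illustration: they handle only Int, String, and List<Int>, and the String version does not escape special characters the way a real JSON library would.

```kotlin
// Simplified, hypothetical overloads; a real JSON encoder would also escape strings.
fun toJson(value: Int): String = value.toString()

fun toJson(value: String): String = "\"" + value + "\""

// Reuses the Int overload for each element of the list.
fun toJson(values: List<Int>): String =
    values.joinToString(separator = ",", prefix = "[", postfix = "]") { toJson(it) }
```

The caller writes toJson(42), toJson("burger"), or toJson(listOf(1, 2, 3)) and the compiler selects the matching implementation from the argument type.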

Runtime Polymorphism

Runtime polymorphism happens when the program is running. Since Kotlin is an OOP language, we can use classes and interfaces to refer to objects. For example, Number is a class while Int and Double are both child classes of Number. That means it is acceptable to use Int or Double values anywhere in our code where a Number is expected.

However, things go even further than using child classes as values for a base class. When a child class overrides a method defined in a base class, the program will use the child class’s method rather than the base class’s. Methods that behave this way are called virtual methods. Let’s consider Number’s toDouble() method.

The toDouble() method is defined in Number, which means all child classes of Number have some sort of an implementation of toDouble(). When we call toDouble() on an Int, the program knows to use the toDouble() defined in Int so that the output of toDouble() makes sense for an Int. Likewise, were we to call toDouble() on a BigDecimal object, we would get the version of that method defined in BigDecimal.
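The same mechanism works for our own classes. As a small sketch, the hypothetical Employee and Waiter classes below (assumptions for illustration) show a method call made through a base class reference being routed to the child class’s override at runtime.

```kotlin
// Hypothetical base class with an overridable method.
open class Employee {
    open fun greet(): String = "Hello"
}

// Child class that replaces the base behavior.
class Waiter : Employee() {
    override fun greet(): String = "Welcome, table for how many?"
}

// Only knows about Employee, yet each element answers with its own greet().
fun greetAll(staff: List<Employee>): List<String> = staff.map { it.greet() }
```

Passing listOf(Employee(), Waiter()) to greetAll yields one greeting from each class, even though greetAll never mentions Waiter.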

In our code example, let’s consider summing all of our number variables.

fun sum(numbers : List<Number>) : Number {
    return numbers.sumOf { it.toDouble() }
}

fun main(args : Array<String>){
    val a : Number = 99
    val b = 1
    val c = 3.1

    //Using runtime polymorphism
    println("Summing all numbers")
    println(sum(listOf(a, b, c)))
}

Since the variables ‘a’, ‘b’, and ‘c’ are all either of type Number or of a child class of Number, we can use them in the sum function. Inside of the function, we call it.toDouble() to convert the current number into a double. One of those numbers is the variable ‘b’, which is an Int. So when b.toDouble() gets called, the version associated with Int is used. However, when c.toDouble() gets called, the version of toDouble() associated with Double gets used.

Putting it together

Both forms of polymorphism allow for highly flexible code. Well-designed computer code should be written with generalization in mind. By targeting a base class or an interface, we can reuse the same code with different types of objects by using polymorphism. Likewise, function overloading improves the readability of code and its maintainability because we can call a function with the same name as other functions and trust that the proper function is used depending on the context.

Example Program

/**
 * The first 3 functions, all called printNumber, demonstrate function
 * overloading. This is polymorphism that is determined at compile time.
 * Basically, the compiler knows which function to use based on the type of the
 * input parameter n. So if n is an Int, it will use printNumber(n : Int).
 * If n is a double, it will use printNumber(n : Double). For all other numbers,
 * it will use printNumber(n : Number).
 */
fun printNumber(n : Number){
    println("Using printNumber(n : Number)")
    println(n.toString() + "\n")
}

fun printNumber(n : Int){
    println("Using printNumber(n : Int)")
    println(n.toString() + "\n")
}

fun printNumber(n : Double){
    println("Using printNumber(n : Double)")
    println(n.toString() + "\n")
}

/**
 * This function shows runtime polymorphism. In this case, all objects are of type Number.
 * Number has a toDouble() method which is different for each kind of number. However, since
 * all classes that extend Number must implement toDouble(), we can trust that longs, ints, floats,
 * etc can all make the conversion to a double when needed.
 */
fun sum(numbers : List<Number>) : Number {
    return numbers.sumOf { it.toDouble() }
}

fun main(args : Array<String>){
    val a : Number = 99
    val b = 1
    val c = 3.1

    //Using compile time polymorphism
    printNumber(a)
    printNumber(b)
    printNumber(c)

    //Using runtime polymorphism
    println("Summing all numbers")
    println(sum(listOf(a, b, c)))
}

Kotlin Inheritance

Inheritance is a core part of object-oriented programming. It provides a powerful code reuse tool by grouping common properties and behaviors into shared classes (called base classes) and placing unique properties and behaviors in specific classes that grow out from the base class (called child classes). The child class receives all properties and behaviors from its parent class, but it also contains properties and behaviors that are unique to itself. Inheritance allows for specialization of software components when a component has a specific need, but also allows for generalization when using items common to the parent.

Let’s consider an example that is more specific. Suppose we have a Vehicle class that models some sort of vehicle that we can drive. We start by creating a class.

open class Vehicle(
        private val make : String,
        private val model : String,
        private val year : Int) {

    fun start() = println("Starting up the motor")

    fun stop() = println("Turning off the engine")

    fun park() = println("Parking " + toString())

    open fun drive() = println("Driving " + toString())

    open fun reverse() = println("Reversing " + toString())

    override fun toString(): String {
        return "${year} ${make} ${model}"
    }
}

We know that all Vehicles have a make, model, and year. Users of a Vehicle can start and turn off the motor. They can also park, drive, or put the Vehicle in reverse. So far so good. However, later on, we need a vehicle that can tow a camper.

We could add a tow() method to Vehicle, but would that really make sense? What if our Vehicle is a Fiat? Would we really tow a camper with a Fiat? What we need is a Truck. Thanks to Inheritance, we don’t need to write Truck from scratch. We can simply create a specialized version of a Vehicle instead.

open class Truck(make: String,
            model: String,
            year: Int,
            private val towCapacity : Int) : Vehicle(make, model, year) {
    fun tow () = println("${toString()} is towing ${this.towCapacity} lbs")
}

This code creates a Truck class based off of Vehicle. As such, the Truck still has a make, model, and year. It can also start(), stop(), drive(), park(), and reverse() just like any other vehicle. However, it can also tow things and has a towCapacity property. What we have essentially done is reused all of the code from Vehicle and just changed what was needed so that we have a new Vehicle like object that also tows things.

Of course, later on, our needs change again and we decide to go camping in the mountains. It may snow in the mountains, so in addition to being able to tow things, we may want four wheel drive. Once again, not all Trucks have four wheel drive, so we don’t want to add a four wheel drive into Truck. As a matter of fact, four wheel drive isn’t even a specific behavior. What it really does is it enhances the already existing behaviors drive and reverse.

Let’s create another child class, based off of Truck, and specialize the driving behavior.

class FourWheelDrive(make: String, model: String, year: Int, towCapacity: Int) :
        Truck(make, model, year, towCapacity) {

    var fourByFour = false

    override fun drive() {
        if(fourByFour){
            println("Driving ${toString()} in four wheel drive")
        } else {
            super.drive()
        }
    }

    override fun reverse() {
        if(fourByFour){
            println("Reversing ${toString()} in four wheel drive")
        } else {
            super.reverse()
        }
    }
}

In this class, we are modifying the existing behaviors of driving and going in reverse. If the truck has four wheel drive turned on, the output of the program reflects this fact. On the other hand, if four wheel drive is turned off, drive() calls super.drive() and reverse() calls super.reverse(), which means use the behavior defined in Truck (which bubbles up to the original behavior in Vehicle). Thus, FourWheelDrive has specialized behaviors that were originally found in Vehicle. This is known as overriding behaviors.

Now let’s do a demonstration

fun main(args : Array<String>){
    val car = Vehicle("Fiat", "500", 2012)
    val truck = Truck("Chevy", "Silverado", 2017, 8000)
    val fourWheelDrive = FourWheelDrive("Dodge", "Ram", 2017, 8000)

    //drive() comes from Vehicle
    car.drive()
    println()

    //There is no drive() in Truck, but it 
    //inherited the behavior from Vehicle
    truck.drive()
    println()

    //FourWheelDrive overrides drive() to customize it
    fourWheelDrive.drive()
    println()

    println("Turn on four wheel drive")
    fourWheelDrive.fourByFour = true
    fourWheelDrive.drive()
}

Output

Driving 2012 Fiat 500

Driving 2017 Chevy Silverado

Driving 2017 Dodge Ram

Turn on four wheel drive
Driving 2017 Dodge Ram in four wheel drive

Three vehicles are made at the beginning of main: car, truck, and fourWheelDrive. They are all of type Vehicle, but Truck is a specialized case of Vehicle and fourWheelDrive is a specialized case of Truck. As such, all three objects have a drive() method which we use in the example program. When fourWheelDrive turns on fourByFour and then invokes drive, the console prints out that it is driving in four wheel drive.

Open and Final Classes

Although Kotlin supports inheritance, its use is discouraged. In order to use inheritance in Kotlin, the ‘open’ keyword needs to be added in front of the ‘class’ keyword first. If the ‘open’ keyword is omitted, the class is considered to be final and the compiler will not allow the class to be used as a parent class. Likewise, any function that child classes should be able to override also has to be marked as ‘open’ (see drive and reverse in Vehicle); otherwise, overriding it is not permitted.

Why would Kotlin choose to do this while other languages encourage inheritance? After studying issues found with inheritance, it became clear that many developers wrote classes without considering that a class may be extended later on. When changes were made in the parent class, the child classes could potentially break as well. This ended up creating a situation called “fragile base classes”.

Kotlin designers decided that by making developers mark classes as open, it would encourage developers to think about the needs of child classes when working on a base class. Kotlin also has powerful delegation mechanisms that encourage developers to use composition and delegation as code reuse mechanisms rather than inheritance.
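As a minimal sketch of the delegation mechanism mentioned above, Kotlin’s `by` keyword lets a class implement an interface by forwarding the calls to another object. The Engine, GasEngine, and Camper names below are assumptions made for illustration and are not part of the Vehicle example.

```kotlin
// Hypothetical interface describing one reusable behavior.
interface Engine {
    fun start(): String
}

class GasEngine : Engine {
    override fun start(): String = "Starting up the motor"
}

// Camper does not inherit from GasEngine. It implements Engine by
// forwarding every Engine call to the engine object it was given.
class Camper(engine: Engine) : Engine by engine {
    fun park(): String = "Parking the camper"
}
```

Camper(GasEngine()).start() reuses GasEngine’s behavior without either class being marked open, and a different Engine implementation can be swapped in without touching Camper.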

Example program

Here is the entire source code used in this post

//Has to be marked as open to allow inheritance
open class Vehicle(
        private val make : String,
        private val model : String,
        private val year : Int) {

    fun start() = println("Starting up the motor")

    fun stop() = println("Turning off the engine")

    fun park() = println("Parking " + toString())

    //Has to be marked as open to allow overriding
    open fun drive() = println("Driving " + toString())

    //Has to be marked as open to allow overriding
    open fun reverse() = println("Reversing " + toString())

    override fun toString(): String {
        return "${year} ${make} ${model}"
    }
}

//Has to be marked as open for inheritance
open class Truck(make: String,
            model: String,
            year: Int,
            private val towCapacity : Int) : Vehicle(make, model, year) {
    fun tow () = println("${toString()} is towing ${this.towCapacity} lbs")
}

//This class is not open and therefore cannot be inherited from
class FourWheelDrive(make: String, model: String, year: Int, towCapacity: Int) :
        Truck(make, model, year, towCapacity) {

    var fourByFour = false

    //The override keyword signals to the compiler that we are overriding
    //the drive() method
    override fun drive() {
        if(fourByFour){
            println("Driving ${toString()} in four wheel drive")
        } else {
            super.drive()
        }
    }

    //The override keyword signals to the compiler that we are overriding
    //the reverse() method
    override fun reverse() {
        if(fourByFour){
            println("Reversing ${toString()} in four wheel drive")
        } else {
            super.reverse()
        }
    }
}

fun main(args : Array<String>){
    val car = Vehicle("Fiat", "500", 2012)
    val truck = Truck("Chevy", "Silverado", 2017, 8000)
    val fourWheelDrive = FourWheelDrive("Dodge", "Ram", 2017, 8000)

    //drive() comes from Vehicle
    car.drive()
    println()

    //There is no drive() in Truck, but it
    //inherited the behavior from Vehicle
    truck.drive()
    println()

    //FourWheelDrive overrides drive() to customize it
    fourWheelDrive.drive()
    println()

    println("Turn on four wheel drive")
    fourWheelDrive.fourByFour = true
    fourWheelDrive.drive()
}

Kotlin Encapsulation and Procedural Programming

Software developers use the term Encapsulation to refer to grouping related data and behavior into a single unit, usually called a class. The class can be seen as the polar opposite of procedural programming, where data and behavior are treated as two distinct concerns. It should be noted that OOP and procedural programming each have their distinct advantages, and one should not be thought of as better than the other. Kotlin supports both styles of programming, and it’s not uncommon to see a mix of both procedural and OOP programming.

Procedural Programming Example

Let’s start with an example of procedural programming. In this example, we are working with a rectangle object. Its data is stored in a map (a data structure that holds key-value pairs), and separate functions consume the data.

val rectangle = mutableMapOf("Width" to 10, "Height" to 10, "Color" to "Red")

fun calcArea(shape : Map<String, Any>) : Int {
    return shape["Height"] as Int * shape["Width"] as Int
}

fun toString(shape : Map<String, Any>) : String {
    return "Width = ${shape["Width"]}, Height = ${shape["Height"]}, Color = ${shape["Color"]}, Area = ${calcArea(shape)}"
}

We begin with the rectangle object that holds some properties of our rectangle: width, height, and color. Two functions follow the creation of the rectangle: calcArea and toString. Notice that these are global functions that accept any Map. This is dangerous because we can’t guarantee that the map will have “Width”, “Height”, or “Color” keys. If “Width” or “Height” is missing, the lookup returns null and the cast to Int in calcArea fails at runtime. Another issue is our loss of type safety. Since we need to store both Ints and Strings in the rectangle map, the value type has to be Any, which is the root of Kotlin’s class hierarchy.

OOP

Here is the same problem solved with an OOP approach.

class Rectangle(
    var width : Int,
    var height : Int,
    var color : String){

    fun calcArea() = this.width * this.height

    override fun toString() =
            "Width = ${this.width}, Height = ${this.height}, Color = ${this.color}, Area = ${calcArea()}"
}

The OOP solution demonstrates encapsulation because the data and the behavior associated with the data are grouped into a single entity called a class. The data associated with a class are often referred to as “properties” while the behaviors defined in the class are usually called “methods”. The calcArea() and toString() methods are always guaranteed to work because all objects based on the Rectangle class always have width, height, and color. We also do not lose our type safety because we are free to declare each property as a distinct variable within the class along with its type.

When the calcArea() and toString() methods are used, the word ‘this’ refers to the object that is calling these methods. You will notice that unlike the procedural program above, there is no Rectangle parameter supplied to calcArea() or toString(). Instead, the ‘this’ keyword is updated to refer to the object that is currently in use.

Tips on how to choose between Procedural and OOP

Many software projects mix procedural and OOP programming. Almost anything that can be done with OOP can most likely be accomplished with procedural programming and vice versa. However, some problems are more easily solved procedurally, while others are better solved with OOP.

Procedural

  • Pure functional programming: When we work in terms of pure mathematical functions, where a function accepts certain inputs and returns certain outputs without side effects
  • Multi-threading: Procedural programming can help solve many challenges found in multi-threaded environments. The integrity of mutable data is always a concern in multi-threading, so functional programming works well provided the functions are pure functions that do not change data
  • Input and Output: In many cases, using a class to persist or retrieve an object from a data store is overkill. The same is true for printing to standard IO. Java has been heavily criticized for using the System.out.println() to write text to the console. Kotlin simplified this to println()

OOP and Encapsulation

  • GUI Toolkits: Objects representing buttons, windows, web pages, etc are very well modeled as classes
  • Grouping state or behavior: We often find that entities in software have properties or methods that are commonly held by other similar entities. For example, all road vehicles have wheels and move. Trucks are a specialized vehicle that has a box. Four-wheel drive trucks are specialized trucks that have four-wheel drive. We can use OOP to group all of the items common to all vehicles in a Vehicle class. All items common to Trucks can go in a Truck class, and finally, all items used only in four-wheel drive trucks can go in FourByFourTruck
  • Modularization: OOP allows developers to modularize code into smaller and reusable software components. Since the units of code are small pieces in a system, the code is usually easier to maintain.

Putting it together

Below is a working program that demonstrates both procedural programming and OOP.

package ch1

/**
 * This is a shape object without OOP. Notice how the data is separated from the behavior that works on the
 * data. The data is stored in a Map object, which uses key-value pairs. Then we have separate functions that
 * manipulate the data.
 */
val rectangle = mutableMapOf("Width" to 10, "Height" to 10, "Color" to "Red")

fun calcArea(shape : Map<String, Any>) : Int {
    //How can we guarantee that this map object has "Height" and "Width" property?
    return shape["Height"] as Int * shape["Width"] as Int
}

fun toString(shape : Map<String, Any>) : String {
    return "Width = ${shape["Width"]}, Height = ${shape["Height"]}, Color = ${shape["Color"]}, Area = ${calcArea(shape)}"
}

/**
 * This is a class that represents a Rectangle. You will immediately notice it has less code than the
 * non-OOP implementation. That's because the state (width, height, and color) is grouped together with
 * the behavior. Kotlin takes this a step further by providing us with behavior that lets us change
 * width, height, and color. We only need to add calcArea(), which we can guarantee will always work because
 * we know that there will always be width and height. Likewise, we know our toString() will never fail us
 * for the same reason!
 */
class Rectangle(
    var width : Int,
    var height : Int,
    var color : String){

    fun calcArea() = this.width * this.height

    override fun toString() =
            "Width = ${this.width}, Height = ${this.height}, Color = ${this.color}, Area = ${calcArea()}"
}

fun main(args : Array<String>){
    println("Using procedural programming")
    println(toString(rectangle))

    println("Changing width")
    rectangle["Width"] = 15
    println(toString(rectangle))

    println("Changing height")
    rectangle["Height"] = 80
    println(toString(rectangle))

    println("Changing color")
    rectangle["Color"] = "Blue"
    println(toString(rectangle))

    println("\n*****************************\n")
    println("Now using OOP")

    val square = Rectangle(10, 10, "Red")
    println(square)

    println("Changing height")
    square.height = 90
    println(square)

    println("Changing width")
    square.width = 40
    println(square)

    println("Changing color")
    square.color = "Blue"
    println(square)
}

Here is the output

Using procedural programming
Width = 10, Height = 10, Color = Red, Area = 100
Changing width
Width = 15, Height = 10, Color = Red, Area = 150
Changing height
Width = 15, Height = 80, Color = Red, Area = 1200
Changing color
Width = 15, Height = 80, Color = Blue, Area = 1200

*****************************

Now using OOP
Width = 10, Height = 10, Color = Red, Area = 100
Changing height
Width = 10, Height = 90, Color = Red, Area = 900
Changing width
Width = 40, Height = 90, Color = Red, Area = 3600
Changing color
Width = 40, Height = 90, Color = Blue, Area = 3600

OOP Abstraction

Abstraction is one of the major components of OOP. When we abstract, we hide the internal working details of something from its user. The user only cares about the controls that operate an object; how the object acts on those controls is of no concern to the user.

An everyday abstraction can be found in a smartphone’s operating system. When a user wishes to make a phone call, they do not worry about how the phone makes the call. All the user cares about is using the keypad to dial a phone number and then pressing the call button. The details of connecting to the cell phone tower and then routing the phone call through the phone network are of no concern to the user. Those details have been abstracted.

Kotlin provides a variety of ways to provide abstraction. In the example below, I used an interface to model a Vehicle.

interface Vehicle {
    fun park()

    fun drive()

    fun reverse()

    fun start()

    fun shutDown()
}

This code defines an abstraction point for all Vehicles. It guarantees that all classes that implement Vehicle have the following behaviors: park, drive, reverse, start, and shutDown. What we do not have is any detail as to how the Vehicle drives, parks, etc. As a matter of fact, none of the methods inside of Vehicle has a function body (such methods are called abstract methods).

We may wish to take our vehicle for a drive. When we drive our vehicle, we are only really concerned with what the vehicle can do. We don’t care how it parks or goes in reverse. Let’s see this example in terms of code.

fun takeForDrive(v : Vehicle){
    with(v){
        //How we start is abstracted. We only care that the vehicle starts, but
        //we don't care about how it starts.
        start()

        //Likewise, we only care that it goes in reverse(). How it goes in reverse
        //is irrelevant here.
        reverse()

        //And so on...
        drive()
        park()
        shutDown()
    }
}

Notice how the takeForDrive function calls all five of our behaviors on the supplied Vehicle object. It doesn’t even know what kind of vehicle it is driving. The Vehicle could be a car, truck, airplane, boat, etc. None of that matters to the takeForDrive function. The details are hidden behind the Vehicle interface (in other words, abstracted).

One of the reasons abstraction is so important is that it promotes code reusability and maintainability. Now that we have this takeForDrive function, we can pass it any object that implements Vehicle. For example, we can create a Truck class that implements Vehicle.

class Truck : Vehicle {
    override fun park() = println("Truck is parking")

    override fun drive() = println("Truck is driving")

    override fun reverse() = println("Truck is in reverse")

    override fun start() = println("Truck is starting")

    override fun shutDown() = println("Truck is shutting down")
}

Now we can take the Truck for a drive.

val truck = Truck()
takeForDrive(truck)

The price of gas may spike later on, and we may choose to drive something that is more efficient. As long as our new mode of transportation implements the Vehicle interface, we can take it for a drive. Here is a Car class that implements Vehicle.

class Car : Vehicle{
    override fun park() = println("Car is parking")

    override fun drive() = println("Car is driving")

    override fun reverse() = println("Car is in reverse")

    override fun start() = println("Car is starting")

    override fun shutDown() = println("Car is shutting down")
}

Just like with truck, we can drive the car.

val car = Car()
takeForDrive(car)

Since Vehicle provides an abstraction point, any code that accepts Vehicle as a parameter can use Truck or Car. The function takeForDrive can be said to be loosely coupled to Truck and Car because it indirectly accepts Trucks or Cars using the Vehicle interface. This makes the takeForDrive function highly reusable to other components that may need to get developed in the future.

Example Program

Here is a fully working Kotlin program that ties everything together.

package ch1

//This defines our public interface for all vehicles
interface Vehicle {
    fun park()

    fun drive()

    fun reverse()

    fun start()

    fun shutDown()
}

//Our Truck class provides an implementation of Vehicle
class Truck : Vehicle {
    override fun park() = println("Truck is parking")

    override fun drive() = println("Truck is driving")

    override fun reverse() = println("Truck is in reverse")

    override fun start() = println("Truck is starting")

    override fun shutDown() = println("Truck is shutting down")
}

//Car provides an alternative implementation of Vehicle
class Car : Vehicle{
    override fun park() = println("Car is parking")

    override fun drive() = println("Car is driving")

    override fun reverse() = println("Car is in reverse")

    override fun start() = println("Car is starting")

    override fun shutDown() = println("Car is shutting down")
}

/**
 * This function demonstrates Abstraction. Notice how it accepts a Vehicle object but
 * makes no distinction if it's a Truck or a Car. The details of how the vehicle parks,
 * drives, reverses, starts, or shuts down are abstracted from this function. In the end, we are
 * only concerned with what the Vehicle object does, not how it does it.
 */
fun takeForDrive(v : Vehicle){
    with(v){
        //How we start is abstracted. We only care that the vehicle starts, but
        //we don't care about how it starts.
        start()

        //Likewise, we only care that it goes in reverse(). How it goes in reverse
        //is irrelevant here.
        reverse()

        //And so on...
        drive()
        park()
        shutDown()
    }
}

fun main(args : Array<String>){
    //Create a new Truck and take it for a drive. It works because Truck
    //implements the Vehicle Interface which abstracts the truck's details from
    //the takeForDrive function
    takeForDrive(Truck())

    //Likewise, we can also take a car for a drive. The car class also implements
    //Vehicle so takeForDrive can also use cars.
    takeForDrive(Car())
}

When run, the program prints

Truck is starting
Truck is in reverse
Truck is driving
Truck is parking
Truck is shutting down
Car is starting
Car is in reverse
Car is driving
Car is parking
Car is shutting down

Kotlin and OOP

Like many JVM languages such as Java, Scala, Groovy, etc, Kotlin supports OOP (Object-Oriented Programming). OOP allows developers to create reusable and self-contained software modules known as classes, where data and behavior are grouped together and contained within the class. Such packaging allows developers to think in terms of components when solving a software problem and can improve code reuse and maintainability.

Four terms usually come up when explaining OOP. The first term is encapsulation. Encapsulation refers to combining a program’s data with the behaviors that operate on that data. This is different from procedural programming, which treats data and behavior as two separate concerns. However, encapsulation goes further than simply grouping behavior and data. It also means that we protect our data inside of the class by only allowing the class itself to use the data. Other users of the class may only work on class data through the class’s public interface.

This takes us to the next concept of OOP, abstraction. A well designed and encapsulated class functions as a black box. We may use the class, but only through its public interface. The details of how the class works internally are taken away, or abstracted, from the clients of the class. A car is commonly used as an example of abstraction. We can drive the car using the steering wheel and the foot pedals, but we do not get into the internals of the car and fire the fuel injection at the right time. The car takes care of the details of making it move. We only operate it through its public interface. The details of how a car works are abstracted from us.

OOP promotes code reuse through inheritance. The basic idea is that we can use one class as a template for a more specialized version of a class. For example, we may have a class that represents a Truck. As time went on, we realized that we needed a four wheel drive truck. Rather than writing an entirely new class, we simply create a four wheel drive truck from the truck class. The four wheel drive truck inherits all of the computer code from the truck class, and the developer only needs to focus on code that makes it a four wheel drive truck. Such code reuse not only saves on typing, but it also helps to reduce debugging since developers are free to leverage already tested computer code.

Related to inheritance is polymorphism. Polymorphism is a word that means many forms. For developers, this means that one object may act as if it were another object. Take the truck example above. Since a four wheel drive truck inherits from truck, the four wheel drive truck may be used wherever the code expects a truck. Polymorphism goes a step further by allowing the program to behave differently depending on the context in which certain portions of code are used.

Kotlin is a full-fledged OOP language (although it supports other programming styles as well). The language brings all of the OOP concepts discussed above to the forefront by allowing us to write classes, abstract their interfaces, extend classes, and even use them in different situations depending on context. Let’s begin by looking at a very basic example of how to write and create a class in Kotlin.

package ch1

class Circle(
        //Define data that gets associated with the class
        private val xPos : Int = 20,
        private val yPos : Int = 20,
        private val radius : Int = 10){

    //Define behavior that uses the data
    override fun toString() : String =
            "center = ($xPos, $yPos) and radius = $radius"
}

fun main(args: Array<String>){
    val c = Circle() //Create a new circle
    val d = Circle(10, 10, 20)
    
    println( c.toString() ) //Call the toString() function on c
    println( d.toString() ) //Call the toString() function on d
}

In the above program, we have a very basic example of a Kotlin class called Circle. The code on lines 3-12 tells the Kotlin compiler how to construct objects of type Circle. The circle has three properties (data): xPos, yPos, and radius. It also has a function that uses the data: toString().

In the bottom half of the program, the main function creates two new Circle objects (c and d). The circle c has the default values of 20, 20, and 10 for xPos, yPos, and radius because we called the constructor with empty parentheses (). Lines 5-7 in the Circle class tell the program to use 20, 20, and 10 as default values in this case. Circle d has different values for xPos, yPos, and radius because we supplied 10, 10, 20 to the constructor. This resembles constructor overloading, a simple form of polymorphism, because the constructor call behaves differently depending on the arguments supplied. Strictly speaking, the class declares a single constructor whose parameters have default values.

When we print on lines 18 and 19, we get two different outputs. When we call c.toString(), we get the String “center = (20, 20) and radius = 10” printed to the console. Calling toString() on d results in “center = (10, 10) and radius = 20”. This works because c and d are distinct objects in memory and each has its own values for xPos, yPos, and radius. The toString() function acts on each distinct object, and thus, the output of toString() reflects the state of each Circle object.

Python Split and Join file

The book Programming Python: Powerful Object-Oriented Programming has an example program that shows how to split and join files. Many utilities exist for such an operation, but the program offers a good working example of how to read from and write to binary files in Python 3. The code below is an adaptation from the book with my own comments added.

Code

import os


def split(source, dest_folder, write_size):
    # Make a destination folder if it doesn't exist yet
    if not os.path.exists(dest_folder):
        os.mkdir(dest_folder)
    else:
        # Otherwise clean out all files in the destination folder
        for file in os.listdir(dest_folder):
            os.remove(os.path.join(dest_folder, file))

    partnum = 0

    # Open the source file in binary mode
    input_file = open(source, 'rb')

    while True:
        # Read a portion of the input file
        chunk = input_file.read(write_size)

        # End the loop if we have hit EOF
        if not chunk:
            break

        # Increment partnum
        partnum += 1

        # Create a new file name
        filename = os.path.join(dest_folder, ('part%04d' % partnum))

        # Create a destination file
        dest_file = open(filename, 'wb')

        # Write to this portion of the destination file
        dest_file.write(chunk)

        # Explicitly close 
        dest_file.close()
    
    # Explicitly close
    input_file.close()
    
    # Return the number of files created by the split
    return partnum


def join(source_dir, dest_file, read_size):
    # Create a new destination file
    output_file = open(dest_file, 'wb')
    
    # Get a list of the file parts
    parts = os.listdir(source_dir)
    
    # Sort them by name (remember that the order num is part of the file name)
    parts.sort()

    # Go through each portion one by one
    for file in parts:
        
        # Assemble the full path to the file
        path = os.path.join(source_dir, file)
        
        # Open the part
        input_file = open(path, 'rb')
        
        while True:
            # Read all bytes of the part
            bytes = input_file.read(read_size)
            
            # Break out of loop if we are at end of file
            if not bytes:
                break
                
            # Write the bytes to the output file
            output_file.write(bytes)
            
        # Close the input file
        input_file.close()
        
    # Close the output file
    output_file.close()

Explanation

split

The code snippet shows two sample functions: one splits a file into parts and the other joins those parts back together into one file. The split function takes three parameters. The first parameter, source, is the file that we wish to split. The second parameter, dest_folder, is a folder that stores the output files created by the split operation. The final parameter, write_size, is the size of the file parts in bytes.

Split starts by checking if dest_folder exists or not. If the folder does not exist, we call os.mkdir to create a new folder on the file system. Otherwise, we obtain a list of all files in the folder by calling os.listdir and then remove all of them by calling os.remove. When calling os.remove, we use os.path.join to create a full path to the target file that’s getting deleted.

Once the destination folder has been prepared, the function continues by performing the actual split operation. A partnum variable is created on line 13 that tracks the number of file parts created by the split operation. The source file is opened on line 16 in binary mode. Binary mode is used in this case because we could be dealing with audio or video files and not just text files.

The split function enters an infinite loop on line 18. On line 20, we read a number of bytes, specified by write_size, from the source file and store them in the chunk variable. On line 23, we test if chunk actually received any bytes from the read operation. If chunk is empty, then we have hit end of file (EOF) and we break out of the loop. Otherwise, we increment partnum by one and begin to write the file part.

Line 30 creates the name and destination for the file part by using os.path.join, the dest_folder, and a format string that accepts the current part number. The destination file is created on line 33 with a call to open (also in binary mode) and then on line 36, we write chunk to the file. Line 39 explicitly closes the file. While we could leave it to garbage collection to close files, this function opens a lot of files, so we should close them ourselves in order to make sure we don’t exceed the number of file handles the underlying OS allows. The function ends by closing the input_file and returning the number of part files created.

join

The join function does the reverse job of the split function. It begins by accepting a source_dir, a destination file, and the size of the part files. The output_file is created on line 50 (opened in binary mode) and then on line 53, we use os.listdir to get a list of all parts.

Since our part files contain a number that identifies the parts, we can store all parts in a list and call sort() on it. Then it’s just a matter of looping through all of the parts and assembling them into a single file. The for loop starts on line 59. On line 62, we use os.path.join to create a full path to the part file and then we can open the part file on line 65.
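
Sorting by name only reproduces the original order when the part numbers are zero-padded to a fixed width, because sort() compares strings character by character. Here is a small illustration, with part numbers chosen arbitrarily:

```python
# Part numbers chosen only for illustration
numbers = [1, 2, 10, 11]

# Zero-padded names, in the style 'part0001'
padded = ['part%04d' % n for n in numbers]

# The same names without padding
unpadded = ['part%d' % n for n in numbers]

print(sorted(padded))    # ['part0001', 'part0002', 'part0010', 'part0011']
print(sorted(unpadded))  # ['part1', 'part10', 'part11', 'part2']
```

With four digits of padding, the name-based ordering holds for up to 9,999 parts.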

The program enters an inner while loop on line 67. Inside of the while loop, we read up to read_size bytes from input_file and store them in the bytes variable. If bytes is empty, we have hit end of file, so we test for this on line 72 and use break to end the while loop. Otherwise, we write the bytes to the output file on line 76.

When we have finished reading our part file, we again close it explicitly on line 79. When all parts have been read, we close the output_file. The output_file now contains the bytes of the original file that was split in the first place.

Thoughts

The code contained in this post isn’t ideal for production but is instead meant to be a learning tool. In this code, we cover reading and writing binary files and functions of the os module. There are areas where we could improve this code. For example, split destroys the contents of the destination folder; ideally, it should instead raise an exception and let the caller decide whether to delete the files in the folder.

We also don’t test if our input files are really files and if our folders are really folders. That is certainly an area for improvement. Another thing that could be improved upon is using an enumeration for the size of the file parts. Right now, write_size in split and read_size in join are specified in bytes, but that isn’t clear to clients of these functions.

References

Lutz, Mark. Programming Python. Beijing, O’Reilly, 2013.

Find Python Source Files in Home Directory

Truthfully, most users aren’t very interested in finding the largest and smallest Python source files in their home directory, but doing so does provide for an exercise in walking the file tree and using tools from the os module. The program in this post is a modified example taken from Programming Python: Powerful Object-Oriented Programming where the user’s home directory is scanned for all Python source files. The console outputs the two smallest files (in bytes) and the two largest files.

Code

import os
import pprint
from pathlib import Path

trace = False

# Get the user's home directory in a platform neutral fashion
dirname = str(Path.home())

# Store the results of all python files found
# in home directory
allsizes = []

# Walk the file tree
for (current_folder, sub_folders, files) in os.walk(dirname):
    if trace:
        print(current_folder)

    # Loop through all files in current_folder
    for filename in files:

        # Test if it's a python source file
        if filename.endswith('.py'):
            if trace:
                print('...', filename)

            # Assemble the full file path using os.path.join
            fullname = os.path.join(current_folder, filename)

            # Get the size of the file on disk
            fullsize = os.path.getsize(fullname)

            # Store the result
            allsizes.append((fullsize, fullname))

# Sort the files by size
allsizes.sort()

# Print the 2 smallest files
pprint.pprint(allsizes[:2])

# Print the 2 largest files
pprint.pprint(allsizes[-2:])

Sample Output

[(0,
  '/Users/stonesoup/.local/share/heroku/client/node_modules/node-gyp/gyp/pylib/gyp/generator/__init__.py'),
 (0,
  '/Users/stonesoup/.p2/pool/plugins/org.python.pydev.jython_5.4.0.201611281236/Lib/email/mime/__init__.py')]
[(219552,
  '/Users/stonesoup/.p2/pool/plugins/org.python.pydev.jython_5.4.0.201611281236/Lib/decimal.py'),
 (349239,
  '/Users/stonesoup/Library/Caches/PyCharmCE2017.1/python_stubs/348993582/numpy/random/mtrand.py')]

Explanation

The program starts with a trace flag that’s set to False. When set to True, the program prints detailed information about what is happening as it runs. On line 8, we grab the user’s home directory using Path.home(). This is a platform neutral way of finding a user’s home directory. Notice that we convert this value to a str for our purposes. Finally, we create an empty allsizes list that holds our results.

Starting on line 15, we use the os.walk function and pass in the user’s home directory. It’s a common pattern to combine os.walk with a for loop so that we can traverse an entire directory tree. On each iteration, os.walk yields a tuple that contains the current_folder, its sub_folders, and the files in the current folder. We are interested in the files.
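
To make the shape of those tuples concrete, here is a small sketch that builds a throwaway directory tree and walks it. The helper name find_python_files and the file names are made up for the example.

```python
import os
import tempfile


def find_python_files(root):
    """Return the paths of all .py files under root, relative to root."""
    found = []

    # os.walk yields one (folder, sub-folders, files) tuple per directory
    for current_folder, sub_folders, files in os.walk(root):
        for filename in files:
            if filename.endswith('.py'):
                fullname = os.path.join(current_folder, filename)
                found.append(os.path.relpath(fullname, root))

    return sorted(found)


if __name__ == '__main__':
    # Build a tiny tree: a.py, pkg/b.py, and pkg/sub/c.txt
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, 'pkg', 'sub'))
        for name in ('a.py',
                     os.path.join('pkg', 'b.py'),
                     os.path.join('pkg', 'sub', 'c.txt')):
            open(os.path.join(root, name), 'w').close()

        # Only the two .py files are reported
        print(find_python_files(root))
```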

Starting on line 20, the program enters a nested for loop that examines each file individually. On line 23, we test if the file name ends with ‘.py’ to see if it’s a Python source file. Should the test return True, we continue by using os.path.join to assemble the full path to the file. The os.path.join function takes into account the underlying operating system’s path separator, so on Unix-like systems, we get / while Windows systems get \ as a path separator. The file’s size is computed on line 31 using os.path.getsize. Once we have the size and the file path, we can add the result to allsizes for later use.

The program has finished scanning the user’s home folder once the program reaches line 37. At this point, we can sort our results from smallest to largest by using the sort() method on allsizes. Line 40 prints the two smallest files (using pretty print for better formatting) and line 43 prints the two largest files.
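
The sort works without a key function because Python compares tuples element by element: size first, then path as a tie-breaker. The sketch below uses invented sizes and file names. It also shows heapq, which the original program does not use, as one possible alternative when only a few extremes are needed.

```python
import heapq

# Invented (size, path) pairs standing in for the real scan results
allsizes = [(300, 'c.py'), (0, 'b.py'), (1200, 'd.py'), (0, 'a.py')]

# Tuples sort by size, then by path when sizes are equal
ordered = sorted(allsizes)
smallest_two = ordered[:2]
largest_two = ordered[-2:]

print(smallest_two)  # [(0, 'a.py'), (0, 'b.py')]
print(largest_two)   # [(300, 'c.py'), (1200, 'd.py')]

# heapq avoids sorting the whole list; nlargest returns the biggest first
print(heapq.nsmallest(2, allsizes))  # [(0, 'a.py'), (0, 'b.py')]
print(heapq.nlargest(2, allsizes))   # [(1200, 'd.py'), (300, 'c.py')]
```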

References

Lutz, Mark. Programming Python. Beijing, O’Reilly, 2013.

Python Multiprocessing Producer Consumer Pattern

Python 3 has a multiprocessing module that provides an API similar to the one found in the threading module. The main selling point of multiprocessing over threading is that multiprocessing allows tasks to run in parallel across multiple CPU cores, while threading is still limited by the global interpreter lock (GIL). The Process class found in multiprocessing works by spawning new processes, and the module provides classes that allow for data sharing between processes.

Since multiprocessing uses processes rather than threads, child processes do not share their memory with the parent process. That means the processes have to communicate through low-level mechanisms such as pipes. Fortunately, the multiprocessing module provides high-level classes, similar to the ones found in threading, that handle sharing data between processes. This example demonstrates the producer consumer pattern using processes and the Queue class to share data.

Code

import time
import os
import random
from multiprocessing import Process, Queue, Lock


# Producer function that places data on the Queue
def producer(queue, lock, names):
    # Synchronize access to the console
    with lock:
        print('Starting producer => {}'.format(os.getpid()))
        
    # Place our names on the Queue
    for name in names:
        time.sleep(random.randint(0, 10))
        queue.put(name)

    # Synchronize access to the console
    with lock:
        print('Producer {} exiting...'.format(os.getpid()))


# The consumer function takes data off of the Queue
def consumer(queue, lock):
    # Synchronize access to the console
    with lock:
        print('Starting consumer => {}'.format(os.getpid()))
    
    # Run indefinitely
    while True:
        time.sleep(random.randint(0, 10))
        
        # If the queue is empty, queue.get() will block until the queue has data
        name = queue.get()

        # Synchronize access to the console
        with lock:
            print('{} got {}'.format(os.getpid(), name))


if __name__ == '__main__':
    
    # Some lists with our favorite characters
    names = [['Master Shake', 'Meatwad', 'Frylock', 'Carl'],
             ['Early', 'Rusty', 'Sheriff', 'Granny', 'Lil'],
             ['Rick', 'Morty', 'Jerry', 'Summer', 'Beth']]

    # Create the Queue object
    queue = Queue()
    
    # Create a lock object to synchronize resource access
    lock = Lock()

    producers = []
    consumers = []

    for n in names:
        # Create our producer processes by passing the producer function and its arguments
        producers.append(Process(target=producer, args=(queue, lock, n)))

    # Create consumer processes
    for i in range(len(names) * 2):
        p = Process(target=consumer, args=(queue, lock))
        
        # This is critical! The consumer function has an infinite loop,
        # which means it never exits on its own, so we set daemon to True
        p.daemon = True
        consumers.append(p)

    # Start the producers and consumers
    # The Python VM will launch new independent processes for each Process object
    for p in producers:
        p.start()

    for c in consumers:
        c.start()

    # Like threading, we have a join() method that synchronizes our program
    for p in producers:
        p.join()

    print('Parent process exiting...')

Explanation

The program demonstrates the producer and consumer pattern. We have two functions that run in their own independent processes. The producer function places supplied names on the Queue. The consumer function monitors the Queue and removes names from it as they become available.

The producer function takes three objects: a Queue, a Lock, and a list of names. It starts by acquiring the lock that guards the console. The console is still a shared resource, so we need to make sure only one process writes to it at a time or their output will be interleaved. After acquiring the lock, the function prints out its process ID (PID).

The producer function enters a for-each loop on lines 14-16. It sleeps between 0 and 10 seconds on line 15 to simulate a delay in processing and then places a name on the Queue on line 16. When the loop is complete, the function acquires the console lock again and notifies the user that it is exiting. At this point, the process ends.

The consumer function runs in its own process as well. It takes the Queue and the Lock as its parameters and then acquires the console lock to notify the user it is starting, printing its PID as well. Next, the consumer enters an infinite loop on lines 30-38. It simulates a processing delay by sleeping on line 31 and then calls queue.get() on line 34. If the queue has data, get() returns it immediately and the consumer prints the data on line 38. Otherwise, get() blocks until data is available.
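
The blocking behavior of get() can also be bounded. As a small sketch that is not part of the program above, get() accepts a timeout argument and raises queue.Empty (from the standard queue module) if nothing arrives in time. The helper name get_or_none is an assumption made for this illustration.

```python
import queue  # only needed for the queue.Empty exception
from multiprocessing import Queue


def get_or_none(q, timeout):
    """Return the next item from q, or None if nothing arrives within timeout seconds."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None


if __name__ == '__main__':
    q = Queue()
    print(get_or_none(q, 0.5))  # nothing was put on the queue, so this gives up
    q.put('Meatwad')
    print(get_or_none(q, 5))    # the item arrives well within the timeout
```

Returning None only works when None is never a real item on the queue. Otherwise it is probably cleaner to let queue.Empty propagate to the caller.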

Line 41 is the entry point to the program, using the if __name__ == '__main__' test. We begin on line 44 by making a list of names. The Queue object is created on line 49 and the Lock object is made on line 52. Then on lines 57-59, we enter a for-each loop and create our producer Process objects. We use the target parameter to point the Process at the producer function and then pass in a tuple of the arguments that the function is called with.

Creating the consumer processes takes one extra step that isn’t needed when creating the producers. Lines 62-68 create the consumer processes, and on line 67 we set the daemon property to True. This is needed because the consumer function uses an infinite loop, so those processes never exit on their own. Marking them as daemon processes means they are terminated automatically when the parent process exits.

Once our processes are created, we start them by calling start() on each Process object (lines 72-76). Like threads, processes also have a join() method that can be used to synchronize a program. Our consumer processes never return, so calling join() on them would cause the program to hang, but our producer processes do return, so we use join() on line 80 to make the parent process wait for the producer processes to exit. Keep in mind that the daemon consumers are terminated as soon as the parent exits, so any names still waiting on the Queue at that point are never printed.
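
One common alternative to daemon consumers is a sentinel value that tells each consumer to stop, which makes it safe to join() the consumers too. The following is a rough sketch of that variation, not the program above. The SENTINEL name, the second results Queue, and the upper-casing "work" are all assumptions made for the illustration, and the sleeps and console lock are left out to keep it short.

```python
from multiprocessing import Process, Queue

# Hypothetical marker meaning "no more work"; None works as long as
# None is never a real item on the queue
SENTINEL = None


def producer(work, names):
    # Place every name on the work queue, then return
    for name in names:
        work.put(name)


def consumer(work, results):
    # Loop until the sentinel arrives instead of looping forever
    while True:
        name = work.get()
        if name is SENTINEL:
            break
        results.put(name.upper())


if __name__ == '__main__':
    names = [['Master Shake', 'Meatwad'], ['Early', 'Rusty'], ['Rick', 'Morty']]
    work, results = Queue(), Queue()

    producers = [Process(target=producer, args=(work, n)) for n in names]
    consumers = [Process(target=consumer, args=(work, results)) for _ in range(2)]
    for p in producers + consumers:
        p.start()

    # Wait for every producer, then queue one sentinel per consumer
    for p in producers:
        p.join()
    for _ in consumers:
        work.put(SENTINEL)

    # Drain the results before joining the consumers
    total = sum(len(n) for n in names)
    processed = [results.get() for _ in range(total)]
    for c in consumers:
        c.join()

    print(sorted(processed))
```

Because the sentinels are queued only after every producer has been joined, they sit behind all of the real names, so every name is processed before the consumers shut down. The results are read before the consumers are joined because a process that has put items on a Queue may not exit until those items have been flushed.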

Resources

Lutz, Mark. Programming Python. Beijing, O’Reilly, 2013.

Programming Python: Powerful Object-Oriented Programming