Kotlin String Formatting

String formatting allows developers to define a common string template and then interchange values as needed.

String Templates

Kotlin has a feature known as String Templates that allow you specify a string and then reference variables as required.

var good = "good"
var great = "great"

var templateGood = "Bob is a $good chef"
var templateGreat = "Bob is a $great chef"

In the above example, the value of the variable good is inserted at $good in the templateGood string. Likewise, the value of the variable great is inserted at $great in the templateGreat string.

We are also free to do evaluations.

var plus = "Eagerly awaiting ${1 + 2}"

The 1 + 2 will add to 3 and the result will be “Eagerly awaiting 3”

String.format()

The String class has a format method that takes in a format string and then any number of arguments. The number of arguments must match the same number of format specifiers in the string or an exception will get raised.

To begin, the format string takes the form of “[flags][width][.precision]type” so for example.

val simple = "%d" //very basic
val medium = "Have a nice %s" //More complex
val advanced = "%-2s\t%s"

The first example, simple, only has the type, %d. The same is true with the second String, which only has %s. The final example, has a left-justified flag “-“, followed by the width of two characters (2), and its type, String.

Type Specifiers

Here is a list of the type specifiers and their meanings.

%b Boolean
%c Character
%d Signed Integer
%e Float in Scientific Notation
%f Float in Decimal Format
%g Float in Decimal or Scientific Notation, depending on value
%h Hashcode of the supplied argument
%n Line separator
%o Octal Integer
%s String
%t Date or Time
%x Hexadecimal Integer

Here is an example of how to use the String.format() method.

var formatTemplate = "%-2s\t%s"
println(formatTemplate.format("%b", "Boolean") //prints %b	Boolean

Putting it Together

Here is an example program that shows off String templates.

fun main(args : Array<String>){
    var good = "good"
    var great = "great"

    println("Using Kotlin String templates")
    var templateGood = "Bob is a $good chef"
    var templateGreat = "Bob is a $great chef"

    println(templateGood)
    println(templateGreat)

    var formatTemplate = "%-2s\t%s"
    var func = {pair : Pair<String, String> -> println(formatTemplate.format(pair.first, pair.second))}

    var table = arrayOf(
            "%b" to "Boolean",
            "%c" to "Character",
            "%d" to "Signed Integer",
            "%e" to "Float in scientific format",
            "%f" to "Float in decimal format",
            "%g" to "Float in either decimal or scientific notation based on value",
            "%h" to "Hashcode of argument",
            "%n" to "Line separator",
            "%o" to "Octal Integer",
            "%s" to "String",
            "%t" to "Date or Time",
            "%x" to "Hexadecimal Integer")

    println("\n%[flags][width][.precision]type")

    println("\nFormatting Symbols")
    table.forEach(func)
}

Output

Using Kotlin String templates
Bob is a good chef
Bob is a great chef

%[flags][width][.precision]type

Formatting Symbols
%b	Boolean
%c	Character
%d	Signed Integer
%e	Float in scientific format
%f	Float in decimal format
%g	Float in either decimal or scientific notation based on value
%h	Hashcode of argument
%n	Line separator
%o	Octal Integer
%s	String
%t	Date or Time
%x	Hexadecimal Integer
Advertisements

Kotlin String Splitting

Most programming tasks require string splitting. For example, CSV files often separate data based on the comma character, which requires developers to split each line based on the comma in order to extract data. Extracting domain names from a web address is another common use case for String splitting. For example, we might have the address https://stonesoupprogramming.com and we wish to separate the https:// portion of the string. We can split the string into a list where the first part contains http:// and the second index contains stonesoupprogramming.com.

In Kotlin, we use the split() method defined in the String class. It comes in two flavors. One flavor takes the character to split the string on, and the other flavor takes a Regex. Both versions of the split method return a list that contains all potions of the String.

Non-Regex Splitting

The first version of split() takes a varargs parameter of delimiters, an optional boolean argument to ignoreCase and an optional limit argument that restricts how many times the split happens.

val str = "I smell fear on you"
val parts = str.split(" ")
val partsTwo = str.split("I", "fear", "you")
val partsThree = str.split("I", true)
val partsFour = str.split(delimiters = " ", limit = 2)

All versions of split return a list. It’s worth keeping in mind that the returned list will not contain any of the delimiters passed to the delimiters argument in split(). Normally, that isn’t a problem. For example, would you really want the ‘,’ character for all fields in a CSV file?

Regex Version

Most programming languages treat regular expressions, REGEX, as a String. Doing so often leads to unexpected bugs. Consider Java’s String.split() method.

String myString = "Green. Eggs. Ham.";
String [] parts = myString.split(".");

You may think that parts holds {“Green”, “Eggs”, “Ham”}. It doesn’t. The period character is treated as a regex expression that matches to any character. It’s a very common mistake.

Thankfully, Kotlin treats regular expressions as its own type. When we want to use a Regex in Kotlin, we need to create a Regex object. The Kotlin String class has a toRegex() function that performs the conversion from String to Regex.

val str = "Green. Eggs. Ham"
val partsNonRegex = str.split(".") //No Regex. This will split on the period character
val partsRegex = str.split(".".toRegex()) //Now using REGEX matching

Putting it together

As always, we will conclude with an example program that demonstrates the topic. Many of my students are given assignments where they need to track the number of unique words in a String. We will use String splitting and maps to accomplish the goal.

fun main(args : Array<String>){
    val paragraph = """
        |I am Sam.
        |Sam I am.

        |That Sam-I-am!
        |That Sam-I-am!
        |I do not like
        |That Sam-I-am!

        |Do you like
        |Green eggs and ham?

        |I do not like them,
        |Sam-I-Am
        |I do not like
        |Green eggs and ham.
        """.trimMargin()

    //Remove all end line characters and then split the string on the space character
    val parts = paragraph.replace('\n', ' ').split(" ")
    
    //Create an empty mutable map
    val uniqueWords = mutableMapOf<String, Int>()
    
    //Populate the map
    parts.forEach( { it -> uniqueWords[it] = uniqueWords.getOrDefault(it, 0) + 1 })
    
    //Print each word with it's count value
    println(uniqueWords)
}

Here is the output when run.

{I=5, am=1, Sam.=1, Sam=1, am.=1, =3, That=3, Sam-I-am!=3, do=3, not=3, like=4, Do=1, you=1, Green=2, eggs=2, and=2, ham?=1, them,=1, Sam-I-Am=1, ham.=1}

Kotlin String Parsing

Kotlin is a strongly typed language, which means that variables of one type may not be used for another type. In many cases, strong typing is a benefit. For example, since the language only allows Int variables to hold int values, we have some protection (not complete) against injection attacks that attempt to insert a string of code into an int variable.

On a more practical manner, strong typing helps us prevent bugs. The compiler can check ahead of time if a variable has certain attributes or methods prior to runtime. We are also protected against unsafe implicit conversions of data, which is a common problem in weakly typed languages. However, strong typing has a drawback. We are forced to convert between data types.

In languages such as Java, this can be really troubling. Consider the following Java code snippet.

String three = "3";
int t = Integer.parseInt(three);

String four = String.valueOf(4);

Java’s approach to converting between Strings and Ints is clunky, to say the least. Since Java primitives are not objects, wrapper classes have to be used to facilitate conversion between types when casting won’t work. Using wrapper classes add verbosity to the code and mixes a procedural approach in with what could be an OOP approach.

Oddly, this is true of Java’s String class also. Strings are objects in Java, yet you will not find methods such as toInt() or toDouble() on the String class. Once again, you are forced to turn to a primitive type wrapper class methods such as Integer.parseInt(). Using such an approach feels unnatural, to say the least.

Kotlin Conversions

Kotlin treats all variables as objects. Furthermore, variables have conversion methods. So if when we want to convert a String to an Int in Kotlin, we call toInt() on the String. When we want to convert an Int to a String, we call toString(). The approach is much more OOP in design since we are calling the behavior on the object that we are using.

Here are a few code examples to demonstrate type conversions in Kotlin.

//convert "3" to 3
val three = "3".toInt()

//convert 3 to "3"
val threeStr = three.toString()

//convert "3.14159" to a double
val pi = "3.14159".toDouble()

It should be noted that while type conversions in Kotlin may be more simple than Java, they aren’t safer. If we attempt to call toInt() on a string that isn’t a number, we will get an exception.

val three = "Mr. Pickles".toInt() //Bad!!!

The runtime isn’t able to figure out what number is represented by the string “Mr. Pickles” as such, we will get a runtime exception. If we aren’t sure if a conversion is safe, we will need to wrap the cast in a try-catch block or turn to a third party library such as StringUtils in Apache Commons.

Putting it Together

Below is an example program that demonstrates String parsing and type conversions.

fun main(args : Array<String>){
    val three = 3
    val pi = 3.14159
    val t = true

    //Note: We are using toString() explicitly here for demonstration
    //purposes.
    println("Converting 3 to String => " + three.toString())
    println("Converting pi to String => " + pi.toString())
    println("Converting t to String => " + t.toString())

    val threeNum = "3".toInt()
    val piNum = "3.14159".toDouble()
    val tBool = "true".toBoolean()

    println("threeNum => " + threeNum)
    println("piNum => " + piNum)
    println("tBool => " + tBool)
}

Output

Converting 3 to String => 3
Converting pi to String => 3.14159
Converting t to String => true
threeNum => 3
piNum => 3.14159
tBool => true

Kotlin String Methods

The Kotlin String class has an indexOf() method that allows developers to the position of a character or set of characters within a string. Such an operation is especially useful in situations where you may need to break a string into a substring or divide a string into different parts. Let’s go over the indexOf() method with a few examples.

indexOf

indexOf(Char)

The indexOf(Char) method is used to search for a single character within a string. It will return the index where the character was first found or -1 if the string doesn’t contain such a character.

val paragraph = 
    "I am Sam.\n" +
    "Sam I am.\n" +

    "That Sam-I-am!\n" +
    "That Sam-I-am!\n" +
    "I do not like\n" +
    "That Sam-I-am!\n" +

    "Do you like\n" +
    "Green eggs and ham?\n" +

    "I do not like them,\n" +
    "Sam-I-Am\n" +
    "I do not like\n" +
    "Green eggs and ham.\n"

//Index of letter a => 2
println("Index of letter a => " + paragraph.indexOf('a'))

The letter ‘a’ is the 3rd character in the example string. Since computers count starting at 0, the result is 2. This method also has an optional argument that allows it to ignore case.

indexOf(String)

If we want to find where a certain substring begins, we use the indexOf(String) method. It works just like its Char counterpart.

//Index of 'Green eggs and ham' => 91
println("Index of 'Green eggs and ham' => " + paragraph.indexOf("Green eggs and ham"))

The substring “Green eggs and ham” is found at position 91. Once again, this method returns -1 if the search string isn’t found. We can also use the ignoreCase optional argument when we do not care about case sensitivity.

indexOf(Char, Int), indexOf(String, Int)

The indexOf method has an optional startIndex parameter that takes an int value. By default, startIndex is 0, which is why indexOf starts at the beginning of the string. If we want to start looking in the middle of the string, we need to pass a value to startIndex. Let’s look at an example of where we can find all occurrences of the letter ‘I’.

var fromIndex = 0
while(paragraph.indexOf('I', fromIndex) > -1){
    fromIndex = paragraph.indexOf("I", fromIndex)
    println("Found at => " + fromIndex)
    fromIndex++
}

Output

Found at => 0
Found at => 14
Found at => 29
Found at => 44
Found at => 50
Found at => 73
Found at => 111
Found at => 135
Found at => 140

The code tracks each index with a fromIndex variable. We enter into a while loop that continues until indexOf returns -1. With each iteration of the loop, we update fromIndex using indexOf() and pass in the old value of fromIndex. That causes the search to keep moving forward. After we print the index, we need to increment fromIndex by 1 because indexOf is inclusive. Should we fail to increment fromIndex, we will enter into an infinite loop because indexOf will continue to return the same value.

lastIndexOf

Strings also have a lastIndexOf() method that is a cousin to the indexOf() method. It takes the same arguments as indexOf(), but rather than returning the first occurence of the search character or string, it returns the last occurence instead.

//Last index of 'eggs' => 160
println("Last index of 'eggs' => " + paragraph.lastIndexOf("eggs"))

startsWith(), endsWith()

The startsWith() and endsWith() methods are convience methods that are used to check if a string starts or ends with a supplied prefixString. It also has an optional offset parameter that allows for searching in the middle of the string.

//paragraph starts with 'I am Sam' => true
println("paragraph starts with 'I am Sam' => " + paragraph.startsWith("I am Sam"))
//paragraph ends with 'Green eggs and ham. => true
println("paragraph ends with 'Green eggs and ham. => " + paragraph.endsWith("Green eggs and ham.\n"))

Putting it Together

Here is an example program followed by the output.

fun main(args : Array<String>){
    val paragraph =
        "I am Sam.\n" +
        "Sam I am.\n" +

        "That Sam-I-am!\n" +
        "That Sam-I-am!\n" +
        "I do not like\n" +
        "That Sam-I-am!\n" +

        "Do you like\n" +
        "Green eggs and ham?\n" +

        "I do not like them,\n" +
        "Sam-I-Am\n" +
        "I do not like\n" +
        "Green eggs and ham.\n"


    println("Index of letter a => " + paragraph.indexOf('a'))
    println("Index of 'Green eggs and ham' => " + paragraph.indexOf("Green eggs and ham"))
    println("Finding all occurrences of 'I'...")
    var fromIndex = 0
    while(paragraph.indexOf('I', fromIndex) > -1){
        fromIndex = paragraph.indexOf("I", fromIndex)
        println("Found at => " + fromIndex)
        fromIndex++
    }
    println("Last index of 'eggs' => " + paragraph.lastIndexOf("eggs"))
    println("paragraph starts with 'I am Sam' => " + paragraph.startsWith("I am Sam"))
    println("paragraph ends with 'Green eggs and ham. => " + paragraph.endsWith("Green eggs and ham.\n"))
}

Output

Index of letter a => 2
Index of 'Green eggs and ham' => 91
Finding all occurrences of 'I'...
Found at => 0
Found at => 14
Found at => 29
Found at => 44
Found at => 50
Found at => 73
Found at => 111
Found at => 135
Found at => 140
Last index of 'eggs' => 160
paragraph starts with 'I am Sam' => true
paragraph ends with 'Green eggs and ham. => true

Parse Command Line Arguments in Python

Most command line programs allow users to pass command line arguments into a program. Consider, for example, the unix ls:

ls -a

Python programs can utilize such techniques by using sys.argv and performing string operations on each element in the argv list. Here is an example script from Programming Python: Powerful Object-Oriented Programming with my own comments added to the script. (Of course, the Python standard library provides much more sophisticated getopt and optparse modules which should be used in production programs)

def getopts(argv):
    # Create a dictionary to store command line options
    opts = {}

    # Iterate through each command line argument
    while argv:
        # Look for the dash symbol
        if argv[0][0] == '-':

            # Create an entry in the opts dictionary
            opts[argv[0]] = argv[1]

            # Slice off the last entry and continue
            argv = argv[2:]
        else:

            # Otherwise just slice of the last entry since there
            # was no command line switch
            argv = argv[1:]

    # Return the dictionary of the command line options
    return opts

# Only run this code if this is a standalone script
if __name__ == '__main__':
    # Put the argv option into our namespace
    from sys import argv

    # call our getopts function
    myargs = getopts(argv)

    # Print our output
    if '-i' in myargs:
        print(myargs['-i'])

    print(myargs)

Detailed Explanation

Our goal in this script is to look for any command line switches that are passed to the script. To start with, we need to import the argv object from the sys module which is a list that provides the command line options passed to the script. On line 30, we call our getopts function and pass the argv object to the function. The program’s execution jumps up to line 3.

Line 3 creates an empty dictionary object that will store all command line switches with their arguments. For example, we may end up with ‘-n’:”Bob Belcher” in this dictionary. Starting on line 6, we enter into a while loop that examines each entry in argv.

Line 8 looks at the first character in the first entry of argv (remember that Python String implement the [] operator). We are just checking if the character in question is a dash (‘-‘) character. If this character is a dash, we are going to create an entry into our opts dictionary on line 11. Then we slice off the next 2 entries in argv from the beginning of the list (one for the switch, the other one for the argument).

Line 15 is the alternative case. If argv[0][0] is something other than a dash, we are going to ignore it. We just slice off the first entry in argv (only one this time since there is no switch). Once argv is empty, we can return the opts dictionary to the caller.

The program execution resumes on line 33. For demonstration purposes, it checks if the myargs dictionary has a ‘-i’ key. If it does, it will print the value associated with ‘-i’. Then line 36 prints out the myarg dictionary.

When run from the console, the script will show output such as this.

Patricks-MacBook-Pro:system stonesoup$ python testargv2.py -h "Bob Belcher" -w "Linda Belcher" -d1 "Tina Belcher" -s "Gene Belcher" -d2 "Louise Belcher" 
{'-h': 'Bob Belcher', '-w': 'Linda Belcher', '-d1': 'Tina Belcher', '-s': 'Gene Belcher', '-d2': 'Louise Belcher'}

Kotlin Koans—Part 6

Insert Values into String

Kotlin upgrades some of Java’s String capabilities. One of the first things I liked was the ability to insert variables into the String

val a = 1
val b = 2
str = "Here is a String with values a=$a and b=$b)

We could of course have done this with String.format in Java

int a = 1;
int b = 2;
String str = "Here is a String with values a=%d and b=%d".format(a, b);

I think most people agree that the Kotlin approach is more concise.

Multiline Strings

Kotlin supports the “”” for mutliline strings without any need to escape charaters.

str = """
Here is a multiline String
C:\folder\file.txt

"""

The Kotlin Koans tutorial suggested that this was also useful for Regex expressions.

Task

The task for this portion of the tutorial was simple enough. We have a variable named month that we insert into a regex expression.

val month = "(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)"

fun task5(): String{
    return """\d{2}\ $month \d{4}"""
}

This a use case for the triple quote Strings. In Java, we would have needed to escape all of the backslashes in the regex expression. I know that I am not the first developer who has gotten burned by typing \ when I should have typed \\, so the triple quote String is a nice feature.

You can click here to see Part 5.

Kotlin Koans—Part 3

The last tutorial’s challange was to take a collection and assemble it into a string using StringBuilder. Java 8 finally gave developers a way to join a String, but Kotlin seems to make it even easier.

This partion of the Kotlin Koans tutorial has us using collection::joinToString. Using Kotlin, we can assemble an entire collection into a String using just one line of code.

fun task2(collection: Collection): String {
    return collection.joinToString(", ", "{", "}")
}

This code is functionally equivalent to what we did in part 2. I also learned a little bit more about the language. Kotlin let’s us have default parameters in our methods. I have to say, while I appreciate Java’s method overloading capabilities, there are times where it’s simplier to use default parameters.

You can click here to see Part 2 or Part 4