Kotlin Data Streams

Data streams are used to read and write binary data. DataOutputStream writes primitive values to a binary stream, while DataInputStream reads the bytes back from the stream and converts them to primitive types. Here is an example program that writes data to a file and then reads it back into memory.

import java.io.DataInputStream
import java.io.DataOutputStream
import java.io.FileInputStream
import java.io.FileOutputStream

fun main(args : Array<String>){
    val burgers = "data.burgers"

    //Open the file in binary mode
    DataOutputStream(FileOutputStream(burgers)).use { dos ->
        with(dos){
            //Notice we have to write our data types
            writeInt("Bob is Great\n".length) //Record length of the array
            writeChars("Bob is Great\n") //Write the array
            writeBoolean(true) //Write a boolean

            writeInt("How many burgers can Bob cook?\n".length) //Record length of array
            writeBytes("How many burgers can Bob cook?\n") //Write the array
            writeInt(Int.MAX_VALUE) //Write an int

            for (i in 0..5){
                writeByte(i) //Write a byte
                writeDouble(i.toDouble()) //Write a double
                writeFloat(i.toFloat()) //Write a float
                writeInt(i) //Write an int
                writeLong(i.toLong()) //Write a long
            }
        }
    }

    //Open a binary file in read mode. It has to be read in the same order
    //in which it was written
    DataInputStream(FileInputStream(burgers)).use {dis ->
        with (dis){
            val bobSize = readInt() //Read back the size of the array
            for (i in 0 until bobSize){
                print(readChar()) //Print the array one character at a time
            }
            println(readBoolean()) //Read a boolean

            val burgerSize = readInt() //Length of the next array
            for (i in 0 until burgerSize){
                print(readByte().toInt().toChar()) //Print array one character at a time
            }
            println(readInt()) //Read an int

            for (i in 0..5){
                println(readByte()) //Read a byte
                println(readDouble()) //Read a double
                println(readFloat()) //Read a float
                println(readInt()) //Read an int
                println(readLong()) //Read a long
            }
        }

    }
}

The program creates a FileOutputStream object and passes the name of the file to its constructor. The FileOutputStream object is then passed to the constructor of DataOutputStream. We apply the use() function to ensure all resources are freed properly when we have finished. The file is now open for writing in binary mode.

When we wish to use the same object repeatedly, we can pass it to the with() function. In our case, we intend to keep using our DataOutputStream object, so on line 11, we pass it to the with() function. Inside of the with() function, all method calls will target the dos object because it was supplied to with().

Since we intend to write a string to the file, we need to record the length of the string so that the reading code knows how many characters to read back. We do this using the writeInt() function and passing the length of our string to it. Then we can use writeChars() to write a character array to the file. The String argument is converted to a character array and written to the file. Finally, we call writeBoolean() to write a true/false value to the file.

The next section is a repeat of the first. We intend to write another string to the file, but to do so, we need to record the length of the string. Once again, we turn to writeInt() to record an int value. On the next line, we use writeBytes() rather than writeChars() to demonstrate how we can write a String as a sequence of bytes rather than characters. The DataOutputStream class sees to the details of turning the String into bytes: it writes the low-order byte of each character and discards the high-order byte, so this is only safe for characters that fit in a single byte. Finally, we write another int value to the stream.
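To make the difference between writeChars() and writeBytes() concrete, here is a minimal sketch that writes to an in-memory ByteArrayOutputStream instead of a file and reports how many bytes each call produced. The encodedSize() helper is a hypothetical name introduced for illustration only; it is not part of the example program.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.DataOutputStream

//Hypothetical helper: runs the supplied write calls against an
//in-memory stream and returns the number of bytes they produced
fun encodedSize(block: DataOutputStream.() -> Unit): Int {
    val buffer = ByteArrayOutputStream()
    DataOutputStream(buffer).use { dos -> dos.block() }
    return buffer.size()
}

fun main() {
    println(encodedSize { writeChars("Bob") }) //prints 6: two bytes per character
    println(encodedSize { writeBytes("Bob") }) //prints 3: one byte per character
    println(encodedSize { writeInt(3) })       //prints 4
    println(encodedSize { writeDouble(3.0) })  //prints 8
}
```

The sizes explain why the reading code has to pair readChar() with writeChars() and readByte() with writeBytes(): mixing them would consume the wrong number of bytes.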

Next, we enter a for loop on line 21. Inside of the for loop, we demonstrate writing different primitive types to the file. We can use writeByte() for a byte, writeDouble() for a double, and so on for each primitive type. The DataOutputStream class knows the size of each primitive type and writes the correct number of bytes for each primitive.

When we are done writing the file, we open it again to read it. Line 33 creates a FileInputStream object that accepts the path to the file in its constructor. The FileInputStream object is chained to DataInputStream by passing it to the constructor of DataInputStream. We apply the use() function to ensure all resources are properly closed.

Reading the file requires the file to be read in the same order in which it is written. Our first order of business is to grab the size of the character array we wrote to the file earlier. We use readInt() on line 35 followed by a for loop that terminates at the size of the array on line 36. Each iteration of the for loop calls readChar() and the String is printed to the console. When we are finished, we read a boolean on line 39.
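The length-then-characters pattern can be wrapped in a pair of helpers. The following is only a sketch: writeLengthPrefixed(), readLengthPrefixed(), and roundTrip() are hypothetical names, and the data goes through an in-memory buffer rather than a file so the example is easy to run.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.DataOutputStream

//Hypothetical helper: writes the length first, then the characters
fun DataOutputStream.writeLengthPrefixed(text: String) {
    writeInt(text.length)
    writeChars(text)
}

//Hypothetical helper: reads the length, then exactly that many characters
fun DataInputStream.readLengthPrefixed(): String {
    val length = readInt()
    val builder = StringBuilder(length)
    repeat(length) { builder.append(readChar()) }
    return builder.toString()
}

//Writes the text to an in-memory buffer and reads it straight back
fun roundTrip(text: String): String {
    val buffer = ByteArrayOutputStream()
    DataOutputStream(buffer).use { it.writeLengthPrefixed(text) }
    return DataInputStream(ByteArrayInputStream(buffer.toByteArray())).use { it.readLengthPrefixed() }
}

fun main() {
    print(roundTrip("Bob is Great\n")) //prints Bob is Great
}
```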

Our next array was a byte array. Once again, we need its final size so we call readInt() on line 41. Lines 42-44 run through the array and call readByte() until the loop terminates. Each byte is converted to a character object using toChar(). On line 45, we read an int using readInt().

The final portion of the program repeats the for loop found earlier. In this case, we enter a for loop that terminates after six iterations, since the range 0..5 includes both 0 and 5 (line 47). Inside of the for loop, we call readByte(), readDouble(), readFloat(), and so on. Each call prints the restored variable to the console.

Kotlin Scanner

The Scanner class is a powerful class that looks for tokens in an input stream and returns each match. The class is often used on files, but it can also work with strings, network sockets, or just about any other character input stream object. The following program demonstrates using a Scanner object to search for words without punctuation. It reads a file and then outputs the words ordered from the most frequently used to the least frequently used.

import java.io.FileReader
import java.util.*

fun main(args : Array<String>){
    //Check if they supplied a file
    if(args.isEmpty()){
        println("Please provide a file")
        System.exit(-1)
    }

    //Create an empty map
    val wordMap = mutableMapOf<String, Int>()

    //Open the file and pass it to a Scanner object.
    Scanner(FileReader(args[0])).use { sc ->

        //Tell the scanner to only match entire words
        sc.useDelimiter("""\W""".toPattern())

        //Loop until we get to the end of the file
        while(sc.hasNext()){

            //Grab the next word
            val word = sc.next()

            //Test that it's not a blank string
            if(word.isNotBlank()){
                //Add it to the word map
                wordMap[word] = wordMap.getOrDefault(word, 0) + 1
            }
        }
    }

    //This prints the entries by most used words to least used words
    wordMap.entries.sortedByDescending { it.value }.forEach { println(it) }
}

The program starts by checking if the user provided command line arguments. If the args array is empty, the program exits after printing an error message. Line 12 creates an empty mutable map so that we can add words and counts to it. Individual words are used as the key while the Int is used for values to represent the number of times the word is found.

The file is opened on line 15. We create a FileReader object and pass the path of the file to its constructor. The file path is found at the first element in the arguments array and was supplied by the user. The FileReader object is passed to the constructor of the Scanner. We apply the use() function to ensure the Scanner and the underlying file are closed when we have finished.

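Line 18 tells the Scanner how to split the input by passing in a regex string and converting it to a Pattern object. The regex “\W” matches any single non-word character, so using it as the delimiter means every token the Scanner returns consists only of word characters. Because each space or punctuation mark counts as its own delimiter, two delimiters in a row can produce an empty token, which is why the program tests for blank strings later on. Kotlin allows us to use raw strings inside of triple quotes (""") so that we do not need to worry about escaping any characters.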
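Here is a small sketch of how the delimiter shapes the tokens. It scans a String rather than a file, and the tokens() helper is a hypothetical name used only for this illustration. The variant delimiter \W+ (one or more non-word characters) is an assumption on my part, not something the program above uses; it treats a run of punctuation and spaces as a single delimiter.

```kotlin
import java.util.Scanner

//Hypothetical helper: returns every token the Scanner produces
//for the given delimiter regex
fun tokens(text: String, delimiter: String): List<String> {
    val result = mutableListOf<String>()
    Scanner(text).use { sc ->
        sc.useDelimiter(delimiter)
        while (sc.hasNext()) {
            result.add(sc.next())
        }
    }
    return result
}

fun main() {
    val text = "Sam, I am"

    //A single-character delimiter may yield empty tokens
    //between ',' and ' ', so we filter them out
    println(tokens(text, """\W""").filter { it.isNotBlank() }) //prints [Sam, I, am]

    //Treating a run of non-word characters as one delimiter
    println(tokens(text, """\W+""")) //prints [Sam, I, am]
}
```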

Line 21 enters a while loop that terminates when Scanner.hasNext() is false. That means we loop until there are no more matches in the input stream. Line 27 tests if the word is a blank string and if it isn’t a blank string, we update the word count on line 29.

Line 35 prints each word from the most used to the least used. It’s accomplished by getting the set of entries and then sorting it in descending order. The sortedByDescending() function takes a selector function, supplied here as the lambda { it.value }, and sorts the entries by whatever the selector returns. In this case, it.value represents the number of times a word was found. The final forEach() operation iterates through the sorted list of entries and prints them individually to the console.
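The sorting step can be tried on its own with a hand-built map. This is a sketch only; mostFrequentFirst() is a hypothetical name and the counts are made up for the example.

```kotlin
//Hypothetical helper: orders word counts from highest to lowest
fun mostFrequentFirst(counts: Map<String, Int>): List<Pair<String, Int>> =
    counts.entries
        .sortedByDescending { it.value } //the selector picks the count
        .map { it.key to it.value }

fun main() {
    val counts = mapOf("ham" to 1, "I" to 3, "eggs" to 2)
    println(mostFrequentFirst(counts)) //prints [(I, 3), (eggs, 2), (ham, 1)]
}
```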

Here was my output when I used this program with a brief excerpt from Green Eggs and Ham.

I=37
not=29
like=28
them=28
do=20
a=19
am=10
Sam=10
you=8
would=7
eggs=6
and=6
ham=6
Would=6
In=6
in=6
or=5
there=5
eat=5
Not=5
Here=4
house=4
mouse=4
with=4
You=4
That=3
Green=3
With=3
green=3
box=3
fox=3
car=3
Anywhere=2
here=2
anywhere=2
Eat=2
could=2
may=2
tree=2
Do=1
Could=1
they=1
are=1
will=1
see=1
let=1
me=1
be=1

Kotlin Buffered Text Files

The BufferedReader and BufferedWriter classes improve the performance of reading and writing operations by adding an in-memory buffer to the streams. By using a memory buffer, the program can reduce the number of calls required to the underlying read and write streams and thus improve performance. Here is an example program that makes use of both BufferedReader and BufferedWriter. The listing omits its import statements; it needs java.io.BufferedReader, java.io.BufferedWriter, java.io.File, java.io.FileReader, and java.io.FileWriter.

fun main(args : Array<String>){
    when (args.size){
        //Check for two command line arguments
        2 -> {
            //Grab source and destination files
            val src = args[0]
            val dest = args[1]

            //Check if the destination file exists. We can create it
            //if needed
            with (File(dest)){
                if(!exists()){
                    createNewFile()
                }
            }

            //Now, open the source file in read mode. The BufferedReader
            //provides buffering to improve performance
            BufferedReader(FileReader(src)).use { reader ->

                //Likewise, open the destination file in write mode
                //The BufferedWriter class provides buffering for performance
                BufferedWriter(FileWriter(dest)).use { writer ->

                    //Read through the source file one character at a time
                    var character = reader.read()
                    while(character != -1){

                        //Write the character to the destination file
                        writer.write(character)

                        //Read the next character.
                        character = reader.read()
                    }
                }
            }
        }
        else -> {
            println("Source file followed by destination file required")
            System.exit(-1)
        }
    }
}

The example program copies the source file to the destination file. We begin by using a when expression to check if we have two and only two command line arguments. If we have a source and destination file, the program continues starting on line 6; otherwise, it jumps down to line 39 and exits after printing an error message.

On lines 6 and 7, we grab our source and destination files from the command line parameters. On line 11, we create a new File object and pass it to the with() function to see if we need to make a new file for the destination. Line 12 uses the exists() function to see if the file exists, and if it doesn’t exist, line 13 creates the new file.

Starting at line 19, we open the source file and begin our copy operation. The file is opened by creating a new FileReader object and passing in the name of the source file. The FileReader object is then passed to the constructor of BufferedReader. We utilize the use() function to ensure that all resources are properly closed when we are finished with the read operation. It’s also worth noting that we call the lambda parameter reader rather than it to improve code readability.

Line 23 opens the destination file for writing. We create a FileWriter (the counterpart to FileReader) and pass the name of the destination file to the FileWriter’s constructor. The FileWriter object is passed to the BufferedWriter constructor to provide buffering support. Once again, we utilize the use() function to ensure that all resources are closed when finished.

The copy operation is fairly anticlimactic. We read the first character on line 26 and then enter into a while loop that terminates when character == -1. Inside of the while loop, we write the character to the destination file (line 30) and then read the next character (line 33). The use() function that was applied to both the BufferedReader and BufferedWriter objects closes the files when finished.
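The same copy loop can be exercised without touching the file system by swapping the file classes for StringReader and StringWriter. This is a sketch under that assumption; copyChars() is a hypothetical helper that also counts the characters it copies.

```kotlin
import java.io.BufferedReader
import java.io.BufferedWriter
import java.io.Reader
import java.io.StringReader
import java.io.StringWriter
import java.io.Writer

//Hypothetical helper: copies every character from source to dest
//through buffered wrappers and returns how many were copied
fun copyChars(source: Reader, dest: Writer): Int {
    var copied = 0
    BufferedReader(source).use { reader ->
        BufferedWriter(dest).use { writer ->
            var character = reader.read()
            while (character != -1) {
                writer.write(character)
                copied++
                character = reader.read()
            }
        }
    }
    return copied
}

fun main() {
    val dest = StringWriter()
    val count = copyChars(StringReader("Bob\nLinda"), dest)
    println(count) //prints 9
    println(dest.toString() == "Bob\nLinda") //prints true
}
```

Closing the BufferedWriter flushes its buffer, which is why the destination holds the complete text once the use() blocks finish.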

The program can be run by using the following commands at the command line.

kotlinc BufferedCopy.kt -include-runtime -d BufferedCopy.jar
java -jar BufferedCopy.jar [source file] [dest file]

When finished, the dest file will be an exact copy of the source file.

Kotlin Reader Example

The java.io.Reader class provides a low-level API for reading character streams.

import java.io.FileReader

fun main(args : Array<String>){
    if (args.isEmpty()){
        println("Please provide a list of files separated by spaces")
        System.exit(-1)
    }

    //Read each supplied file
    args.forEach { fileName ->

        //Open the file. The use() extension function sees to the details
        //of closing the file when finished
        FileReader(fileName).use {

            //Read a single character
            var character = it.read()

            //read() returns -1 at End of File
            while (character != -1){

                //Print the character (make sure to convert it to a Character)
                print(character.toChar())

                //Read the next character
                character = it.read()
            }
        }
    }
}

The example program requires names of text files passed in as command line arguments so our first task is to check if we have any command line arguments. On line 4, we use the isEmpty() function on the args array object to check for an empty array. If true, we print a message to the user (line 5) and then exit the program (line 6).

Provided the program is still running, we begin by printing the contents of each file to the console. On line 10, we enter into a forEach statement to process each of the files supplied at the command line. Rather than using the standard it variable name, we use fileName to help make the code more clear.

Line 14 performs the operation of actually opening the file. We do this by creating a new FileReader object and passing the name of the file into its constructor. Then we chain the object to the use() extension function. The use() function sees to the details of actually closing the file when we are finished with it, even in the case of an exception.

The file reader object is now referred to by the variable it. On line 17, we call it.read() to read a single character from the file and store it into the character variable. We then enter into a while loop that terminates when character is -1. The -1 value indicates we have reached the end of the file. Inside of the while loop, we print the character (line 23). Since read() returns an int, we have to call toChar() to print the actual character. Then on line 26, we update character to the next character in the stream.
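The read-until-minus-one loop works with any Reader, not only FileReader. As a sketch, the hypothetical readAll() helper below collects the characters into a String instead of printing them, and it is fed from a StringReader so that no file is needed.

```kotlin
import java.io.Reader
import java.io.StringReader

//Hypothetical helper: reads one character at a time until read()
//returns -1, converting each Int to a Char along the way
fun readAll(reader: Reader): String {
    val text = StringBuilder()
    reader.use {
        var character = it.read()
        while (character != -1) {
            text.append(character.toChar())
            character = it.read()
        }
    }
    return text.toString()
}

fun main() {
    println(readAll(StringReader("Tina"))) //prints Tina
}
```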

Here is how I ran the program for those readers who wish to try it out.

kotlinc ReaderExample.kt -include-runtime -d readerExample.jar
java -jar readerExample.jar ReaderExample.kt

This invocation printed the example program to my console, but it works with any text file.

Kotlin Console Password

The Console object has a readPassword() method that disables echoing and allows a user to enter a password blindly. The password is returned as a character array. Here is an example of how to use the Console class to request a password from the user.

import java.io.Console

fun main(args : Array<String>){
    //Best to declare Console as a nullable type since System.console() may return null
    val console : Console? = System.console()

    when (console){
        //In this case, the JVM is not connected to the console so we need to exit
        null -> {
            println("Not connected to console. Exiting")
            System.exit(-1)
        }
        //Otherwise we can proceed normally
        else -> {
            val userName = console.readLine("Enter your user name => ")
            val pw = console.readPassword("Enter your password => ")

            println("Access Granted for $userName")

            //This is important! We don't know when the character array
            //will get garbage collected and we don't want it to sit around
            //in memory holding the password. We can't control when the array
            //gets garbage collected, but we can overwrite the password with
            //blank spaces so that it doesn't hold the password.
            for (i in 0 until pw.size){
                pw[i] = ' '
            }
        }
    }
}

We begin by getting a reference to the console. Since there is a possibility that System.console() can return null, it’s a best practice to declare console as a nullable type so that we have Kotlin compiler safety. After getting a reference to the console, we can use a when expression to react to the null case or continue with the program.

The call to readPassword (line 16) isn’t very dramatic. The program uses the overloaded version to prompt the user for a password and return the password as a character array. In terms of security, it’s important to never hold a password as a String. That’s why the readPassword() method returns a character array.

Strings are immutable. That means we can never change the value of the String. Furthermore, we never know when the JVM will garbage collect an object. Those two facts make for dangerous circumstances because a String holding a password can sit in memory for an unknown amount of time. Should someone break past the security constraints of the JVM and gain access to the program’s memory, there is the potential they could steal a password.

Character arrays also get garbage collected at unknown times. The difference between the character array and a String is that we can overwrite each element in the array. The example program demonstrates this on lines 25-27 by overwriting each element of the pw array with a blank space. Should someone break into the program’s memory, all they will see is a character array of empty spaces. The operation of converting a password character array to empty spaces should be completed as soon as password validation is finished.
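One way to make sure the array is always wiped is to put the overwrite in a finally block. The sketch below is an illustration only: verifyAndWipe() is a hypothetical helper, and comparing against a plain expected array stands in for whatever real validation a program would perform.

```kotlin
//Hypothetical helper: compares the entered password to the expected
//one and always blanks the entered array afterwards
fun verifyAndWipe(entered: CharArray, expected: CharArray): Boolean =
    try {
        entered.contentEquals(expected)
    } finally {
        //Overwrite every element so the password no longer sits in memory
        entered.fill(' ')
    }

fun main() {
    val pw = "burger".toCharArray()
    println(verifyAndWipe(pw, "burger".toCharArray())) //prints true
    println(pw.all { it == ' ' }) //prints true
}
```

The fill() call does the same job as the for loop in the example program, just in a single line.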

Kotlin Console Object

The System.console() method provides an access point for the Console object, which is used to read input from a character input device like the keyboard and to print to a character display device like the screen. If the JVM is started indirectly or as a background process, the method will return null. The Console object is useful when creating CLI applications.

The class provides a way to both read character input and write character output. Keep in mind that character output may be less important than input. Kotlin already provides its own print() and println() functions for writing to standard output. The Console class does make it easy to read input from the keyboard since we do not need to worry about working with System.in directly.

Let’s take a look at an example where we simply echo what’s inputted to the keyboard.

import java.io.Console

fun main(args : Array<String>){
    val console : Console? = System.console()
    when (console){
        null -> {
            println("Running from an IDE...")
        }
        else -> {
            while (true){
                //Read a line from the keyboard
                val line = console.readLine("What does Bob say? ")
                if (line == "q"){
                    return
                }
                console.printf("Bob says: %s\n", line)
            }
        }
    }
}

The program is really simple. We make a console variable that is nullable. This step is critical because System.console() can return null. Our next operation is to check if the console object is null. When console is null, we print a message and the program ends. Otherwise, we enter a loop that continues until the user enters “q” to quit.

The console.readLine() method returns a line entered from the keyboard. In other words, it will return everything typed until the user presses the enter or return key. To print the output, we use printf() to print formatted output. We could have used Kotlin’s print() function also.

Common Console Methods

reader() : Reader

This method returns a Reader object reference that can be used for low-level read operations.

writer() : PrintWriter

Returns a PrintWriter object reference that can be used for low-level output operations.

readLine() : String

Returns a line of text entered at the console. The overloaded version accepts a String that serves as a command prompt.

readPassword() : CharArray!

This method disables echoing so that a user can type a password in private. The overloaded version takes a String that is used as a command prompt.

format(fmt : String, vararg args : Any) : Console

This method writes the format string and its arguments to the output.

printf(fmt : String, vararg args : Any) : Console

Does the same thing as format().

flush() : Unit

This method flushes the buffer and forces all output to be written immediately.

References

See https://docs.oracle.com/javase/8/docs/api/java/io/Console.html

Kotlin Command Line Compile

Once in a great while, it’s necessary to compile and run a Kotlin program from the terminal. Here is how to do it.

Mac

Install homebrew first. Then open the Mac terminal and type the following commands.

brew update
brew install kotlin

This will install the necessary command-line applications to compile a Kotlin program.

Next, open your terminal and navigate to the folder that holds your Kotlin file. Kotlin source files have the extension .kt. Assuming your file is belcher.kt, you would type

kotlinc belcher.kt -include-runtime -d belcher.jar

to compile the source file. You will get a jar file in the same folder. The program can be run with

java -jar belcher.jar
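If you do not have a source file handy, a minimal belcher.kt could look like the sketch below. The contents are hypothetical; any Kotlin file with a main function will work with the commands above.

```kotlin
//Hypothetical contents for belcher.kt
fun greeting(name: String): String = "Hello, $name!"

fun main() {
    println(greeting("Bob")) //prints Hello, Bob!
}
```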

Windows (Cygwin) or WSL

Open your Cygwin or WSL terminal and type

curl -s https://get.sdkman.io | bash
sdk install kotlin

This will install the necessary command-line applications to compile a Kotlin program.

Next, open your terminal and navigate to the folder that holds your Kotlin file. Kotlin source files have the extension .kt. Assuming your file is belcher.kt, you would type

kotlinc belcher.kt -include-runtime -d belcher.jar

to compile the source file. You will get a jar file in the same folder. The program can be run with

java -jar belcher.jar

More Information

More information can be found at http://kotlinlang.org/docs/tutorials/command-line.html

Kotlin String Formatting

String formatting allows developers to define a common string template and then interchange values as needed.

String Templates

Kotlin has a feature known as String Templates that allows you to specify a string and then reference variables as required.

var good = "good"
var great = "great"

var templateGood = "Bob is a $good chef"
var templateGreat = "Bob is a $great chef"

In the above example, the value of the variable good is inserted at $good in the templateGood string. Likewise, the value of the variable great is inserted at $great in the templateGreat string.

We are also free to do evaluations.

var plus = "Eagerly awaiting ${1 + 2}"

The expression 1 + 2 evaluates to 3, so the result will be “Eagerly awaiting 3”.
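The ${} form is not limited to arithmetic on literals; it can hold property access and calculations on variables as well. Here is a short sketch, where the Chef class and the describe() function are hypothetical names made up for the example.

```kotlin
//Hypothetical data class used only for this illustration
data class Chef(val name: String, val burgersPerHour: Int)

fun describe(chef: Chef, hours: Int): String =
    //$hours is a simple reference; ${...} wraps a full expression
    "${chef.name} can cook ${chef.burgersPerHour * hours} burgers in $hours hours"

fun main() {
    println(describe(Chef("Bob", 12), 3)) //prints Bob can cook 36 burgers in 3 hours
}
```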

String.format()

The String class has a format method that takes in a format string and then any number of arguments. There must be at least as many arguments as there are format specifiers that consume one. If there are too few, a MissingFormatArgumentException is thrown, while extra arguments are ignored.

To begin, each format specifier takes the form of “%[flags][width][.precision]type”, for example:

val simple = "%d" //very basic
val medium = "Have a nice %s" //More complex
val advanced = "%-2s\t%s"

The first example, simple, only has the type, %d. The same is true of the second String, which only has %s. The final example has a left-justified flag (“-”), followed by a width of two characters (2) and its type, String (s).
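To see flags, width, and precision working together, here is a small sketch. The priceLine() function is a hypothetical helper, and passing Locale.US is my own choice so that the decimal separator is the same on every machine.

```kotlin
import java.util.Locale

//Hypothetical helper: a left-justified name in a 10 character column,
//followed by a right-justified price with two decimal places
fun priceLine(item: String, price: Double): String =
    "%-10s|%8.2f".format(Locale.US, item, price)

fun main() {
    println(priceLine("Burger", 5.5)) //prints Burger    |    5.50
}
```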

Type Specifiers

Here is a list of the type specifiers and their meanings.

%b Boolean
%c Character
%d Signed Integer
%e Float in Scientific Notation
%f Float in Decimal Format
%g Float in Decimal or Scientific Notation, depending on value
%h Hashcode of the supplied argument
%n Line separator
%o Octal Integer
%s String
%t Date or Time
%x Hexadecimal Integer

Here is an example of how to use the String.format() method.

var formatTemplate = "%-2s\t%s"
println(formatTemplate.format("%b", "Boolean")) //prints %b	Boolean
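A few of the type specifiers from the list can be tried with actual values. This is a sketch only; describeNumber() is a hypothetical helper, and Locale.US is passed for the numeric specifiers so the digits look the same everywhere.

```kotlin
import java.util.Locale

//Hypothetical helper: shows the same Int as decimal, hex, and octal
fun describeNumber(n: Int): String =
    "%d in hex is %x and in octal is %o".format(Locale.US, n, n, n)

fun main() {
    println(describeNumber(255))          //prints 255 in hex is ff and in octal is 377
    println("%05d".format(Locale.US, 42)) //prints 00042
    println("%b".format(true))            //prints true
}
```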

Putting it Together

Here is an example program that shows off String templates and String.format().

fun main(args : Array<String>){
    var good = "good"
    var great = "great"

    println("Using Kotlin String templates")
    var templateGood = "Bob is a $good chef"
    var templateGreat = "Bob is a $great chef"

    println(templateGood)
    println(templateGreat)

    var formatTemplate = "%-2s\t%s"
    var func = {pair : Pair<String, String> -> println(formatTemplate.format(pair.first, pair.second))}

    var table = arrayOf(
            "%b" to "Boolean",
            "%c" to "Character",
            "%d" to "Signed Integer",
            "%e" to "Float in scientific format",
            "%f" to "Float in decimal format",
            "%g" to "Float in either decimal or scientific notation based on value",
            "%h" to "Hashcode of argument",
            "%n" to "Line separator",
            "%o" to "Octal Integer",
            "%s" to "String",
            "%t" to "Date or Time",
            "%x" to "Hexadecimal Integer")

    println("\n%[flags][width][.precision]type")

    println("\nFormatting Symbols")
    table.forEach(func)
}

Output

Using Kotlin String templates
Bob is a good chef
Bob is a great chef

%[flags][width][.precision]type

Formatting Symbols
%b	Boolean
%c	Character
%d	Signed Integer
%e	Float in scientific format
%f	Float in decimal format
%g	Float in either decimal or scientific notation based on value
%h	Hashcode of argument
%n	Line separator
%o	Octal Integer
%s	String
%t	Date or Time
%x	Hexadecimal Integer

Kotlin Regex Pattern Matching

Matching Strings using regular expressions (REGEX) is a difficult topic. Regex strings are often difficult to understand and debug. They often require extensive testing to make sure that the regex is matching what it is supposed to match.

Kotlin goes out of its way to avoid making developers use regex. For example, the split() method of String does not require a regex (unlike its Java counterpart). Doing so reduces bugs and helps keep the code more readable in general. When we need to use a regex, Kotlin has an explicit Regex type.

One advantage of having a regex type is that code is immediately more readable.

val regex = """\d{5}""".toRegex()

Notice a few things about this String. First, we use the triple quoted, or raw, string to define the regular expression. This helps us avoid bugs caused by improper escaping of the regex string. Also, the string has a toRegex() method that converts the String to a Regex object.

The Regex object comes packed with its own methods that are used for pattern matching.

regex.containsMatchIn("My string 00000")
regex.findAll("00000, 000121, 23213")
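Here is a sketch of what those two calls return. The matchedValues() helper and the second pattern with \b word boundaries are my own additions for illustration; they show that a bare \d{5} will also match inside a longer run of digits.

```kotlin
//Same pattern as above: five digits in a row
val fiveDigits = """\d{5}""".toRegex()

//Assumed variation: \b word boundaries keep the pattern from
//matching inside a longer run of digits
val fiveDigitsOnly = """\b\d{5}\b""".toRegex()

//Hypothetical helper: collects the text of every match
fun matchedValues(regex: Regex, text: String): List<String> =
    regex.findAll(text).map { it.value }.toList()

fun main() {
    val text = "00000, 000121, 23213"
    println(fiveDigits.containsMatchIn("My string 00000")) //prints true
    println(matchedValues(fiveDigits, text))     //prints [00000, 00012, 23213]
    println(matchedValues(fiveDigitsOnly, text)) //prints [00000, 23213]
    println(fiveDigits.matches("12345"))         //prints true
}
```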

Of course there are many other methods found on the Regex object, but see the Kotlin documentation for more details: http://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/-regex/index.html

Regex Tables

Below are some common regex symbols, meta symbols, and quantifiers as presented in Oracle Certified Professional Java SE 7 Programmer Exams 1Z0-804 and 1Z0-805 A Comprehensive OCPJP 7 Certification Guide by Ganesh and Sharma.

Common Symbols

Symbol Meaning
^expr Matches expr at beginning of the line
expr$ Matches expr at end of line
. Matches any single character (except the newline character)
[xyz] Matches either x, y, or z
[p-z] Specifies a range. Matches any character from p to z
[p-z1-9] Matches either any character from p to z or any digit from 1 to 9
[^p-z] ‘^’ as the first character negates the pattern. This will match anything outside of the range p-z
xy Matches x followed by y
x|y Matches either x or y

Common Meta Symbols

\d Matches digits ([0-9])
\D Matches non-digits
\w Matches word characters
\W Matches non-word characters
\s Matches whitespaces [\t\r\f\n]
\S Matches non-whitespaces
\b Matches word boundary when outside of a bracket. Matches backslash when placed in a bracket
\B Matches non-word boundary
\A Matches beginning of string
\Z Matches end of String

Common Quantifiers

expr? Matches 0 or 1 occurrence of expr (expr{0,1})
expr* Matches 0 or more occurrences of expr (expr{0,})
expr+ Matches 1 or more occurrences of expr (expr{1,})
expr{x, y} Matches between x and y occurrences of expr
expr{x, } Matches x or more occurrences of expr
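The quantifiers are easier to remember after trying them on a few inputs. The sketch below uses a hypothetical fullyMatches() helper and made-up patterns; note that inside a real pattern the braces are written without spaces, as in a{2,3}.

```kotlin
//Hypothetical helper: true only when the whole input matches the pattern
fun fullyMatches(pattern: String, input: String): Boolean =
    pattern.toRegex().matches(input)

fun main() {
    println(fullyMatches("colou?r", "color"))  //prints true: the u is optional
    println(fullyMatches("colou?r", "colour")) //prints true
    println(fullyMatches("a*", ""))            //prints true: zero occurrences are allowed
    println(fullyMatches("a+", ""))            //prints false: at least one is required
    println(fullyMatches("a{2,3}", "aaa"))     //prints true
    println(fullyMatches("a{2,3}", "aaaa"))    //prints false: too many
    println(fullyMatches("a{2,}", "aaaa"))     //prints true
}
```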

Putting it Together

Here is an example program that uses Regex in Kotlin.

fun main(args : Array<String>){
    val symbols = mapOf(
            "^expr" to "Matches expr at beginning of the line",
            "expr$" to "Matches expr at end of line",
            "." to "Matches any single character (exception the newline character)",
            "[xyz]" to "Matches either x, y, or z",
            "[p-z]" to "Specifies a range. Matches any character from p to z",
            "[p-z1-9]" to "Matches either any character from p to z or any digit from 1 to 9",
            "[^p-z]" to "'^' as the first character negates the pattern. This will match anything outside of the range p-z",
            "xy" to "Matches x followed by y",
            "x|y" to "Matches either x or y")

    val metaSymbols = mapOf(
            "\\d" to "Matches digits ([0-9])",
            "\\D" to "Matches non-digits",
            "\\w" to "Matches word characters",
            "\\W" to "Matches non-word characters",
            "\\s" to "Matches whitespaces [\\t\\r\\f\\n]",
            "\\S" to "Matches non-whitespaces",
            "\\b" to "Matches word boundary when outside of a bracket. Matches backslash when placed in a bracket",
            "\\B" to "Matches non-word boundary",
            "\\A" to "Matches beginning of string",
            "\\Z" to "Matches end of String")


    val quantifiers = mapOf(
            "expr?" to "Matches 0 or 1 occurrence of expr (expr{0,1})",
            "expr*" to "Matches 0 or more occurrences of expr (expr{0,})",
            "expr+" to "Matches 1 or more occurrences of expr (expr{1,})",
            "expr{x, y}" to "Matches between x and y occurrences of expr",
            "expr{x, }" to "Matches x or more occurrences of expr")

    val format = "%-10s\t%s"
    val func = {entry : Map.Entry<String, String> -> println(format.format(entry.key, entry.value)) }

    println("Symbols")
    symbols.entries.forEach(func)

    println("\nMeta Symbols")
    metaSymbols.entries.forEach(func)

    println("\nQuantifiers")
    quantifiers.entries.forEach(func)

    //Create a regex object
    println("\nTesting regex: ^Matches")
    val regex = "^Matches".toRegex()
    symbols.entries.forEach({it ->
        //The Regex Type has a Number of Pattern Matching Methods
        val matchResult = regex.containsMatchIn(it.value)
        println("$matchResult => ${it.value}")
    })
}

Output

Symbols
^expr     	Matches expr at beginning of the line
expr$     	Matches expr at end of line
.         	Matches any single character (exception the newline character)
[xyz]     	Matches either x, y, or z
[p-z]     	Specifies a range. Matches any character from p to z
[p-z1-9]  	Matches either any character from p to z or any digit from 1 to 9
[^p-z]    	'^' as the first character negates the pattern. This will match anything outside of the range p-z
xy        	Matches x followed by y
x|y       	Matches either x or y

Meta Symbols
\d        	Matches digits ([0-9])
\D        	Matches non-digits
\w        	Matches word characters
\W        	Matches non-word characters
\s        	Matches whitespaces [\t\r\f\n]
\S        	Matches non-whitespaces
\b        	Matches word boundary when outside of a bracket. Matches backslash when placed in a bracket
\B        	Matches non-word boundary
\A        	Matches beginning of string
\Z        	Matches end of String

Quantifiers
expr?     	Matches 0 or 1 occurrence of expr (expr{0,1})
expr*     	Matches 0 or more occurrences of expr (expr{0,})
expr+     	Matches 1 or more occurrences of expr (expr{1,})
expr{x, y}	Matches between x and y occurrences of expr
expr{x, } 	Matches x or more occurrences of expr

Testing regex: ^Matches
true => Matches expr at beginning of the line
true => Matches expr at end of line
true => Matches any single character (exception the newline character)
true => Matches either x, y, or z
false => Specifies a range. Matches any character from p to z
true => Matches either any character from p to z or any digit from 1 to 9
false => '^' as the first character negates the pattern. This will match anything outside of the range p-z
true => Matches x followed by y
true => Matches either x or y
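
The program above only uses containsMatchIn(), but the Regex type has other matching functions as well. Here is a minimal sketch that compares three of them. The describe() helper and the sample strings are made up for illustration and are not part of the original program.

```kotlin
// Hypothetical helper: summarises how a pattern relates to an input string
fun describe(pattern: String, input: String): String {
    val regex = pattern.toRegex()
    val contains = regex.containsMatchIn(input) // true if the pattern occurs anywhere
    val whole = regex.matches(input)            // true only if the entire input matches
    val first = regex.find(input)?.value        // first matching substring, or null
    return "contains=$contains, whole=$whole, first=$first"
}

fun main() {
    println(describe("\\d+", "Order 66 and 67"))       // contains=true, whole=false, first=66
    println(describe("\\d+", "1234"))                  // contains=true, whole=true, first=1234
    println(describe("^Matches", "Specifies a range")) // contains=false, whole=false, first=null
}
```

The last call mirrors the false lines in the output above: the description does not begin with the word Matches, so the ^Matches pattern finds nothing.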

References

Ganesh, S G., and Tushar Sharma. Oracle Certified Professional Java SE 7 Programmer Exams 1Z0-804 and 1Z0-805 A Comprehensive OCPJP 7 Certification Guide. Apress, 2013.

Kotlin String Splitting

Most programming tasks require string splitting at some point. For example, CSV files separate fields with commas, so developers split each line on the comma in order to extract the data. Extracting the domain name from a web address is another common use case for String splitting. For example, we might have the address https://stonesoupprogramming.com and wish to separate the https:// portion from the rest. We can split the string into a list where the first element holds the protocol and the second element holds stonesoupprogramming.com.

In Kotlin, we use the split() function, which is available on every String. It comes in two flavors. One flavor takes the delimiters (strings or characters) to split the string on, and the other flavor takes a Regex. Both versions of split() return a list that contains all portions of the String.

Non-Regex Splitting

The first version of split() takes a varargs parameter of delimiters, an optional boolean ignoreCase argument, and an optional limit argument that caps how many parts are returned. Because the delimiters parameter is a vararg, ignoreCase and limit have to be passed as named arguments.

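val str = "I smell fear on you"
val parts = str.split(" ") //[I, smell, fear, on, you]
val partsTwo = str.split("I", "fear", "you") //Several delimiters at once
val partsThree = str.split("I", ignoreCase = true) //ignoreCase must be passed by name
val partsFour = str.split(" ", limit = 2) //[I, smell fear on you]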

All versions of split return a list. It’s worth keeping in mind that the returned list will not contain any of the delimiters passed to the delimiters argument in split(). Normally, that isn’t a problem. For example, would you really want the ‘,’ character for all fields in a CSV file?
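
To make the point about dropped delimiters concrete, here is a small sketch that covers the CSV and web address cases. The function names csvFields() and urlParts() and the sample CSV line are assumptions made for illustration. The CSV version also assumes there are no quoted fields or embedded commas.

```kotlin
// Splits one CSV line into fields; assumes no quoted fields or embedded commas
fun csvFields(line: String): List<String> = line.split(",")

// Splits a web address on "://"; the delimiter itself is not part of the result
fun urlParts(url: String): List<String> = url.split("://", limit = 2)

fun main() {
    println(csvFields("Bob,Burgers,42"))                  // [Bob, Burgers, 42]
    println(urlParts("https://stonesoupprogramming.com")) // [https, stonesoupprogramming.com]
    println("I smell fear on you".split(" ", limit = 2))  // [I, smell fear on you]
}
```

Notice that neither the commas nor the :// appear anywhere in the returned lists. If the delimiter has to be kept, it needs to be added back by hand.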

Regex Version

Many programming languages represent regular expressions (regex) as plain Strings. Doing so often leads to unexpected bugs. Consider Java’s String.split() method.

String myString = "Green. Eggs. Ham.";
String [] parts = myString.split(".");

You may think that parts holds {“Green”, “Eggs”, “Ham”}. It doesn’t. Java treats the argument as a regular expression, and the period matches any character, so every character counts as a delimiter and parts ends up empty. It’s a very common mistake.

Thankfully, Kotlin treats regular expressions as its own type. When we want to use a Regex in Kotlin, we need to create a Regex object. The Kotlin String class has a toRegex() function that performs the conversion from String to Regex.

val str = "Green. Eggs. Ham"
val partsNonRegex = str.split(".") //No Regex. This will split on the period character
val partsRegex = str.split(".".toRegex()) //Regex matching: '.' matches any character, so every character is a delimiter
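
The difference between the two calls is easier to see side by side. The sketch below wraps each call in a small function (the function names are made up for illustration) and adds an escaped pattern, which is one way to split on a real period while still using a Regex.

```kotlin
// Plain String delimiter: only the period character splits the text
fun splitLiteral(text: String): List<String> = text.split(".")

// Regex ".": matches any character, so every character acts as a delimiter
fun splitAnyChar(text: String): List<String> = text.split(".".toRegex())

// Escaped regex: a literal period followed by optional whitespace
fun splitSentences(text: String): List<String> = text.split("\\.\\s*".toRegex())

fun main() {
    val str = "Green. Eggs. Ham"
    println(splitLiteral(str))                     // [Green,  Eggs,  Ham]
    println(splitAnyChar(str).all { it.isEmpty() }) // true
    println(splitSentences(str))                   // [Green, Eggs, Ham]

    // Regex.escape() quotes a String so that it is matched literally
    println(str.split(Regex.escape(". ").toRegex())) // [Green, Eggs, Ham]
}
```

The plain version keeps the leading space on the second and third parts, because only the period is removed. The escaped regex version consumes the whitespace after each period as part of the delimiter.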

Putting it together

As always, we will conclude with an example program that demonstrates the topic. Many of my students are given assignments where they need to count how many times each unique word appears in a String. We will use String splitting and maps to accomplish the goal.

fun main(args : Array<String>){
    val paragraph = """
        |I am Sam.
        |Sam I am.

        |That Sam-I-am!
        |That Sam-I-am!
        |I do not like
        |That Sam-I-am!

        |Do you like
        |Green eggs and ham?

        |I do not like them,
        |Sam-I-Am
        |I do not like
        |Green eggs and ham.
        """.trimMargin()

    //Remove all end line characters and then split the string on the space character
    val parts = paragraph.replace('\n', ' ').split(" ")
    
    //Create an empty mutable map
    val uniqueWords = mutableMapOf<String, Int>()
    
    //Populate the map
    parts.forEach { uniqueWords[it] = uniqueWords.getOrDefault(it, 0) + 1 }
    
    //Print each word with its count value
    println(uniqueWords)
}

Here is the output when run.

{I=5, am=1, Sam.=1, Sam=1, am.=1, =3, That=3, Sam-I-am!=3, do=3, not=3, like=4, Do=1, you=1, Green=2, eggs=2, and=2, ham?=1, them,=1, Sam-I-Am=1, ham.=1}
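
The output shows two side effects of splitting on a single space. The blank lines in the paragraph produce an empty-string entry (the =3 in the map), and punctuation stays attached, so Sam and Sam. are counted as different words. The following is one possible refinement, not the only way to do it. It assumes a word is whatever is left after splitting on whitespace and trimming punctuation from both ends. The countWords() name is made up for this sketch.

```kotlin
// Counts word occurrences; assumes words are separated by whitespace and
// that punctuation at the start or end of a word should be ignored
fun countWords(text: String): Map<String, Int> =
    text.split("\\s+".toRegex())                             // split on any run of whitespace, including newlines
        .map { word -> word.trim { !it.isLetterOrDigit() } } // strip leading and trailing punctuation
        .filter { it.isNotEmpty() }                          // drop anything that ended up empty
        .groupingBy { it }
        .eachCount()

fun main() {
    println(countWords("I am Sam.\nSam I am.\n\nThat Sam-I-am!"))
    // {I=2, am=2, Sam=2, That=1, Sam-I-am=1}
}
```

Hyphens inside a word are kept, so Sam-I-am is still counted as a single word. The count remains case sensitive, which means Sam-I-am and Sam-I-Am would be separate entries.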