👀 Watch Rust 🦀 live coding videos on our YouTube Channel.

Introduction #

This tutorial is part of a collection tutorials on basic data structures and algorithms that are created using Kotlin. This project is useful if you are trying to get more fluency in Kotlin or need a refresher to do interview prep for software engineering roles.

How to run this project #

You can get the code for this and all the other tutorials in this collection from this github repo. Here’s a screen capture of project in this repo in action.

Once you’ve cloned the repo, type ./gradlew run in order to build and run this project from the command line.

Importing this project into JetBrains IntelliJ IDEA #

  • This project was created using JetBrains Idea as a Gradle and Kotlin project (more info). - When you import this project into Idea as a Gradle project, make sure not to check “Offline work” (which if checked, won’t allow the gradle dependencies to be downloaded). - As of Jun 24 2018, Java 10 doesn’t work w/ this gradle distribution (v4.4.x), so you can use Java 9 or 8, or upgrade to a newer version of gradle (4.8+).

It is a very common problem to search for the presence of a substring inside of a string. Let’s say that:

  • the string is called str (with length n)
  • the substring is substr (with length m).

O(n * m) - Brute force approach #

The brute force approach to string searches is done by simply sliding the pattern along the string until you find a match. Every time you slide down to the next character in str, this algorithm doesn’t really remember what it already knows about the string (since it iterates thru the entire length of the string at every iteration). It essentially forgets about what it knows about the string (already) for every n-1 attempts that it makes to match the substring.

/**
 * O(m * n), where m = str.size, and n = substr.size.
 *
 * This is an inefficient brute force algorithm which has quadratic complexity O(n^2).
 */
fun substring(str: CharArray, substr: CharArray): Int {
    // substr can't be longer than str.
    if (substr.size > str.size) return 0

    // Iterate str using cursor1 and for each index look ahead to see if matches exist
    // for substr.
    var occurrences = 0
    for (cursor1 in 0 until str.size) {
        var matchCount = 0
        for (cursor2 in 0 until substr.size) {
            if (str[cursor1 + cursor2] == substr[cursor2]) matchCount++
        }
        // Found a match.
        if (matchCount == substr.size) occurrences++
    }

    return occurrences
}

Please note that if you want to turn a Kotlin String into a CharArray you can use something like "Hello world".toCharArray(). Here’s more info about this on stackoverflow.

O(n + m) - Using a state machine #

By using a state machine that is built from the substring pattern we can come up with a much better algorithm to match these patterns inside our string. The idea here is not to forget what we have already seen about the string as we iterate over each character in it.

This is a streaming algorithm where we pass one character at a time (as we iterate thru the entire string) to a state machine which matches the pattern in the substring. For every iteration:

  • Each character is compared with the character at a cursor (which represents the state) in the substring.
  • If there’s a match, this cursor is incremented, and at the next iteration, the next character in the pattern will be matched, and so on.
  • When there’s a mismatch the cursor is reset to 0.
  • And we know a match has been found when the cursor equals the length of the substring.

This approach is based on the idea of Deterministic Finite Automaton.

/**
 * O(m + n), where m = str.size, and n = substr.size
 *
 * This function uses a deterministic finite automation (DFA) method
 * which entails the use of a state machine to keep track of progress
 * in a game.
 */
fun substring_optimized(str: CharArray, substr: CharArray): Int {

    class StateMachine(val pattern: CharArray) {
        var cursor = 0
        fun add(character: Char) {
            if (pattern[cursor] == character) cursor++
            else cursor = 0
        }
        fun isMatch() = cursor == pattern.size
        fun reset() {cursor = 0}
    }

    val stateMachine = StateMachine(substr)
    var numberOfOccurrences = 0

    for (cursor in 0 until str.size) {
        stateMachine.add(str[cursor])
        if (stateMachine.isMatch()) {
            stateMachine.reset()
            numberOfOccurrences++
        }
    }

    return numberOfOccurrences

}

Resources #

CS Fundamentals #

Data Structures #

Math #

Big-O Notation #

Kotlin #

Markdown utilities #

📦 Install our useful Rust command line apps using cargo install r3bl-cmdr (they are from the r3bl-open-core project):
  • 🐱giti: run interactive git commands with confidence in your terminal
  • 🦜edi: edit Markdown with style in your terminal

giti in action

edi in action

Related Posts