In this tutorial we will learn how to use Rust to write a simple netcat client and server using the standard library only. A netcat client is like a Swiss Army knife for networking. It is similar to PuTTY and telnet. You can use it to connect to a server and send and receive data. We will create an app that can behave both as a client and server.
Here's a video of the app that we are going to build in action.
Please star the r3bl-open-core repo. You can get lots of useful Rust command line apps from the r3bl-cmdr crate in this repo. You can install them using cargo install r3bl-cmdr:
- giti: run interactive git commands with confidence in your terminal
- edi: edit Markdown with style in your terminal
If you would like to get involved in an open source project and like Rust crates, we welcome your contributions to the r3bl-open-core repo.
You can find the finished source code for this tutorial here.
Let's create a new project by running cargo new --bin rtelnet. Then we will add the following dependencies to our Cargo.toml file.
# Command line argument parsing.
clap = { version = "4.4.13", features = ["derive"] }
# Pretty logging.
femme = { version = "2.2.1" }
log = { version = "0.4.20" }
# Colorization and ANSI escape sequence codes.
r3bl_tui = { version = "0.5.1" }
r3bl_ansi_color = { version = "0.6.9" }
This Rust app has a single binary, and depending on the command line arguments, it will behave either as a client or a server. We will use the clap crate to parse the command line arguments. We will configure clap so that the following commands will work:
cargo run server
cargo run client
We want to allow the user to specify the following options and choose their own address and port. If the user does not specify any options, we will use the default values. The default value for --address is 127.0.0.1, and the default value for --port is 3000.
cargo run server --address 127.0.0.1 --port 8080
cargo run server --address 127.0.0.1
cargo run server --port 8080
cargo run client --address 127.0.0.1 --port 8080
cargo run client --address 127.0.0.1
cargo run client --port 8080
Let's also add an option that we can use to disable log output to stdout. By default, we will log to stdout. But if the user specifies the --log-disable flag, then we disable all log output.
Here's the clap configuration that gives us this behavior.
use std::net::IpAddr;

use clap::{Parser, Subcommand};
pub use defaults::*;
mod defaults {
use super::*;
pub const DEFAULT_PORT: u16 = 3000;
pub const DEFAULT_ADDRESS: &str = "127.0.0.1";
}
pub use clap_config::*;
mod clap_config {
use super::*;
#[derive(Parser, Debug)]
pub struct CLIArg {
/// IP Address to connect to or start a server on
#[clap(long, short, default_value = DEFAULT_ADDRESS, global = true)]
pub address: IpAddr,
/// TCP Port to connect to or start a server on
#[clap(long, short, default_value_t = DEFAULT_PORT, global = true)]
pub port: u16,
/// Logs to stdout by default, set this flag to disable it
#[clap(long, short = 'd', global = true)]
pub log_disable: bool,
/// The subcommand to run
#[clap(subcommand)]
pub subcommand: CLISubcommand,
}
#[derive(Subcommand, Debug)]
pub enum CLISubcommand {
/// Start a server on the given address and port
Server,
/// Connect to a server running on the given address and port
Client,
}
}
Let's start with the simpler of the two, the client. We will use std::net::TcpStream to create a TCP socket client. We will need an IP address and port in order to make a TCP connection. And to run the client we will need to run the following command:
cargo run client
Here's what the main function of our app looks like:
fn main() {
println!("Welcome to rtelnet");
let cli_arg = CLIArg::parse();
let address = cli_arg.address;
let port = cli_arg.port;
let socket_address = format!("{}:{}", address, port);
if !cli_arg.log_disable {
femme::start()
}
match match cli_arg.subcommand {
CLISubcommand::Server => start_server(socket_address),
CLISubcommand::Client => start_client(socket_address),
} {
Ok(_) => {
println!("Program exited successfully");
}
Err(error) => {
println!("Program exited with an error: {}", error);
}
}
}
The function that performs the client logic looks like this.
fn start_client(socket_address: String) -> IOResult<()> {
log::info!("Start client connection");
let tcp_stream = TcpStream::connect(socket_address)?;
let (mut reader, mut writer) = (BufReader::new(&tcp_stream), BufWriter::new(&tcp_stream));
// Client loop.
loop {
// Read user input.
let outgoing = {
let mut it = String::new();
let _ = stdin().read_line(&mut it)?;
it.as_bytes().to_vec()
};
// Tx user input to writer.
let _ = writer.write(&outgoing)?;
writer.flush()?;
// Rx response from reader.
let incoming = {
let mut it = vec![];
let _ = reader.read_until(b'\n', &mut it)?;
it
};
let display_msg = String::from_utf8_lossy(&incoming);
let display_msg = display_msg.trim();
let reset = SgrCode::Reset.to_string();
let display_msg = format!("{}{}", display_msg, reset);
println!("{}", display_msg);
// Print debug.
log::info!(
"-> Tx: '{}', size: {} bytes{}",
String::from_utf8_lossy(&outgoing).trim(),
outgoing.len(),
reset,
);
log::info!(
"<- Rx: '{}', size: {} bytes{}",
String::from_utf8_lossy(&incoming).trim(),
incoming.len(),
reset,
);
}
}
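As a quick aside, here's a minimal, self-contained sketch of the byte/string round trip that start_client() performs. No network is involved; the "hi" message is just illustrative:

```rust
fn main() {
    // What read_line() stores after the user types "hi" and presses Enter:
    // the message plus the trailing newline.
    let it = String::from("hi\n");

    // The client converts the String into bytes before writing to the socket.
    let outgoing: Vec<u8> = it.as_bytes().to_vec();
    assert_eq!(outgoing, vec![104, 105, 10]); // 'h', 'i', '\n'

    // The response comes back as newline-terminated bytes (via read_until).
    let incoming: Vec<u8> = b"hi back\n".to_vec();

    // from_utf8_lossy tolerates invalid UTF-8; trim() drops the trailing '\n'.
    let display_msg = String::from_utf8_lossy(&incoming);
    let display_msg = display_msg.trim();
    println!("{}", display_msg); // prints: hi back
}
```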
Here are a few things to note about the client code:
- We wrap the TcpStream that we get from TcpStream::connect() in a BufReader and a BufWriter. This is because we want to read and write data in chunks, and not one byte at a time, for performance reasons, and to simplify the logic. These two structs allow us to read and write data very easily in chunks that are delimited by newlines (\n).
- There is no exit condition in the loop; only Ctrl+C will make the client exit. The default behavior for Rust is to exit the process when this happens. This drops the TCP connection, causing the server's handler for this connection to exit as well.
- User input does not come from the TcpStream, but from the stdin() stream. This behaves very similarly to the TcpStream: we can read data from it in chunks delimited by newlines (\n). Once the user types a message, eg: "hi", and presses Enter, that message and the newline are stored in the it variable, eg: "hi\n". We then convert the String into a byte array, eg: [104, 105, 10], as a Vec<u8>, and send it to the server. We must call flush() since BufWriter buffers the data for IO performance reasons and does not send it to the server until we call flush(). It queues up the data and sends it in chunks, instead of sending it one byte at a time.
- The response is read from the server much like stdin(), as we have already seen. The main thread blocks until there is some data that can be read from the server, or until the TCP connection errors out in some way (timeout, or closed by various means). If there is an error, then this function returns an error, and the main thread exits. Note that the start_client() function itself returns an IOResult, which is just a type alias: pub type IOResult<T> = std::io::Result<T>;. The error handling is quite simple: if there is an error, we print it out and exit the program.
- The response is read into the incoming variable using reader.read_until(b'\n', &mut it). This is because we expect the server to send us data that is terminated by a newline (\n), so we read until we encounter one. This is a blocking call, so the main thread blocks until there is some data that can be read from the server. Note that the \n is included in the incoming variable, much like it is with stdin().
- We use String::from_utf8_lossy(&incoming) to convert this incoming: Vec<u8> into a String. We call .trim() on it so that the trailing \n is removed. trim() returns a &str, so if you want to turn it into a String, you have to run it through an expression like format!("{}", String::from_utf8_lossy(&incoming).trim()).
- At the end of the loop, after the incoming data has been read from the server, we print it out to the terminal. Since the server will send us ANSI escape sequence codes that colorize the text, we want to reset the color after we print the text, so it does not pollute our stdout() output stream. We use the SgrCode::Reset code to reset the color of the text that we print to the terminal.

Now let's create the server. We will use std::net::TcpListener to create a TCP socket server. We will need an IP address and port in order to make a TCP connection. To run the server we will need to run the following command:
cargo run server
The server code is very similar to the client code. We need a server loop that runs forever, and we need to first read (blocking until there is any data available) and then write data in chunks delimited by newlines (\n). When there is no more data available to read and EOF is reached on the reader (aka the input TCP stream), we break out of this loop and exit. When data comes in (delimited by \n) we process it and send a response back to the client. We process this data by applying a lolcat effect to it, so the client will get a very colorful version of whatever text message they sent to the server.
One more thing we will see when implementing the server is having to spawn multiple threads to handle each incoming client connection. While the client is a single threaded app, the server is a multi-threaded app. The client is only concerned w/ a single TCP connection, but the server is concerned with multiple TCP connections, each connection emanating from a different client process (each cargo run client creates a new OS process). Fortunately Rust is built for fearless concurrency and parallelism from the ground up.
Here's the main function of our server app:
pub fn start_server(socket_address: String) -> IOResult<()> {
let tcp_listener = TcpListener::bind(socket_address)?;
// Server connection accept loop.
loop {
log::info!("Waiting for a incoming connection...");
let (tcp_stream, ..) = tcp_listener.accept()?; // This is a blocking call.
// Spawn a new thread to handle this connection.
thread::spawn(|| match handle_connection(tcp_stream) {
Ok(_) => {
log::info!("Successfully closed connection to client...");
}
Err(_) => {
log::error!("Problem with client connection...");
}
});
}
}
Here are a few things to note about the server code:
- The function returns an IOResult just like the client code. There are frequent calls to the ? operator, which is shorthand for matching on the Result and returning early if there's an error. This is rudimentary error handling, and it's good enough for this pedagogical example. Note that even here, we don't use the unwrap() method, which will induce a panic if there's an error. We always use the ? operator, which will return early if there's an error. It isn't a good idea to get into the habit of using unwrap() outside of tests; these habits are hard to break once they're formed. You can even add #![warn(clippy::unwrap_in_result)] in the top level module of your project to have the compiler warn you if you use unwrap() in a function that returns a Result.
- We create the listener using TcpListener::bind(socket_address)?. This does not start a server yet. It just reserves a port on the given address, assuming that it is available. If some other process has already bound to that port, then this will return an error.
- Once we have a TcpListener instance, we can call accept() on it to start listening for incoming connections. This is a blocking call, so the main thread blocks until there is an incoming connection. Once there is one, we get a TcpStream instance, which we can use to read and write data to the client. While the main thread is waiting here for a connection to come in, it can't do anything else, like process other incoming connections.
- This is why we use thread::spawn() to create a new thread and have it handle the incoming connection. We spawn a new thread for each incoming connection. This is not a scalable solution, but it is good enough for this pedagogical example. We will learn about more scalable solutions in the Write a simple TCP chat server in Rust tutorial.

Now, let's look at the handle_connection() function that is called by the spawned thread. This is the function that handles the incoming connection from the client. And it defines our "protocol", along with the client code. We aren't using any formalized protocol like HTTP or SMTP. We are just sending bytes back and forth between the client and server, and interpreting them in a certain way, which is our informal protocol. This code is very similar to the client side code, including the loop and the BufReader and BufWriter structs, and even looking for EOF to break out of the loop. Except that we don't block on stdin() for input here.
fn handle_connection(tcp_stream: TcpStream) -> IOResult<()> {
log::info!("Start handle connection");
let reader = &mut BufReader::new(&tcp_stream);
let writer = &mut BufWriter::new(&tcp_stream);
// Process client connection loop.
loop {
let mut incoming: Vec<u8> = vec![];
// Read from reader.
let num_bytes_read = reader.read_until(b'\n', &mut incoming)?;
// Check for EOF. The stream is closed.
if num_bytes_read == 0 {
break;
}
// Process.
let outgoing = process(&incoming);
// Write to writer.
let _ = writer.write(&outgoing)?;
writer.flush()?;
// Print debug.
log::info!("-> Rx(bytes) : {:?}", &incoming);
log::info!(
"-> Rx(string): '{}', size: {} bytes",
String::from_utf8_lossy(&incoming).trim(),
incoming.len(),
);
log::info!(
"<- Tx(string): '{}', size: {} bytes",
String::from_utf8_lossy(&outgoing).trim(),
outgoing.len()
);
}
log::info!("End handle connection - connection closed");
Ok(())
}
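The num_bytes_read == 0 EOF check above is simply the contract of BufRead::read_until. Here's a small sketch that demonstrates it with an in-memory Cursor standing in for the TcpStream:

```rust
use std::io::{BufRead, BufReader, Cursor};

fn main() {
    // A "stream" that delivers one newline-terminated message, then closes.
    let mut reader = BufReader::new(Cursor::new(b"hi\n".to_vec()));

    let mut incoming = vec![];
    // First read: returns 3 ("hi\n", newline included).
    let n = reader.read_until(b'\n', &mut incoming).unwrap();
    assert_eq!(n, 3);

    incoming.clear();
    // Second read: the stream is exhausted, so read_until returns 0 => EOF.
    let n = reader.read_until(b'\n', &mut incoming).unwrap();
    assert_eq!(n, 0);
    println!("EOF reached");
}
```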
Finally, let's look at the process() function that transforms the incoming bytes into the outgoing bytes. This is where we add some fun and color and flair to our app. We colorize the incoming bytes using a lolcat effect and send the result back to the client.
use r3bl_tui::ColorWheel;
fn process(incoming: &Vec<u8>) -> Vec<u8> {
// Convert incoming to String, and remove any trailing whitespace (includes newline).
let incoming = String::from_utf8_lossy(incoming);
let incoming = incoming.trim();
// Prepare outgoing payload.
let outgoing = incoming.to_string();
// Colorize it w/ a gradient.
let outgoing = ColorWheel::lolcat_into_string(&outgoing);
// Generate outgoing response. Add newline to the end of output (so client can process it).
let outgoing = format!("{}\n", outgoing);
// Return outgoing payload.
outgoing.as_bytes().to_vec()
}
Now that you have a handle on the basics of writing a simple netcat client and server, you can read this tutorial to learn more about creating a more advanced TCP server that netcat, telnet, or PuTTY clients can connect to, in order to have multiple client apps chat with each other.
In this tutorial we will build a simple chat server using Tokio. The server will be able to handle multiple clients, and each client will be able to send messages to the server, which will then broadcast the message to all other connected clients.
- We will use tokio::net::TcpListener and tokio::net::TcpStream to create a TCP server that listens for incoming connections and handles them concurrently.
- We will use tokio::sync::broadcast to broadcast messages to all connected clients.
Read this tutorial to learn more about the basics of TCP client and server programming in Rust (without using Tokio).
Here's a video of the app that we are going to build in action.
You can find the finished source code for this tutorial here.
+--CLIENT-1-------+   +--CLIENT-2-------+   +--CLIENT-3-------+
|        |        |   |        |        |   |        |        |
+--------+--------+   +--------+--------+   +--------+--------+
         |                     |                     |
+-SERVER-+---------------------+---------------------+-----------------+
|        |                     |                     |                 |
|  handle_client_task()   handle_client_task()   handle_client_task()  |
|  +-----------------+    +-----------------+    +-----------------+   |
|  | +----+  +----+  |    | +----+  +----+  |    | +----+  +----+  |   |
|  | | TX |  | RX |  |    | | TX |  | RX |  |    | | TX |  | RX |  |   |
|  | +--+-+  +--^-+  |    | +--+-+  +--^-+  |    | +--+-+  +--^-+  |   |
|  |    |       |    |    |    |       |    |    |    |       |    |   |
|  +----+-------+----+    +----+-------+----+    +----+-------+----+   |
|       |       |              |       |              |       |        |
|  +----v-------+--------------v-------+--------------v-------+-----+  |
|  |             (TX, RX) = channel::broadcast()                    |  |
|  +----------------------------------------------------------------+  |
+-----------------------------------------------------------------------+
The server has a main function that creates a tokio::net::TcpListener and listens for incoming connections. When a new connection is received, it spawns a new task to handle the connection using tokio::spawn().
Using tokio::select!, the task tries to do the following concurrently, and waits until one of them completes:
- Read a message from the broadcast channel (shared among all connected clients).
- Read a line from the TCP stream (sent by this task's client).
When one of these completes, the other is dropped. Then the code path with the completed task executes. Then the code returns to the infinite loop, if it hasn't returned already.
A client can be any TCP client, such as telnet, nc, or PuTTY.
Let's create a new project by running cargo new --bin tcp-server-netcat-client. Then we will add the following dependencies to our Cargo.toml file.
# tokio.
tokio = { version = "1.35.1", features = ["full"] }
# stdout logging.
femme = { version = "2.2.1" }
log = { version = "0.4.20" }
# r3bl_rs_utils_core - friendly name generator.
r3bl_rs_utils_core = { version = "0.9.12" }
r3bl_tui = { version = "0.5.1" }
We will implement the following algorithm for our server in our main function:
1. Create a TcpListener and bind to an address & port.
2. Listen for incoming connections; each accepted connection produces a TCPStream.
3. Use tokio::spawn() to spawn a task to handle this client connection and its TCPStream.

In the task that handles the connection:
1. Get a BufReader & BufWriter from the TCPStream. The reader and writer allow us to read data from and write data to the client socket.
2. Use tokio::select! to concurrently:
   - Read a message from the broadcast channel (via recv()), and send it to the client (using the BufWriter to write the message).
   - Read a line from the client socket (via BufReader::read_line()):
     - Get incoming from the reader.
     - Run process(incoming) and generate outgoing. This colorizes the incoming message with a lolcat effect to generate the outgoing message.
     - Write outgoing back to this client, and broadcast the incoming message to other connected clients (via the broadcast channel).

You can find the finished source code for this tutorial here.
Here's the code for the main function, and some supporting type aliases and structs:
use std::net::SocketAddr;

use tokio::net::TcpListener;
use tokio::sync::broadcast;

pub type IOResult<T> = std::io::Result<T>;
#[derive(Debug, Clone)]
pub struct MsgType {
pub socket_addr: SocketAddr,
pub payload: String,
pub from_id: String,
}
#[tokio::main]
pub async fn main() -> IOResult<()> {
let addr = "127.0.0.1:3000";
// Start logging.
femme::start();
// Create TCP listener.
let tcp_listener = TcpListener::bind(addr).await?;
log::info!("Server is ready to accept connections on {}", addr);
// Create channel shared among all clients that connect to the server loop.
let (tx, _) = broadcast::channel::<MsgType>(10);
// Server loop.
loop {
// Accept incoming socket connections.
let (tcp_stream, socket_addr) = tcp_listener.accept().await?;
let tx = tx.clone();
tokio::spawn(async move {
let result = handle_client_task(tcp_stream, tx, socket_addr).await;
match result {
Ok(_) => {
log::info!("handle_client_task() terminated gracefully")
}
Err(error) => log::error!("handle_client_task() encountered error: {}", error),
}
});
}
}
To run the server, you can run cargo run. There are no command line arguments to pass or parse.
Since tokio::spawn sounds similar to thread::spawn, it might be easy to assume that tokio::spawn creates a new thread. This would go against the idea of even using tokio (which is all about concurrency and non blocking IO), since handling one connection per thread isn't scalable, which is what we did in the Write a simple netcat client and server in Rust tutorial.
tokio::spawn does not create a thread; it creates a Tokio task, which is a co-operatively scheduled entity that Tokio knows how to schedule on the Tokio runtime (in turn, the Tokio runtime can have as many worker threads as you want, from 1 upwards).
By using tokio::spawn, you allow the Tokio runtime to switch to another task at points in the task where it has a .await, and only those points. Your alternative, if you don't want multiple tasks, is to use things like select! and join! (and functions with select or join in their name) to have concurrent I/O in a single task.
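As a rough illustration of "wait for whichever source produces first" (the select-style idea), here's a standard-library analogue using threads and an mpsc channel. This is not Tokio code, just a sketch of the concept; the source names and timings are made up:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    // Both "sources" send into the same channel; recv() yields whichever
    // message shows up first, similar in spirit to tokio::select!.
    let (tx, rx) = mpsc::channel::<&'static str>();

    // Source 1: a slow "socket read".
    let tx_slow = tx.clone();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(200));
        let _ = tx_slow.send("socket");
    });

    // Source 2: a "broadcast channel message" that arrives right away.
    thread::spawn(move || {
        let _ = tx.send("broadcast");
    });

    // The fast source wins; the slow one is simply ignored here.
    let first = rx.recv().unwrap();
    println!("first: {}", first);
}
```

The key difference from real Tokio code is that these are OS threads that block, whereas tokio::select! polls both futures on one task and drops the loser.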
The point of spawning in Tokio is twofold: it allows tasks to run in parallel on the runtime's worker threads, and it lets each task make progress independently (alternatively, you can use select to wait for one of many options, or join to wait for all options to finish).
The handle_client_task function is where all the magic happens. Here's the code for the handle_client_task() function:
async fn handle_client_task(
mut tcp_stream: TcpStream,
tx: Sender<MsgType>,
socket_addr: SocketAddr,
) -> IOResult<()> {
log::info!("Handle socket connection from client");
let id = friendly_random_id::generate_friendly_random_id();
let mut rx = tx.subscribe();
// Set up buf reader and writer.
let (reader, writer) = tcp_stream.split();
let mut reader = BufReader::new(reader);
let mut writer = BufWriter::new(writer);
// Send welcome message to client w/ ids.
let welcome_msg_for_client =
ColorWheel::lolcat_into_string(&format!("addr: {}, id: {}\n", socket_addr, id));
writer.write(welcome_msg_for_client.as_bytes()).await?;
writer.flush().await?;
let mut incoming = String::new();
loop {
let tx = tx.clone();
tokio::select! {
// Read from broadcast channel.
result = rx.recv() => {
read_from_broadcast_channel(result, socket_addr, &mut writer, &id).await?;
}
// Read from socket.
network_read_result = reader.read_line(&mut incoming) => {
let num_bytes_read: usize = network_read_result?;
// EOF check.
if num_bytes_read == 0 {
break;
}
handle_socket_read(num_bytes_read, &id, &incoming, &mut writer, tx, socket_addr).await?;
incoming.clear();
}
}
}
Ok(())
}
- Reading from the broadcast channel is handled by read_from_broadcast_channel().
- Reading a line from the socket is handled by handle_socket_read().
Whichever task completes first, the tokio::select! block will go down that code path, and drop the other task.
Here's the code for the read_from_broadcast_channel() function:
async fn read_from_broadcast_channel(
result: Result<MsgType, RecvError>,
socket_addr: SocketAddr,
writer: &mut BufWriter<WriteHalf<'_>>,
id: &str,
) -> IOResult<()> {
match result {
Ok(it) => {
let msg: MsgType = it;
log::info!("[{}]: channel: {:?}", id, msg);
if msg.socket_addr != socket_addr {
writer.write(msg.payload.as_bytes()).await?;
writer.flush().await?;
}
}
Err(error) => {
log::error!("{:?}", error);
}
}
Ok(())
}
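The msg.socket_addr != socket_addr check is what keeps a client from receiving its own message back. Here's a standard-library sketch of that "send to everyone except the sender" fan-out idea, using plain mpsc channels instead of tokio::sync::broadcast (the helper name is made up for this illustration):

```rust
use std::sync::mpsc;

// Hand-rolled fan-out: deliver a message to every subscriber except the one
// it originated from (mirroring the msg.socket_addr != socket_addr check).
fn broadcast_except(senders: &[mpsc::Sender<String>], origin: usize, payload: &str) {
    for (i, tx) in senders.iter().enumerate() {
        if i != origin {
            let _ = tx.send(payload.to_string());
        }
    }
}

fn main() {
    let (tx_a, rx_a) = mpsc::channel();
    let (tx_b, rx_b) = mpsc::channel();
    let (tx_c, rx_c) = mpsc::channel();
    let senders = [tx_a, tx_b, tx_c];

    // Client 0 says hello; clients 1 and 2 receive it, client 0 does not.
    broadcast_except(&senders, 0, "hello\n");
    assert!(rx_a.try_recv().is_err());
    assert_eq!(rx_b.try_recv().unwrap(), "hello\n");
    assert_eq!(rx_c.try_recv().unwrap(), "hello\n");
    println!("fan-out ok");
}
```

tokio::sync::broadcast delivers to every subscriber including the sender, which is why the server code filters on socket_addr at receive time instead.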
Here's the code for the handle_socket_read() function:
async fn handle_socket_read(
num_bytes_read: usize,
id: &str,
incoming: &str,
writer: &mut BufWriter<WriteHalf<'_>>,
tx: Sender<MsgType>,
socket_addr: SocketAddr,
) -> IOResult<()> {
log::info!(
"[{}]: incoming: {}, size: {}",
id,
incoming.trim(),
num_bytes_read
);
// Process incoming -> outgoing.
let outgoing = process(incoming);
// outgoing -> Writer.
writer.write(outgoing.as_bytes()).await?;
writer.flush().await?;
// Broadcast outgoing to the channel.
let _ = tx.send(MsgType {
socket_addr,
payload: incoming.to_string(),
from_id: id.to_string(),
});
log::info!(
"[{}]: outgoing: {}, size: {}",
id,
outgoing.trim(),
num_bytes_read
);
Ok(())
}
fn process(incoming: &str) -> String {
// Remove new line from incoming.
let incoming_trimmed = incoming.trim().to_string();
// Colorize it.
let outgoing = ColorWheel::lolcat_into_string(&incoming_trimmed);
// Add new line back to outgoing.
format!("{}\n", outgoing)
}
There are few things that generate as much fear and anxiety in developers as git merge conflicts. git is very popular and very powerful, and it is a low level command line tool. And it is not very user friendly. It is meant to be orchestrated and automated using scripts, CI/CD tools, and build systems; it is extremely flexible. It is not meant to be used in an interactive manner w/ a human user at the keyboard.
This just creates an opportunity for others to come along and craft user experiences on top of git that are more use case driven. And these UXes can come in the form of GUIs, TUIs, or even conversational interfaces.
But that's not the focus of this article, which is all about the CLI experience of inducing and resolving merge conflicts. So let's get started.
Let's create a local repo from scratch and set things up so that we can predictably generate a merge conflict. Here's what we will do at a high level:
1. Create a local repo with a main branch.
2. Create a file in the main branch and add some content to it.
3. Create a develop branch based on the main branch.
4. Change the file in the develop branch.
5. Change the file in the main branch with a change that is going to conflict w/ the change in the develop branch.
6. Merge (rebase) the develop branch into the main branch.
Here's a script to get you started:
#!/usr/bin/env bash
# Create a local repo.
export TMP_REPO_DIR="$HOME/Downloads/tmp/git-merge-conflict-demo"
if [ -d "$TMP_REPO_DIR" ]; then
echo "Folder exists, recreating $TMP_REPO_DIR"
rm -rf "$TMP_REPO_DIR"
mkdir -p "$TMP_REPO_DIR"
else
echo "Folder does not exist, creating $TMP_REPO_DIR"
mkdir -p "$TMP_REPO_DIR"
fi
cd "$TMP_REPO_DIR"
git init
git checkout -b main
# Create a file in the main branch and add some content to it.
# This is the "OG change".
echo -e "This is a new feature.\n## 3. Example 3" > file.txt
git add file.txt
git commit -m "Add myexample3"
# Create a develop branch based on the main branch.
git checkout -b develop main
# Person A comes along and changes this line w/ a plus in the develop branch.
echo -e "This is a new feature.\n## 3. Example 3+" > file.txt
git add file.txt
git commit -m "Fix typo w/ plus in develop branch"
# Person B comes along and change this line w/ a minus in the main branch.
# This is going to conflict with the change in the develop branch.
git checkout main
echo -e "This is a new feature.\n## 3. Example 3-" > file.txt
git add file.txt
git commit -m "Fix typo w/ minus in main branch"
# Merge (using rebase, so no extra commit) the develop branch into the main branch.
git rebase develop
This results in a merge conflict. And when you run git diff it looks like this:
diff --cc file.txt
index 89da142,ef43c8f..0000000
--- a/file.txt
+++ b/file.txt
@@@ -1,2 -1,2 +1,8 @@@
This is a new feature.
++<<<<<<< HEAD
+## 3. Example 3+
++||||||| parent of 7c0f0e4 (Fix typo w/ - in main branch)
++## 3. Example 3
++=======
+ ## 3. Example 3-
++>>>>>>> 7c0f0e4 (Fix typo w/ - in main branch)
Let's use some pictures to understand the story of how we got here, and how to resolve this.
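Here's one concrete, self-contained way to play this out end to end on the command line: recreate the conflict from the script above in a throwaway directory, resolve it by hand (here we arbitrarily keep the minus version from main), and continue the rebase. The temp directory, identity, and commit messages are just for this demo:

```shell
set -e
# Recreate the conflict in a throwaway directory.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git checkout -q -b main
printf 'This is a new feature.\n## 3. Example 3\n' > file.txt
git add file.txt && git commit -q -m "Add example 3"
git checkout -q -b develop main
printf 'This is a new feature.\n## 3. Example 3+\n' > file.txt
git add file.txt && git commit -q -m "Fix typo w/ plus in develop branch"
git checkout -q main
printf 'This is a new feature.\n## 3. Example 3-\n' > file.txt
git add file.txt && git commit -q -m "Fix typo w/ minus in main branch"
# This stops with the merge conflict in file.txt.
git rebase develop || true
# Resolve by hand: write the content you actually want (here, the minus
# version), stage the file, and continue the rebase.
printf 'This is a new feature.\n## 3. Example 3-\n' > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue
git log --oneline
```

Note: if your resolution makes the replayed commit empty (e.g. you keep develop's content exactly), git will ask you to run git rebase --skip instead.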
As developers we tend to spend a lot of time in the terminal. It is a great place to exercise precise control over our computers. And it is a great place to automate tasks. However, there are some rough edges to this experience. For example, even though the interaction metaphor w/ CLI apps is a conversation, we have to be very precise in the language we use in this conversation. Lots of trial and error, patience and resilience are required to make it work. And it does not have to be this way.
To use a racing analogy, terminals are like race cars. They are fast and powerful, and you can exercise direct and precise control over them. But if you get things wrong, there can be consequences. Porsche is a car company that wins endurance races, and a very long time ago, they decided to make a race car that was friendly to the ergonomics of their drivers.
The thinking was that if the driver is comfortable, they will perform better, and the 24 hour race wins will be just a little bit closer within reach. Similarly, we can add some interactivity to our CLI apps to make them more ergonomic to use. And we can do this without going full TUI. We can do this in a way that is additive to the existing CLI experience. We can "tuify" our CLI apps that are built using clap.
For more information on general Rust type system design (functional approach rather than object oriented), please take a look at this paper by Will Crichton demonstrating Typed Design Patterns with Rust.
Here are some great resources to learn more about good CLI design concepts. The Rust crate clap is used by a lot of Rust apps to implement their CLI. And in this tutorial we will take a look at how to add some interactivity to these clap CLI apps using the r3bl_tuify crate.
Note that these resources are all about CLI and not TUI. There isn't very much information out there about TUIs. It is a new and evolving space.
- clap docs
- clap command and subcommand structure guidelines
The CLI guidelines above do a great job of explaining how to create a good CLI experience. However they do not cover how to add interactivity to your CLI apps. Why would we want to do this? Let's take a real example to illustrate the benefits of this next.
This example is a little "meta". The r3bl_tuify crate, which allows interactivity to be added to clap CLI apps, is available as a binary and a library. The binary, which can be used from the command line (and uses clap), uses the library to provide an interactive experience when certain arguments aren't provided on the command line.
The idea with the binary target is that you might want to quickly incorporate some interactivity into your shell scripts without getting into the Rust library. In this case, you can use the rt binary target to do that. This binary takes quite a few arguments as you might imagine. However, you don't have to supply all of them at the start.
So instead of typing this massive command at the start (where cargo run -- simply runs the binary called rt):
cat TODO.todo | cargo run -- select-from-list \
--selection-mode single \
--command-to-run-with-each-selection "echo %"
You can simply type the following shorter command and have the app prompt you for the rest of the information that it needs:
cat TODO.todo | cargo run -- select-from-list
Hereβs a video of this in action, where the app is prompting the user for two items:
`selection-mode` and `command-to-run-with-each-selection` interactively π.

The `r3bl_tuify`
app itself uses clap
to parse the command line arguments. Hereβs an overview of
what that looks like (all of it using the derive
macro approach). Hereβs a link to the
main.rs::AppArgs
.
#[derive(Debug, Parser)]
#[command(bin_name = "rt")]
#[command(about = "Easily add lightweight TUI capabilities to any CLI apps using pipes", long_about = None)]
#[command(version)]
#[command(next_line_help = true)]
#[command(arg_required_else_help(true))]
pub struct AppArgs {
#[clap(subcommand)]
command: CLICommand,
#[clap(flatten)]
global_opts: GlobalOpts,
}
#[derive(Debug, Args)]
struct GlobalOpts {
/// Print debug output to log file (log.txt)
#[arg(long, short = 'l')]
enable_logging: bool,
/// Optional maximum height of the TUI (rows)
#[arg(value_name = "height", long, short = 'r')]
tui_height: Option<usize>,
/// Optional maximum width of the TUI (columns)
#[arg(value_name = "width", long, short = 'c')]
tui_width: Option<usize>,
}
#[derive(Debug, Subcommand)]
enum CLICommand {
/// Show TUI to allow you to select one or more options from a list, piped in via stdin π
SelectFromList {
/// Would you like to select one or more items?
#[arg(value_name = "mode", long, short = 's')]
selection_mode: Option<SelectionMode>,
/// Each selected item is passed to this command as `%` and executed in your shell.
/// For eg: "echo %". Please wrap the command in quotes π‘
#[arg(value_name = "command", long, short = 'c')]
command_to_run_with_each_selection: Option<String>,
},
}
The things to note are that the `selection_mode` and `command_to_run_with_each_selection` fields of the `CLICommand::SelectFromList` enum are optional. This is where the `r3bl_tuify` crate comes in. It will prompt the user for these two fields if they are not supplied on the command line.
You can add this interactivity to your existing CLI apps programmatically using the library.
The piping option using the binary is severely limited, so the library option is strongly recommended. The binary is more of a convenience for shell scripts only on Linux.
You can see how to use the library to perform this interactivity in
main.rs::show_tui()
.
Here are two examples of adding interactivity.
Hereβs an example of adding interactivity using a list selection component. This is useful when the values that a field can take are known in advance. In this example they are, since `selection-mode` is a `clap` enum value that can only take one of the following values: `single` or `multiple`.

In this scenario, `--selection-mode` is not passed in the command line, so the app interactively prompts the user for just this piece of information. If the user still does not provide it, the app exits and prints a help message.
cat TODO.todo | cargo run -- select-from-list --command-to-run-with-each-selection "echo %"
// Handle `selection-mode` is not passed in.
let selection_mode = if let Some(selection_mode) = maybe_selection_mode {
selection_mode
} else {
let possible_values_for_selection_mode =
get_possible_values_for_subcommand_and_option(
"select-from-list",
"selection-mode",
);
print_help_for_subcommand_and_option("select-from-list", "selection-mode").ok();
let user_selection = select_from_list(
"Choose selection-mode".to_string(),
possible_values_for_selection_mode,
max_height_row_count,
max_width_col_count,
SelectionMode::Single,
);
let it = if let Some(user_selection) = user_selection {
if let Some(it) = user_selection.first() {
println!("selection-mode: {}", it);
SelectionMode::from_str(it, true).unwrap_or(SelectionMode::Single)
} else {
print_help_for("select-from-list").ok();
return;
}
} else {
print_help_for("select-from-list").ok();
return;
};
it
};
Hereβs an example of adding interactivity using a text input field. This is useful when the values
that a field can take are not known in advance. The r3bl_tuify
crate uses the reedline
crate to
do this.
Fun fact: `reedline` is the text input field (line editor) that is used in `nushell`.
In this scenario, `--command-to-run-with-each-selection` is not passed in the command line, so the app interactively prompts the user for just this piece of information. If the user still does not provide it, the app exits and prints a help message.
cat TODO.todo | cargo run -- select-from-list --selection-mode single
// Handle `command-to-run-with-each-selection` is not passed in.
let command_to_run_with_each_selection =
match maybe_command_to_run_with_each_selection {
Some(it) => it,
None => {
print_help_for_subcommand_and_option(
"select-from-list",
"command-to-run-with-each-selection",
)
.ok();
let mut line_editor = Reedline::create();
let prompt = DefaultPrompt {
left_prompt: DefaultPromptSegment::Basic(
"Enter command to run w/ each selection `%`: ".to_string(),
),
right_prompt: DefaultPromptSegment::Empty,
};
let sig = line_editor.read_line(&prompt);
match sig {
Ok(Signal::Success(buffer)) => {
if buffer.is_empty() {
print_help_for("select-from-list").ok();
return;
}
println!("Command to run w/ each selection: {}", buffer);
buffer
}
_ => {
print_help_for("select-from-list").ok();
return;
}
}
}
};
// Actually get input from the user.
let selected_items = {
let it = select_from_list(
"Select one line".to_string(),
lines,
max_height_row_count,
max_width_col_count,
selection_mode,
);
convert_user_input_into_vec_of_strings(it)
};
There are many more components that need to be added to make it easier to βtuifyβ lots of existing CLI experiences. Things like multi line editor component w/ syntax highlighting, scroll view, form input fields, and more. If you would like to contribute to this effort, it would be great to have your help.
In this tutorial we will learn how to use just
by example to
manage project specific commands. just
is like make
, but it is written in Rust, and it works
with cargo
.
Before we get started, please take a look at the
just
project README.
Letβs say you have a justfile
that looks like this, and it has a single recipe called
list
:
list:
    ls -l
And you run it by typing `just list`. It just turns around and runs `sh -c "ls -l"`. Thatβs it. So on Windows, this doesnβt work, because `sh` isnβt installed by default. You would have to install Cygwin to get `sh`, in addition to installing `just` itself.
Alternatively, you can specify that you want to use powershell
instead by
adding this to the top of the justfile
: set shell := ["powershell.exe", "-c"]
. Or you
can just run this just --shell powershell.exe --shell-arg -c list
to run just
itself
at the command prompt.
You can also supply different shell interpreters like python
or node
. And you can even
provide shebang
lines like #!/usr/bin/env python
or #!/usr/bin/env node
at the top
of each recipe.
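For instance, a recipe that runs its body with Python via a shebang line might look like this (a sketch; the recipe name and script body are made up for illustration):

```just
# Hypothetical recipe: the shebang makes just run the body with
# Python instead of the default shell.
count-files:
    #!/usr/bin/env python
    import os
    print(len(os.listdir(".")))
```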
π In order for our `just` file to work, we must first install the Rust toolchain, `just`, and `cargo-watch`:

1. Install `rustup` by following the instructions here.
2. Install `cargo-watch` using `cargo install cargo-watch`.
3. Install `just` on your system using `cargo install just`. It is available for Linux, macOS, and Windows. To get shell completions for `just` you can follow these instructions.

Note: if you install `just` using `cargo install just` or `brew install just` you will not get shell completions without doing one extra configuration step. So on Linux it is best to use `sudo apt install -y just` if you want them.

For Rust projects, we will typically have build, run, and test project-specific commands. Letβs start
with these simple ones first. The benefit of just
is that we can use it to run these commands on
any platform (Linux, Mac, Windows). And we donβt need to create OS or shell specific scripts to do
this π.
Letβs start by creating a justfile
in the root of our project. The justfile
is where we will
define our project specific commands. Here is what it looks like for the
r3bl_ansi_color
repo:
build:
    cargo build

clean:
    cargo clean

run:
    cargo run --example main
These are pretty simple commands, and the syntax is simple too: the first line is the recipe name, and the indented lines below it are the commands to run. A recipe can contain a single command or a series of commands.
Now in order to run this, we can just run just --list
in the root of our project. And it will show
us the list of commands that we can run.
$ just --list
Available recipes:
build
clean
run
Then to run a command, we can just run just <command_name>
. For example, to run the build
command, we can run just build
.
$ just build
This is pretty straightforward. You can just list the other `just` recipes inline as dependencies. Hereβs an example.
all: clean build test clippy docs rustfmt
The all
command will run the other commands in the order theyβre written.
Currently our justfile
will run on Linux and macOS. To make it run on Windows, we can
run just
itself using powershell.exe
. Here is what it looks like:
just --shell powershell.exe --shell-arg -c build
Or we can add the line set shell := ["powershell.exe", "-c"]
to the top of the
justfile
.
Alternatively, we can use the `nu` shell instead of `powershell.exe`, since it is written in Rust and available via `cargo install nu`.
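As a sketch, assuming `nu` is installed and accepts a `-c` flag the same way the tutorial uses it for PowerShell, the `justfile` configuration might look like this:

```just
# Use nushell as the interpreter for all recipes, on any platform.
set shell := ["nu", "-c"]
```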
Letβs add a new command called all
to our justfile
. This will just turn around and run the
build
and clean
commands. Here is what it looks like:
all: build clean
Now, we can also use just
in CI / CD environments. For example, here is the rust.yml
file
for this repoβs Github Actions. It runs just all
in the build
step.
The one thing to note is that we are installing `just` in the docker container before we run the `just` command. We do this by pulling in the prebuilt binary for Ubuntu as shown here: `curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to DEST`
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Install just before running it below.
      - name: Install just
        run: curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to /usr/local/bin
      # Simply run the `just all` command.
      - name: all
        run: just all
Using `just all` is a relatively straightforward way to run all our build steps (that would run in a CI / CD environment) on our local computer without installing docker, while ensuring that these same steps are carried out in the CI / CD environment.
We can also pass arguments into our commands. Letβs say that we have a command that we can use to run a single test. We can pass the name of the test into the command. Here is what it looks like:
watch-one-test test_name:
    # More info on cargo test: https://doc.rust-lang.org/cargo/commands/cargo-test.html
    # More info on cargo watch: https://github.com/watchexec/cargo-watch
    cargo watch -x check -x 'test -- --test-threads=1 --nocapture {{test_name}}' -c -q
There are a few things to note here:

- The recipe takes an argument called `test_name`. If an argument is not passed in then `just` will display an error and print a message stating that an argument is required.
- The argument is interpolated into the command in the `justfile` using `{{test_name}}`. The `{{` and `}}` enclose a variable name.

Now we can run this command by passing in the name of the test that we want to run. For example, if
we want to run the test_ansi_color
test, we can run just watch-one-test test_ansi_color
.
$ just watch-one-test test_ansi_color
Hereβs an example of a justfile
that has a lot more commands for you to look at:
r3bl_ansi_color justfile.
The just
project README has lots of information on how to use
just
. It is best to have a specific thing you are looking for before you visit this page. Here are
some interesting links inside the README:
In this tutorial we will create a new Chrome Extension using Manifest V3 that allows us to create our own names for a URL or a set of URLs. This is useful when you have a set of URLs that you want to open at once, or you want to create a name for a URL that is hard to remember. Or if you just donβt want to use bookmarks. We will also save these shortlinks to the Chrome Sync key-value pair store. This extension will also allow the user to type commands when it is activated (in its popped up state). And we will use Typescript and React to build it.
Before we get started, here are some good references to take a look at:
π Please star and fork / clone the Shortlink repo π Install the Chrome Extension π οΈ
The first thing to do is to create a new repo on GitHub using this template repo. You can do this in two ways: from the GitHub web UI, or using the GitHub CLI as shown below (after logging in with `gh auth login`).
# More info: https://cli.github.com/manual/gh_repo_create
gh repo create shortlink --public --template r3bl-org/chrome-extension-typescript-react-template
# More info: https://cli.github.com/manual/gh_repo_clone
gh repo clone shortlink
At this point we have a `shortlink` git repo on our local machine that is set up to build a Chrome Extension. You can run the following command to build it.
npm install
npm run build
This will generate a dist/
directory that contains the Chrome Extension. You can load this into
Chrome by:
1. Typing `chrome://extensions` in the URL bar.
2. Enabling βDeveloper modeβ, clicking βLoad unpackedβ, and selecting the `dist/` directory. Your extension will be loaded into Chrome.

In our extension we will ask for the minimum of
permissions from the user.
This ensures that our extension doesnβt have access to anything more than it needs. All of this is
specified in the public/manifest.json
file. Hereβs an example of what this file might look like
when we are done building our extension.
{
  "manifest_version": 3,
  "name": "R3BL Shortlink",
  "description": "Make go links",
  "version": "2.0",
  "icons": {
    "16": "icon16.png",
    "32": "icon32.png",
    "48": "icon48.png",
    "128": "icon128.png"
  },
  "action": {
    "default_title": "Click to make go link for URL",
    "default_popup": "popup.html",
    "default_icon": {
      "16": "icon16.png",
      "32": "icon32.png",
      "48": "icon48.png",
      "128": "icon128.png"
    }
  },
  "omnibox": {
    "keyword": "go"
  },
  "background": {
    "service_worker": "js/background.js"
  },
  "commands": {
    "_execute_action": {
      "suggested_key": {
        "default": "Alt+L",
        "mac": "Alt+L"
      },
      "description": "Make a go link for URL in address bar"
    }
  },
  "permissions": ["activeTab", "tabs", "storage", "clipboardWrite"]
}
Now that we have our permissions sorted, we can start by adding functionality to the extension. When we activate the extension by clicking its icon in the Chrome toolbar, or by pressing the shortcut Alt+L, the `popup.tsx` file (which is loaded by `popup.html`) will be run.
You can learn more about activating the extension and the popup in the
chrome.browserAction
docs.
This `popup.tsx` file will be the entry point for our extension. It is analogous to the `main` function in a node program, or the `App` top level component in a React app. It sets up the UI and handles the user input events (key presses).
This is what the UI looks like on Linux on my machine:
This is what the file looks like in the real Shortcut extension:
popup.tsx
. If you
go through this code, these are some of the things you will notice:
- The `main()` function just sets up the main React component `Popup` and mounts it to the DOM (a `div` with id `root`).
- There are `useEffect()` hooks which ensure that when `chrome.storage` changes, the global state is updated and the component is re-rendered. Learn more about `chrome.storage` in the API reference here. Another hook is responsible for painting the badge on the extension icon in the toolbar (when the `Shortlink[]` in the state changes).
- The `Popup` function component returns some JSX that is used to render the global state, which consists of two things: a `Shortlink[]` and a `string`. The `Shortlink[]` is used to render the list of shortcuts and the `string` is used to render the input field.
- The `handleOnChange()` and `handleEnterKey()` functions are where the user input that is typed is interpreted into a command and then executed.

There are some other files of note. Please take a look at their linked source code.
- `command.ts`: The main logic for parsing a `string` into a command is handled by this file. The `parseUserInputTextIntoCommand()` function does all the work of converting a given `string` into a `Command`, and has a very Rust βvibeβ. Please check out how this works. It makes it very easy to add or change commands in the future.
- `storage.ts`: This is where all the functions that manipulate the storage (which syncs w/ Chrome accounts) are located. Functions that handle shortlinks being deleted, added, or updated can all be found here. The Chrome storage API is async, which is why the code in this file is written the way that it is.
- `omnibox.ts`: This file works w/ `background.ts` to handle the omnibox functionality. The omnibox is the address bar in Chrome. When the user types `go` and then a space, the omnibox is activated and the user can type in a shortcut. When the user presses Enter, the `background.ts` file is run and the shortcut is expanded to the full URL.

Please read this guide on how to publish the
extension. You will have to get a developer account, and then upload the extension binaries. Thereβs
a make-distro-zip.sh
script provided in this repo that will create a zip file that you can upload
to the Chrome Web Store.
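As an aside, the Rust-style βparse user input into a `Command`β approach described for `command.ts` above can be sketched in TypeScript with a discriminated union. This is an illustrative sketch, not the actual Shortlink code; the `Command` variants and the `parseUserInput` name are made up:

```typescript
// Hypothetical Command type: a discriminated union, similar in spirit to a Rust enum.
type Command =
  | { kind: "add"; name: string; url: string }
  | { kind: "delete"; name: string }
  | { kind: "list" }
  | { kind: "unknown"; input: string };

// Parse free-form user input into exactly one Command variant.
function parseUserInput(input: string): Command {
  const tokens = input.trim().split(/\s+/);
  switch (tokens[0]) {
    case "add":
      if (tokens.length >= 3) return { kind: "add", name: tokens[1], url: tokens[2] };
      break;
    case "delete":
      if (tokens.length >= 2) return { kind: "delete", name: tokens[1] };
      break;
    case "list":
      return { kind: "list" };
  }
  return { kind: "unknown", input };
}
```

The benefit, as in Rust, is that switching on `command.kind` lets the compiler check that every variant is handled, which is what makes adding or changing commands easy.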
As part of publishing a version you have to provide justification for why you are requesting the permissions that you are. The fewer the permissions that you use, the better for the end user, and also for the review process to take less time.
π Please star and fork / clone the Shortlink repo π Install the Chrome Extension π οΈ
If you would like to get involved in an open source project and like Chrome extensions, please feel free to contribute to the Shortlink repo. There are a lot of small features that need to be added. And they can be a nice stepping stone into the world of open source contribution π.
Ockam is a suite of programming libraries, command line tools, and managed cloud services to orchestrate end-to-end encryption, mutual authentication, key management, credential management, and authorization policy enforcement β all at massive scale. Ockamβs end-to-end secure channels guarantee authenticity, integrity, and confidentiality of all data-in-motion at the application layer.
π Please star and fork / clone the Ockam repo π.
One of the key features that makes this possible is Ockam Routing. Routing allows us to create secure channels over multi-hop, multi-protocol routes which can span various network topologies (servers behind NAT firewalls with no external ports open, etc) and transport protocols (TCP, UDP, WebSockets, BLE, etc).
In this blog post we will explore the Ockam Rust Library and see how routing works in Ockam. We will work with Rust code and look at some code examples that demonstrate the simple case, and more advanced use cases.
Before we get started, letβs quickly discuss the pitfalls of using existing approaches to securing communications within applications. Security is not something that most of us think about when we are building systems and are focused on getting things working and shipping.
Traditional secure communication implementations are typically tightly coupled with transport protocols in a way that all their security is limited to the length and duration of one underlying transport connection.
In other words, using traditional secure communication implementations, you may be opening the door to losing trust in the data that your apps are working on. The authenticity, integrity, and confidentiality of that data may all be at risk.
In this blog post we will create two examples of Ockam nodes communicating with each other using Ockam Routing and Ockam Transports. We will use the Rust library to create these Ockam nodes and setup routing. Ockam Routing and transports enable other Ockam protocols to provide end-to-end guarantees like trust, security, privacy, reliable delivery, and ordering at the application layer.
An Ockam node is any running application that can communicate with other applications using various Ockam protocols like Routing, Relays, and Portals, Secure Channels, etc.
An Ockam node can be defined as any independent process which provides an API supporting the Ockam
Routing protocol. We can create Ockam nodes using the
Ockam command line interface (CLI) (ockam
command) or
using various Ockam programming libraries like our Rust and Elixir libraries. We will be using the
Rust library in this blog post.
π‘ To get started please follow this guide to get the Rust toolchain set up on your machine along with an empty project.
- The empty project is named
hello_ockam
.- This will be the starting point for all our examples in this blog post.
For our first example, we will create a simple Ockam node that will send a message over some hops (in the same node) to a worker (in the same node) that just echoes the message back. There are no TCP transports involved and all the messages are being passed back and forth inside the same node. This will give us a feel for building workers and routing at a basic level.
When a worker is started on a node, it is given one or more addresses. The node maintains a mailbox for each address and whenever a message arrives for a specific address it delivers that message to the corresponding registered worker.
π‘ For more information on creating nodes and workers using the Rust library, please refer to this guide.
We will need to create a Rust source file with a main()
program, and two other Rust source files
with two workers: Hopper
and Echoer
. We can then send a string message and see if we can get it
echoed back.
Before we begin letβs consider routing. When we send a message inside of a node it carries with it 2
metadata fields, onward_route
and return_route
, where a route
is simply a list of addresses
.
Each worker gets an address
in a node.
So, if we wanted to send a message from the app
address to the echoer
address, with 3 hops in
the middle, we can build a route like the following.
βββββββββββββββββββββββββ
β Node 1 β
βββββββββββββββββββββββββ€
β ββββββββββββββββββ β
β β Address: β β
β β 'app' β β
β βββ¬βββββββββββββ²ββ β
β βββΌβββββββββββββ΄ββ β
β β Address: β β
β β 'hopper1..3' βx3 β
β βββ¬βββββββββββββ²ββ β
β βββΌβββββββββββββ΄ββ β
β β Address: β β
β β 'echoer' β β
β ββββββββββββββββββ β
βββββββββββββββββββββββββ
Hereβs the Rust code to build this route.
/// Send a message to the echoer worker via the "hopper1", "hopper2", and "hopper3" workers.
let route = route!["hopper1", "hopper2", "hopper3", "echoer"];
Letβs add some source code to make this happen next. The first thing we will do is add one more
dependency to this empty hello_ockam
project. The
colored
crate will give us colorized console output
which will make the output from our examples so much easier to read and understand.
cargo add colored
Then we add the echoer
worker (in our hello_ockam
project) by creating a new /src/echoer.rs
file and copy / pasting the following code in it.
use colored::Colorize;
use ockam::{Context, Result, Routed, Worker};
pub struct Echoer;
/// When a worker is started on a node, it is given one or more addresses. The node
/// maintains a mailbox for each address and whenever a message arrives for a specific
/// address it delivers that message to the corresponding registered worker.
///
/// Workers can handle messages from other workers running on the same or a different
/// node. In response to a message, a worker can: make local decisions, change its
/// internal state, create more workers, or send more messages to other workers running on
/// the same or a different node.
#[ockam::worker]
impl Worker for Echoer {
type Context = Context;
type Message = String;
async fn handle_message(&mut self, ctx: &mut Context, msg: Routed<String>) -> Result<()> {
// Echo the message body back on its return_route.
let addr_str = ctx.address().to_string();
let msg_str = msg.as_body().to_string();
let new_msg_str = format!("π echo back: {}", msg);
// Formatting stdout output.
let lines = [
format!("π£ 'echoer' worker β Address: {}", addr_str.bright_yellow()),
format!(" Received: '{}'", msg_str.green()),
format!(" Sent: '{}'", new_msg_str.cyan()),
];
lines
.iter()
.for_each(|line| println!("{}", line.white().on_black()));
ctx.send(msg.return_route(), new_msg_str).await
}
}
Next we add the hopper
worker (in our hello_ockam
project) by creating a new /src/hopper.rs
file and copy / pasting the following code in it.
Note how this worker manipulates the onward_route
& return_route
fields of the message to send
it to the next hop. We will actually see this in the console output when we run this code soon.
use colored::Colorize;
use ockam::{Any, Context, Result, Routed, Worker};
pub struct Hopper;
#[ockam::worker]
impl Worker for Hopper {
type Context = Context;
type Message = Any;
/// This handle function takes any incoming message and forwards it to the next hop
/// in its onward route.
async fn handle_message(&mut self, ctx: &mut Context, msg: Routed<Any>) -> Result<()> {
// Cast the msg to a Routed<String>
let msg: Routed<String> = msg.cast()?;
let msg_str = msg.to_string().white().on_bright_black();
let addr_str = ctx.address().to_string().white().on_bright_black();
// Some type conversion.
let mut message = msg.into_local_message();
let transport_message = message.transport_mut();
// Remove my address from the onward_route.
let removed_address = transport_message.onward_route.step()?;
let removed_addr_str = removed_address
.to_string()
.white()
.on_bright_black()
.strikethrough();
// Formatting stdout output.
let lines = [
format!("π 'hopper' worker β Addr: '{}'", addr_str),
format!(" Received: '{}'", msg_str),
format!(" onward_route -> remove: '{}'", removed_addr_str),
format!(" return_route -> prepend: '{}'", addr_str),
];
lines
.iter()
.for_each(|line| println!("{}", line.black().on_yellow()));
// Insert my address at the beginning return_route.
transport_message
.return_route
.modify()
.prepend(ctx.address());
// Send the message on its onward_route.
ctx.forward(message).await
}
}
And finally letβs add a main()
to our hello_ockam
project. This will be the entry point for our
example.
π‘ When a new node starts and calls an `async` `main` function, it turns that function into a worker with an address of `app`. This makes it easy to send and receive messages from the `main` function (i.e., the `app` worker).
Create an empty file `/examples/03-routing-many-hops.rs` (note this is in the `examples/` folder and not the `src/` folder like the workers above).
use colored::Colorize;
use hello_ockam::{Echoer, Hopper};
use ockam::{node, route, Context, Result};
#[rustfmt::skip]
const HELP_TEXT: &str =r#"
βββββββββββββββββββββββββ
β Node 1 β
βββββββββββββββββββββββββ€
β ββββββββββββββββββ β
β β Address: β β
β β 'app' β β
β βββ¬βββββββββββββ²ββ β
β βββΌβββββββββββββ΄ββ β
β β Address: β β
β β 'hopper1..3' βx3 β
β βββ¬βββββββββββββ²ββ β
β βββΌβββββββββββββ΄ββ β
β β Address: β β
β β 'echoer' β β
β ββββββββββββββββββ β
βββββββββββββββββββββββββ
"#;
/// This node routes a message through many hops.
#[ockam::node]
async fn main(ctx: Context) -> Result<()> {
println!("{}", HELP_TEXT.green());
print_title(vec![
"Run a node w/ 'app', 'echoer' and 'hopper1', 'hopper2', 'hopper3' workers",
"then send a message over 3 hops",
"finally stop the node",
]);
// Create a node with default implementations.
let mut node = node(ctx);
// Start an Echoer worker at address "echoer".
node.start_worker("echoer", Echoer).await?;
// Start 3 hop workers at addresses "hopper1", "hopper2" and "hopper3".
node.start_worker("hopper1", Hopper).await?;
node.start_worker("hopper2", Hopper).await?;
node.start_worker("hopper3", Hopper).await?;
// Send a message to the echoer worker via the "hopper1", "hopper2", and "hopper3" workers.
let route = route!["hopper1", "hopper2", "hopper3", "echoer"];
let route_msg = format!("{:?}", route);
let msg = "Hello Ockam!";
node.send(route, msg.to_string()).await?;
// Wait to receive a reply and print it.
let reply = node.receive::<String>().await?;
// Formatting stdout output.
let lines = [
"π Node 1 β".to_string(),
format!(" sending: {}", msg.green()),
format!(" over route: {}", route_msg.blue()),
format!(" and receiving: '{}'", reply.purple()), // Should print "π echo back: Hello Ockam!"
format!(" then {}", "stopping".bold().red()),
];
lines
.iter()
.for_each(|line| println!("{}", line.black().on_white()));
// Stop all workers, stop the node, cleanup and return.
node.stop().await
}
fn print_title(title: Vec<&str>) {
let line = format!("π {}", title.join("\n β ").white());
println!("{}", line.black().on_bright_black())
}
Now it is time to run our program to see what it does! π
In your terminal app, run the following command. Note that OCKAM_LOG=none
is used to disable
logging output from the Ockam library. This is done to make the output of the example easier to
read.
OCKAM_LOG=none cargo run --example 03-routing-many-hops
And you should see something like the following. Our example program creates multiple hop workers (three `hopper` workers) between the `app` and the `echoer` and routes our message through them π.
π‘ This example continues from the simple example above, we are going to reuse all the dependencies and workers in this example so please make sure to complete the simple example before working on this one.
In this example, we will introduce TCP transports in between the hops. Instead of passing messages around between workers in the same node, we will spawn multiple nodes. Then we will have a few TCP transports (TCP socket client and listener combos) that will connect the nodes.
An Ockam transport is a plugin for Ockam Routing. It moves Ockam Routing messages using a specific transport protocol like TCP, UDP, WebSockets, Bluetooth, etc.
We will have three nodes:
- `node_initiator`: The first node initiates sending the message over TCP to the middle node (port `3000`).
- `node_middle`: The middle node simply forwards this message on to the last node over TCP again (port `4000` this time).
- `node_responder`: And finally the responder node receives the message and sends a reply back to the initiator node.

The following diagram depicts what we will build next. In this example all these nodes are on the same machine, but they could just as easily be nodes on different machines.
+------------------------------+
| node_initiator               |
|   Address: 'app'             |
|   TCP transport              |
|   connect to 3000 -----------+---+
+------------------------------+   | TCP
                                   v
+------------------------------+
| node_middle                  |
|   TCP transport              |
|   listening on 3000          |
|   Address:                   |
|     'forward_to_responder'   |
|   TCP transport              |
|   connect to 4000 -----------+---+
+------------------------------+   | TCP
                                   v
+------------------------------+
| node_responder               |
|   TCP transport              |
|   listening on 4000          |
|   Address: 'echoer'          |
+------------------------------+
Let's start by creating a new file `/examples/04-routing-over-two-transport-hops.rs` (in the `/examples/` folder, not the `/src/` folder). Then copy / paste the following code into that file.
use colored::Colorize;
use hello_ockam::{Echoer, Forwarder};
use ockam::{
node, route, AsyncTryClone, Context, Result, TcpConnectionOptions, TcpListenerOptions,
TcpTransportExtension,
};
#[rustfmt::skip]
const HELP_TEXT: &str = r#"
+------------------------------+
| node_initiator               |
|   Address: 'app'             |
|   TCP transport              |
|   connect to 3000 -----------+---+
+------------------------------+   | TCP
                                   v
+------------------------------+
| node_middle                  |
|   TCP transport              |
|   listening on 3000          |
|   Address:                   |
|     'forward_to_responder'   |
|   TCP transport              |
|   connect to 4000 -----------+---+
+------------------------------+   | TCP
                                   v
+------------------------------+
| node_responder               |
|   TCP transport              |
|   listening on 4000          |
|   Address: 'echoer'          |
+------------------------------+
"#;
#[ockam::node]
async fn main(ctx: Context) -> Result<()> {
println!("{}", HELP_TEXT.green());
let ctx_clone = ctx.async_try_clone().await?;
let ctx_clone_2 = ctx.async_try_clone().await?;
let mut node_responder = create_responder_node(ctx).await.unwrap();
let mut node_middle = create_middle_node(ctx_clone).await.unwrap();
create_initiator_node(ctx_clone_2).await.unwrap();
node_responder.stop().await.ok();
node_middle.stop().await.ok();
println!(
"{}",
"App finished, stopping node_responder & node_middle".red()
);
Ok(())
}
fn print_title(title: Vec<&str>) {
let line = format!("π {}", title.join("\n β ").white());
println!("{}", line.black().on_bright_black())
}
This code won't actually compile, since there are 3 functions missing from this source file. We are just adding this file first in order to stage the rest of the code we will write next.
This `main()` function creates the three nodes like we see in the diagram above, and it also stops them after the example is done running.
So let's write the function that creates the initiator node first. Copy the following into the source file we created earlier (`/examples/04-routing-over-two-transport-hops.rs`), and paste it below the existing code there:
/// This node routes a message, to a worker on a different node, over two TCP transport
/// hops.
async fn create_initiator_node(ctx: Context) -> Result<()> {
print_title(vec![
"Create node_initiator that routes a message, over 2 TCP transport hops, to 'echoer' worker on node_responder",
"stop",
]);
// Create a node with default implementations.
let mut node = node(ctx);
// Initialize the TCP transport.
let tcp_transport = node.create_tcp_transport().await?;
// Create a TCP connection to the middle node.
let connection_to_middle_node = tcp_transport
.connect("localhost:3000", TcpConnectionOptions::new())
.await?;
// Send a message to the "echoer" worker, on a different node, over two TCP hops. Wait
// to receive a reply and print it.
let route = route![connection_to_middle_node, "forward_to_responder", "echoer"];
let route_str = format!("{:?}", route);
let msg = "Hello Ockam!";
let reply = node
.send_and_receive::<String>(route, msg.to_string())
.await?;
// Formatting stdout output.
let lines = [
"π node_initiator β".to_string(),
format!(" sending: {}", msg.green()),
format!(" over route: '{}'", route_str.blue()),
format!(" and received: '{}'", reply.purple()), // Should print "π echo back: Hello Ockam!"
format!(" then {}", "stopping".bold().red()),
];
lines
.iter()
.for_each(|line| println!("{}", line.black().on_white()));
// Stop all workers, stop the node, cleanup and return.
node.stop().await
}
This (initiator) node will send a message to the responder using the following route.
let route = route![connection_to_middle_node, "forward_to_responder", "echoer"];
💡 Note the use of a mix of TCP transport routes as well as addresses for other workers. Also note that this node does not have to be aware of the full topology of the network of nodes. It just knows that it has to jump over the TCP transport `connection_to_middle_node` and then have its message routed to the `forward_to_responder` address, followed by the `echoer` address.

Let's create the middle node next, which will run the `Forwarder` worker on the address `forward_to_responder`.
Copy and paste the following into the source file we created above (`/examples/04-routing-over-two-transport-hops.rs`).

The middle node does two things:

- It listens for TCP connections on port 3000 and forwards incoming messages (arriving on port 3000) to port 4000.
- It runs the `Forwarder` worker on the address `forward_to_responder`, so that's how the initiator can reach the address specified in its route at the start of this example.

/// - Starts a TCP listener at 127.0.0.1:3000.
/// - This node creates a TCP connection to a node at 127.0.0.1:4000.
/// - Starts a forwarder worker to forward messages to 127.0.0.1:4000.
/// - Then runs forever waiting to route messages.
async fn create_middle_node(ctx: Context) -> Result<ockam::Node> {
print_title(vec![
"Create node_middle that listens on 3000 and forwards to 4000",
"wait for messages until stopped",
]);
// Create a node with default implementations.
let node = node(ctx);
// Initialize the TCP transport.
let tcp_transport = node.create_tcp_transport().await?;
// Create a TCP connection to the responder node.
let connection_to_responder = tcp_transport
.connect("127.0.0.1:4000", TcpConnectionOptions::new())
.await?;
// Create a Forwarder worker.
node.start_worker(
"forward_to_responder",
Forwarder {
address: connection_to_responder.into(),
},
)
.await?;
// Create a TCP listener and wait for incoming connections.
let listener = tcp_transport
.listen("127.0.0.1:3000", TcpListenerOptions::new())
.await?;
// Allow access to the Forwarder via TCP connections from the TCP listener.
node.flow_controls()
.add_consumer("forward_to_responder", listener.flow_control_id());
// Don't call node.stop() here so this node runs forever.
Ok(node)
}
Finally, we will create the responder node. This node will run the worker `echoer`, which actually echoes the message back to the initiator. Copy and paste the following into the source file above (`/examples/04-routing-over-two-transport-hops.rs`).

This node runs the `Echoer` worker on the address `echoer`, so that's how the initiator can reach the address specified in its route at the start of this example.

/// This node starts a TCP listener and an echoer worker. It then runs forever waiting for
/// messages.
async fn create_responder_node(ctx: Context) -> Result<ockam::Node> {
print_title(vec![
"Create node_responder that runs tcp listener on 4000 and 'echoer' worker",
"wait for messages until stopped",
]);
// Create a node with default implementations.
let node = node(ctx);
// Initialize the TCP transport.
let tcp_transport = node.create_tcp_transport().await?;
// Create an echoer worker.
node.start_worker("echoer", Echoer).await?;
// Create a TCP listener and wait for incoming connections.
let listener = tcp_transport
.listen("127.0.0.1:4000", TcpListenerOptions::new())
.await?;
// Allow access to the Echoer via TCP connections from the TCP listener.
node.flow_controls()
.add_consumer("echoer", listener.flow_control_id());
Ok(node)
}
Let's run this example to see what it does.
In your terminal app, run the following command. Note that `OCKAM_LOG=none` is used to disable logging output from the Ockam library, to make the output of the example easier to read.
cargo run --example 04-routing-over-two-transport-hops
This should produce output similar to the following. Our example program creates a route that traverses multiple nodes and TCP transports from the `app` to the `echoer`, and routes our message through them.
Ockam Routing and transports are extremely powerful and flexible. They are one of the key features that enable Ockam Secure Channels to be implemented. By layering Ockam Secure Channels and other protocols over Ockam Routing, we can provide end-to-end guarantees over arbitrary transport topologies that span many networks and clouds.
Please star and fork / clone the Ockam repo.
In a future blog post we will be covering Ockam Secure Channels and how they can be used to provide end-to-end guarantees over arbitrary transport topologies. So stay tuned!
In the meantime, here are some good jumping off points to learn more about Ockam:

- Install Ockam Command (the `ockam` command) on your computer and try to create end-to-end encrypted communication between two apps. That will give you a taste of the experience of using Ockam on the command line in addition to our Rust library.

This tutorial is a guide to parsing with nom. It covers the basics of parsing and how to use nom to parse a string into a data structure. We will cover a variety of different examples, ranging from parsing simple CSS-like syntax to a full blown Markdown parser.
This tutorial has 2 examples in it:
Please star the r3bl-open-core repo. You can get lots of useful Rust command line apps from the `r3bl-cmdr` crate in this repo. You can install them using `cargo install r3bl-cmdr`:

- `giti`: run interactive git commands with confidence in your terminal
- `edi`: edit Markdown with style in your terminal

If you would like to get involved in an open source project and like Rust crates, we welcome your contributions to the r3bl-open-core repo.
For more information on general Rust type system design (functional approach rather than object oriented), please take a look at this paper by Will Crichton demonstrating Typed Design Patterns with Rust.
nom is a huge topic. This tutorial takes a hands on approach to learning nom. However, the resources listed below are very useful for learning nom. Think of them as a reference guide and deep dive into how the nom library works.
nom is a parser combinator library for Rust. You can write small functions that parse a specific part of your input, and then combine them to build a parser that parses the whole input. nom is very efficient and fast: it does not allocate memory when parsing if it doesn't have to, and it makes it very easy for you to do the same. nom supports streaming mode and complete mode; in this tutorial and the code examples provided, we will be using complete mode.
Roughly the way it works is that you tell nom how to parse a bunch of bytes in a way that matches some pattern that is valid for your data. It will try to parse as much as it can from the input, and the rest of the input will be returned to you.
You express the pattern that you're looking for by combining parsers. nom ships with a whole bunch of these out of the box, and a huge part of learning nom is figuring out what these built-in parsers are and how to combine them to build a parser that does what you want.
Errors are a key part of being able to apply a variety of different parsers to the same input. If a parser fails, nom will return an error, and the rest of the input will be returned to you. This allows you to combine parsers in a way that you can try to parse a bunch of different things, and if one of them fails, you can try the next one. This is very useful when you are trying to parse a bunch of different things and you don't know which one you are going to get.
Let's dive into nom using a simple example of parsing hex color codes.
//! This module contains a parser that parses a hex color string into a [Color] struct.
//! The hex color string can be in the following format `#RRGGBB`.
//! For example, `#FF0000` is red.
use std::num::ParseIntError;
use nom::{bytes::complete::*, combinator::*, error::*, sequence::*, IResult, Parser};
#[derive(Debug, PartialEq)]
pub struct Color {
pub red: u8,
pub green: u8,
pub blue: u8,
}
impl Color {
pub fn new(red: u8, green: u8, blue: u8) -> Self {
Self { red, green, blue }
}
}
/// Helper functions to match and parse hex digits. These are not [Parser]
/// implementations.
mod helper_fns {
use super::*;
/// This function is used by [map_res] and it returns a [Result], not [IResult].
pub fn parse_str_to_hex_num(input: &str) -> Result<u8, std::num::ParseIntError> {
u8::from_str_radix(input, 16)
}
/// This function is used by [take_while_m_n] and as long as it returns `true`
/// items will be taken from the input.
pub fn match_is_hex_digit(c: char) -> bool {
c.is_ascii_hexdigit()
}
pub fn parse_hex_seg(input: &str) -> IResult<&str, u8> {
map_res(
take_while_m_n(2, 2, match_is_hex_digit),
parse_str_to_hex_num,
)(input)
}
}
/// These are [Parser] implementations that are used by [hex_color_no_alpha].
mod intermediate_parsers {
use super::*;
/// Call this to return function that implements the [Parser] trait.
pub fn gen_hex_seg_parser_fn<'input, E>() -> impl Parser<&'input str, u8, E>
where
E: FromExternalError<&'input str, ParseIntError> + ParseError<&'input str>,
{
map_res(
take_while_m_n(2, 2, helper_fns::match_is_hex_digit),
helper_fns::parse_str_to_hex_num,
)
}
}
/// This is the "main" function that is called by the tests.
fn hex_color_no_alpha(input: &str) -> IResult<&str, Color> {
// This tuple contains 3 ways to do the same thing.
let it = (
helper_fns::parse_hex_seg,
intermediate_parsers::gen_hex_seg_parser_fn(),
map_res(
take_while_m_n(2, 2, helper_fns::match_is_hex_digit),
helper_fns::parse_str_to_hex_num,
),
);
let (input, _) = tag("#")(input)?;
let (input, (red, green, blue)) = tuple(it)(input)?; // same as `it.parse(input)?`
Ok((input, Color { red, green, blue }))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_valid_color() {
let mut input = String::new();
input.push_str("#2F14DF");
input.push('π');
let result = dbg!(hex_color_no_alpha(&input));
let Ok((remainder, color)) = result else { panic!(); };
assert_eq!(remainder, "π");
assert_eq!(color, Color::new(47, 20, 223));
}
#[test]
fn parse_invalid_color() {
let result = dbg!(hex_color_no_alpha("π#2F14DF"));
assert!(result.is_err());
}
}
Please note that:

- Parsing the valid input `#2F14DFπ` succeeds; the trailing `π` is returned as the remainder.
- Parsing the invalid input `π#2F14DF` fails, since it does not start with `#`.

So what is going on in the source code above?

The `hex_color_no_alpha()` function is the main function that orchestrates all the other functions to parse an `input: &str` and turn it into a `(&str, Color)`.

- The `tag` combinator function is used to match the `#` character. This means that if the input doesn't start with `#`, the parser will fail (which is why `π#2F14DF` fails). It returns the remaining input after `#`. The output is `#`, which we throw away.
- A `tuple` is created that takes 3 parsers, which all do the same exact thing, but are written in 3 different ways just to demonstrate how these can be written:
  1. The `helper_fns::parse_hex_seg()` function is added to the tuple.
  2. The higher order function `intermediate_parsers::gen_hex_seg_parser_fn()` is added to the tuple.
  3. The `map_res` combinator is directly added to the tuple.
- `parse()` is called with the remaining input (thus far). This is used to parse the three hex segments. E.g. parsing `#2F14DFπ` returns `π` as the first item in the tuple (the remainder).
- Finally, the parsed red, green, and blue values are assembled into a `Color` struct.

Let's look at `helper_fns::parse_hex_seg` (the other 2 ways shown above do the same exact thing). The signature of this function tells nom that you can call the function with an input argument and it will return an `IResult<Input, Output, Error>`. This signature is the pattern that we will end up using to figure out how to chain combinators together. Here's how the `map_res` combinator is used by `parse_hex_seg()` to actually do the parsing:
- `take_while_m_n`: This combinator takes a range of characters (`2, 2`) and applies the function `match_is_hex_digit` to determine whether the `char` is a hex digit (using `is_ascii_hexdigit()` on the `char`). This is used to match a valid hex digit. It returns a `&str` slice of the matched characters, which is then passed to the next combinator.
- `parse_str_to_hex_num`: This parser is used on the string slice returned from above. It simply takes the string slice and turns it into a `Result<u8, std::num::ParseIntError>`. The error is important, since if the string slice is not a valid hex number, we want to return this error.

The key concept in nom is the `Parser` trait, which is implemented for any `FnMut` that accepts an input and returns an `IResult<Input, Output, Error>`.

- As long as you write a function with the signature `fn(input: Input) -> IResult<Input, Output, Error>`, you are good to go! You just need to call `parse()` with the `Input` value and this will kick off the parsing.
- You can call the `tuple` function directly via `nom::sequence::tuple(...)(input)?`, or you can just call the `parse()` method on the tuple, since this is an extension function on tuples provided by nom.

`IResult` is a very important type alias. It encapsulates 3 key types that are related to parsing:

- The `Input` type is the type of the input that is being parsed. For example, if you are parsing a string, then the `Input` type is `&str`.
- The `Output` type is the type of the output that is returned by the parser. For example, if you are parsing a string and you want to return a `Color` struct, then the `Output` type is `Color`.
- The `Error` type is the type of the error that is returned by the parser (e.g. `nom::error::Error`). This is very useful when you are developing your parser combinators and you run into errors and have to debug them.

After the really complicated walkthrough above, we could have just written the entire thing concisely like so:
pub fn parse_hex_seg(input: &str) -> IResult<&str, u8> {
map_res(
take_while_m_n(2, 2, |it: char| it.is_ascii_hexdigit()),
|it: &str| u8::from_str_radix(it, 16),
)(input)
}
fn hex_color_no_alpha(input: &str) -> IResult<&str, Color> {
let (input, _) = tag("#")(input)?;
let (input, (red, green, blue)) = tuple((
helper_fns::parse_hex_seg,
helper_fns::parse_hex_seg,
helper_fns::parse_hex_seg,
))(input)?;
Ok((input, Color { red, green, blue }))
}
This is a very simple example, but it shows how you can combine parsers together to create more complex parsers. You start with the simplest one first, and then build up from there.

- `parse_hex_seg()` is used to parse a single hex segment. Inside this function we call `map_res()` with the supplied `input` and simply return the result. This is a very common pattern: wrap calls to other parsers in functions and then re-use them in other parsers.
- The `hex_color_no_alpha()` function is used to parse a hex color without an alpha channel:
  1. The `tag()` combinator is used to match the `#` character.
  2. The `tuple()` combinator is used to match the 3 hex segments.
  3. The `?` operator is used to return the error if there is one.
  4. `Ok()` is used to return the parsed `Color` struct and the remaining input.

💡 You can get the source code for the Markdown parser shown in this article from the r3bl-open-core repo. Please star this repo on GitHub if you like it.
The `md_parser` module in the `r3bl-open-core` repo contains a fully functional Markdown parser (and isn't written as a test, but as a real module that you can use in your projects that need a Markdown parser). This parser supports standard Markdown syntax as well as some extensions that are added to make it work with R3BL products. It makes a great starting point to study how a relatively complex parser is written. There are lots of tests that you can follow along with to understand what the code is doing.
Here are some entry points into the codebase.

- The main function `parse_markdown()` does the parsing of a string slice into a `Document`. The tests are provided alongside the code itself, and you can follow along to see how other smaller parsers are used to build up this big one that parses the whole of the Markdown document:
  - `parser_impl_metadata`
  - `parser_impl_block`
  - `parser_impl_element`
- The types that are used to represent the Markdown document model (`Document`) and all the other intermediate types (`Fragment`, `Block`, etc.) and enums required for parsing.
This tutorial is part of a collection of tutorials on basic data structures and algorithms that are created using TypeScript. Information is depicted visually using diagrams and code snippets. This article may be useful if you are trying to get more fluency in TypeScript or need a refresher to do interview prep for software engineering roles.
The source for this tutorial can be found on GitHub. Please clone it to your computer.
In your terminal, go to the root directory of the cloned repo and run the following command to install dependencies:
npm install
All the tests are in the `main.test.ts` file. To run the jest tests continuously, run `npx jest --watchAll`.
Design and implement a class that can represent a single HTML element and its children. The class should have the following methods for tree construction and pretty printing:

- `addClass`: takes a class name and adds it to the element's list of classes
- `appendChild`: takes an HTML element and adds it to the element's list of children
- `printTree`: returns a string representation of the element and its children in HTML format

The class should also have the following methods for tree traversal:

- `findFirstChildBFS`: a method that takes an array of two elements (a tuple). The first element is the parent element selector at which to start searching for the child selector; the second element is the child selector. If we find the first child element that has the selector, we return the child element. BFS stands for breadth-first search; this simply defines the order in which we traverse the tree.
- `findFirstChildDFS`: the same as the `findFirstChildBFS` method, but DFS stands for depth-first search: we traverse the tree using a depth-first search algorithm. If you are not familiar with the difference between BFS and DFS, I have added two images below this section.
- `findDescendantBFS`: this method also takes a tuple, except the first element is the ancestor selector and the second element is the descendant selector. We traverse the tree using a breadth-first search algorithm, and if we find the first descendant element that has the selector, we return the descendant element.
- `findDescendantDFS`: same as the `findDescendantBFS` method, except we traverse the tree using DFS.

The following image shows the difference between finding the first child vs finding a descendant element.
The following image shows the depth-first search (HTML tree)
The following image shows the depth-first search
The following image shows the breadth-first search
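To make the two traversal orders in the images concrete, here is a small sketch (the `El` shape and function names are illustrative, not part of the exercise) showing how a queue yields breadth-first order and a stack yields depth-first order:

```typescript
// Minimal tree shape for illustration only (the exercise uses MyElement).
type El = { name: string; children: El[] }

const tree: El = {
  name: "a",
  children: [
    { name: "b", children: [{ name: "d", children: [] }] },
    { name: "c", children: [] },
  ],
}

// BFS: use a queue (FIFO) -> visit the tree level by level.
function bfsOrder(root: El): string[] {
  const out: string[] = []
  const queue: El[] = [root]
  while (queue.length > 0) {
    const node = queue.shift()! // Remove from the front of the array.
    out.push(node.name)
    queue.push(...node.children)
  }
  return out
}

// DFS: use a stack (LIFO) -> dive down one branch before the next.
function dfsOrder(root: El): string[] {
  const out: string[] = []
  const stack: El[] = [root]
  while (stack.length > 0) {
    const node = stack.pop()! // Remove from the end of the array.
    out.push(node.name)
    // Push children in reverse so the left-most child is popped first.
    stack.push(...[...node.children].reverse())
  }
  return out
}

console.log(bfsOrder(tree)) // [ 'a', 'b', 'c', 'd' ]
console.log(dfsOrder(tree)) // [ 'a', 'b', 'd', 'c' ]
```

The only structural difference between the two functions is `shift()` (queue) vs `pop()` (stack).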
Let's walk through our thought process on the journey to solving the problem statement.
Let's start with the output first. Our program should produce the following output (which is a string):
"
<html>
<head></head>
<body>
<h1 class="blue-theme"></h1>
<ul class="blue-theme bold-text">
<li></li>
<li></li>
</ul>
</body>
</html>
"
Data is all the pieces of information that we need to represent the element. For example, each element needs a tag name, a list of classes, and a list of child elements. We will call this type `MyElement`.

<ul class="blue-theme bold-text">
  <!-- Tag name is "ul". It is an element -->
  <!-- with two classes and two child elements ("li") -->
  <li>item 1</li>
  <!-- child 1: "li" is an element and a child of "ul" -->
  <li>item 2</li>
  <!-- child 2 (same as the above) -->
</ul>
class MyElement implements MyElementInterface {
  tagName: string
  private _children: MyElement[] = []
  private _classList: string[] = []
  // For formatting, we need a "depth" property. When adding a child to a
  // parent, we need to increment the depth of the child by 1. This is how
  // we will know how many spaces to add before the child element.
  private _depth: number = 0
}
Note: when using classes, one doesn't have to create an interface. I like to do that, so it's easy to read which properties and methods are public and which are private.
We instantiate the `MyElement` class using a constructor function: `new MyElement(tagName: string) => MyElement`.
To pretty print the HTML tree, we need to add a certain amount of spaces in front of each element. To know how many spaces, we need to track the depth of each element.
Every time a child element is nested inside the parent element (via the `appendChild` method), we take the parent element's depth and set the child element's depth to one more than that.
E.g. when we pretty print the HTML tree, each element will have `depth * 2` spaces in front of it. This means the first level element will have 0 spaces in front of it, the second level element will have 2 spaces in front of it, the third level element will have 4 spaces in front of it, etc.
appendChild = (child: MyElement) => {
  // Increment the depth of the child by taking the current
  // element's depth and adding 1 to it.
  child.depth = this.depth + 1
  // We are also tracking the parent element of each element.
  // This will help us to traverse up to the root element.
  child._parentElement = this
  this._children.push(child)
  return this
}
Method to print the HTML tree:
printTree = (): string => {
  let spaces = " ".repeat(this.depth * 2)
  const elementHasChildren = this._children.length > 0
  const hasClasses = this._classList.length > 0
  const classes = hasClasses ? ` class="${this._classList.join(" ")}"` : ""
  let string: string = ""
  if (elementHasChildren) {
    const children = this._children.map((child) => child.printTree()).join("")
    // Print the start and end tags on different lines.
    string = `${spaces}<${this.tagName}${classes}>\n${children}${spaces}</${this.tagName}>\n`
  } else {
    // Print the start and end tags on the same line.
    string = `${spaces}<${this.tagName}${classes}></${this.tagName}>\n`
  }
  return string
}
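Putting the snippets above together, here is one possible end-to-end sketch of the class. It is simplified (no interface, no `_parentElement` tracking), so details may differ from the repo's `main.ts`:

```typescript
// A condensed sketch combining the snippets above. Names mirror the
// article, but this is an illustrative simplification.
class MyElement {
  private _children: MyElement[] = []
  private _classList: string[] = []
  depth: number = 0

  constructor(public tagName: string) {}

  addClass = (className: string): MyElement => {
    this._classList.push(className)
    return this
  }

  appendChild = (child: MyElement): MyElement => {
    child.depth = this.depth + 1 // Child sits one level deeper.
    this._children.push(child)
    return this
  }

  printTree = (): string => {
    const spaces = " ".repeat(this.depth * 2)
    const classes =
      this._classList.length > 0 ? ` class="${this._classList.join(" ")}"` : ""
    if (this._children.length > 0) {
      const children = this._children.map((c) => c.printTree()).join("")
      return `${spaces}<${this.tagName}${classes}>\n${children}${spaces}</${this.tagName}>\n`
    }
    return `${spaces}<${this.tagName}${classes}></${this.tagName}>\n`
  }
}

// Build the tree from the output section above.
const html = new MyElement("html")
html.appendChild(new MyElement("head"))
const body = new MyElement("body")
html.appendChild(body)
body.appendChild(new MyElement("h1").addClass("blue-theme"))
const ul = new MyElement("ul").addClass("blue-theme").addClass("bold-text")
body.appendChild(ul)
ul.appendChild(new MyElement("li")).appendChild(new MyElement("li"))
// Prints the <html>...</html> tree shown in the output section above.
console.log(html.printTree())
```

Note that children must be appended top-down (parents before grandchildren), because this `appendChild` assigns `depth` at the moment of the call.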
Implement the four traversal methods:

- `findFirstChildBFS`
- `findFirstChildDFS`
- `findDescendantBFS`
- `findDescendantDFS`

Check out `main.ts` for how to implement those methods.
All the BFS methods use a queue data structure. In JavaScript, a queue can be an array where we add items at the end, one after another, and then remove them from the start of the array. This is called first-in-first-out (FIFO). You can think of it as a line of people waiting in the checkout line at the grocery store: the first person in line is the first person to be served.
All the DFS methods use a stack data structure. In JavaScript, a stack can be an array where we add items at the end, one after another, and then remove them from the end of the array. This is called last-in-first-out (LIFO). You can think of it as a stack of papers: the last paper you put on the stack is the first paper you take off.
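Applying the queue idea to the problem statement, `findFirstChildBFS` could be sketched like this (the `El` shape is a simplified stand-in for `MyElement`, and the function body is illustrative; the tested version lives in `main.ts`):

```typescript
// Simplified node shape for illustration; the exercise uses MyElement.
type El = { tagName: string; children: El[] }

// Breadth-first search for the parent selector; once found, return its
// first *direct* child matching the child selector.
function findFirstChildBFS(
  root: El,
  [parentSelector, childSelector]: [string, string]
): El | undefined {
  const queue: El[] = [root]
  while (queue.length > 0) {
    const current = queue.shift()! // FIFO: dequeue from the front.
    if (current.tagName === parentSelector) {
      return current.children.find((c) => c.tagName === childSelector)
    }
    queue.push(...current.children) // Visit children later, level by level.
  }
  return undefined
}
```

Because only direct children of the matching parent are inspected, a selector that matches a deeper descendant returns `undefined`; that deeper lookup is what the `findDescendant*` methods are for.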
This tutorial is part of a collection of tutorials on basic data structures and algorithms that are created using TypeScript. Information is depicted visually using diagrams and code snippets. This article may be useful if you are trying to get more fluency in TypeScript or need a refresher to do interview prep for software engineering roles.
This tutorial has 2 examples in it:
You can get the code for this and all the other tutorials in this collection from this github repo. Once you clone this repo locally, you can run the following:
cd ts-string-tokenizer
npm install
npx jest --watchAll
All the examples in this repo are written in the form of tests, and you can see them running there. If you change the code while the tests are running, the tests will re-run automatically with the new changes. This is nice to have when you are learning how the code works.
Create a function that mimics the `cd` command in the terminal.
function cd(currentPath: string, action: string): string
The function takes two arguments: the first is the current path, and the second is an action to apply to the current path to change it. The function modifies the path and returns the resulting path as a string.
| Current path | Action | Resulting path |
| -------------------- | ------------------ | ------------------------------------ |
| / | /folder | /folder |
| /folder | nestedFolder | /folder/nestedFolder |
| /folder | /home | /home |
| /folder/nestedFolder | .. | /folder |
| /folder/nestedFolder | ../folder2/folder3 | /folder/folder2/folder3 |
| /folder/nestedFolder | folder2/./folder3 | /folder/nestedFolder/folder2/folder3 |
Our `currentPath` is a string, and we need to manipulate this string and eventually return a new string. To manipulate a string easily, we chunk it into smaller strings and store those chunks in an array. For example, `"/home/user/folder"` would be turned into the array `["home", "user", "folder"]`. ← Data modelling.

We have a string `currentPath` that we know we need to manipulate. But how? Our second argument is `action`. `action` is a string that tells us how to manipulate our `currentPath`.

The `action` string command: for example, `action` can look like `"../anotherFolder"`. This means we want to go up one folder (`..`) and then go into a folder called `anotherFolder`. In order for us to understand what this `action` string means, we need to break it down into smaller strings and store those in an array as well: `"../anotherFolder"` would be turned into the array `["..", "anotherFolder"]`.
- If an array element is `"."`, ignore it and go to the next element in the array.
- If the `action` starts with `"/"`, then whatever comes after the `"/"` is an absolute path, and it is the output string we want to return.
- When chunking the `action` into smaller strings, we need to think about how to treat the `"/"` character, since we need to account for it (it tells us the new path is an absolute path). If the `action` starts with `"/"`, don't chunk the string into an array at all; just return the string as the output, since it's an absolute path.

Now we have the `action` string array. In order to apply the action to our `currentPath`, we need to go through each array element, check what it means, and then manipulate our `currentPath` accordingly. We need to define all of our `action` cases:

- If the `action` starts with `"/"`, we don't chunk the string into an array; we just return the string as the output, since it's an absolute path.
- `".."` means go up one directory, which in terms of data structures means: remove the last element of our `currentPathArray`.
- `"anotherFolder"` means go into another folder, which in terms of data structures means: add the name of the folder to the end of our `currentPathArray`.
- `"."` means do nothing: just go to the next element in the array.

Chunking the `currentPath` and `action` strings into arrays:

- Use the `split()` method and pass in the delimiter `"/"`.
- To remove elements from the front, use `shift()` (modifies the array, returns the removed element) or `splice(0, 1)` (`0` is the start index, `1` is how many elements to remove from the start index; `splice` modifies the array and returns the removed elements).
- Alternatively, build the array by hand: keep a `currentString` variable, and when you encounter a `"/"` character, push `currentString` onto `currentPathArray` and reset `currentString` to an empty string.

Manipulating `currentPathArray`:

- Given a `cdActionArray`, we loop over it, and for each element we check our cases (listed above) and manipulate our `currentPathArray` accordingly.

Done!
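Putting all of these steps together, here is a minimal sketch of the `cd` function. The empty-chunk filtering after `split()` is our own choice for handling leading and trailing `"/"` characters; the tutorial's final code may be structured differently:

```typescript
function cd(currentPath: string, action: string): string {
  // Case 1: an action starting with "/" is an absolute path,
  // so we return it as-is without chunking it into an array.
  if (action.startsWith("/")) {
    return action;
  }

  // Chunk both strings into arrays with split("/"), filtering out
  // the empty chunks produced by leading/trailing "/" characters.
  const currentPathArray = currentPath.split("/").filter((s) => s !== "");
  const cdActionArray = action.split("/").filter((s) => s !== "");

  for (const element of cdActionArray) {
    if (element === ".") {
      // "." means do nothing: go to the next element.
      continue;
    } else if (element === "..") {
      // ".." means go up one directory: remove the last element.
      currentPathArray.pop();
    } else {
      // A folder name means go into that folder: append it.
      currentPathArray.push(element);
    }
  }

  return "/" + currentPathArray.join("/");
}
```

For example, `cd("/folder/nestedFolder", "../folder2/folder3")` returns `"/folder/folder2/folder3"`, matching the table above.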
Implement an in-memory rate limiter that limits the number of tokens (requests) that can be produced in a given time period.
function acceptOrDenyRequest(
maxRequests: number,
timeWindowInMs: number,
successfulRequests: number[],
newRequestTimestamp: number
): "accept" | "deny"
| Request # | Timestamp | Result |
| ---------- | --------- | ------ |
| 1 | 1100 | accept |
| 2 | 1200 | accept |
| 3 | 1300 | deny |
| 4 | 2100 | accept |
| 5 | 2150 | deny |
| 6 | 2200 | accept |
We have built an API and made its endpoint public. This means that, in theory, millions of requests could come in to our API endpoint. This could be costly, and malicious code could potentially crash our server. We need to protect our API endpoint from abuse and from too many requests coming in at the same time. To do that, we implement a rate limiter that limits the number of tokens (requests) that can be produced in a given time period.
We are going to create two functions: an outer function and an inner function. The outer function defines all the variables and data structures we need to keep track of, and the inner function accepts those variables as arguments and checks whether there is space left in the time window to allow another request in or not.
Things we need to keep track of in our outer function:

- `successfulRequests`, to start tracking the successful request timestamps.
- `maxRequests`.
- `timeWindowInMs`.
- `newRequestTimestamp`. This is the time window endpoint. We use this timestamp to calculate the window start point, to check whether there are fewer than the maximum number of requests currently in the time window.

Now we are going to call a function `acceptOrDenyRequest`, which will accept or deny the request to hit the API endpoint. This function will take in the arguments we just defined:
maxRequests: number,
timeWindowInMs: number,
successfulRequests: number[],
newRequestTimestamp: number
We need to keep track of the request count in the time window, so we create a `requestCount` variable. When the request count is equal to the max requests, we need to deny the request, since the time window is full.
We will loop through the successfulRequests
array from back to front. We will check if the
previously successful timestamp is still within the time window. If it is, we will increment
the request count. If it is not, we will stop the loop since we know we have reached outside of
the window bounds.
Now that we know how many requests were successful in the current time window, we can compare that number with the maximum allowed in the window. If the number of successful requests is less than the maximum, we can accept the request. If it is equal to the maximum, we must deny the request, since the time window is already full of successful requests.
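The inner function described above can be sketched as follows. We assume the window is the half-open interval ending at `newRequestTimestamp`, so a timestamp exactly at the window start is excluded; with `maxRequests = 2` and `timeWindowInMs = 1000`, this reproduces the example table:

```typescript
function acceptOrDenyRequest(
  maxRequests: number,
  timeWindowInMs: number,
  successfulRequests: number[],
  newRequestTimestamp: number
): "accept" | "deny" {
  // Calculate the window start point from the window endpoint.
  const windowStart = newRequestTimestamp - timeWindowInMs;

  // Count successful requests inside the window, walking the
  // timestamps from back (newest) to front (oldest).
  let requestCount = 0;
  for (let i = successfulRequests.length - 1; i >= 0; i--) {
    if (successfulRequests[i] <= windowStart) {
      break; // everything older is outside the window bounds
    }
    requestCount++;
    if (requestCount === maxRequests) {
      return "deny"; // the window is already full
    }
  }
  return "accept";
}
```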
In the outer function we check whether the request was successful or not. If the request was successful, we push the request timestamp onto the `successfulRequests` array and return "accept". If the request was not successful, we return "deny".
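The outer function can be a small wrapper that owns the `successfulRequests` array and records accepted timestamps. A minimal sketch follows; the factory name `createRateLimiter` is our own choice, and the inner function is repeated here so the snippet is self-contained:

```typescript
function acceptOrDenyRequest(
  maxRequests: number,
  timeWindowInMs: number,
  successfulRequests: number[],
  newRequestTimestamp: number
): "accept" | "deny" {
  const windowStart = newRequestTimestamp - timeWindowInMs;
  let requestCount = 0;
  // Walk timestamps newest-to-oldest; stop once outside the window.
  for (let i = successfulRequests.length - 1; i >= 0; i--) {
    if (successfulRequests[i] <= windowStart) break;
    if (++requestCount === maxRequests) return "deny";
  }
  return "accept";
}

// Outer function: owns the state and records accepted timestamps.
function createRateLimiter(maxRequests: number, timeWindowInMs: number) {
  const successfulRequests: number[] = [];
  return (newRequestTimestamp: number): "accept" | "deny" => {
    const result = acceptOrDenyRequest(
      maxRequests,
      timeWindowInMs,
      successfulRequests,
      newRequestTimestamp
    );
    if (result === "accept") {
      // Only accepted requests count toward filling the window.
      successfulRequests.push(newRequestTimestamp);
    }
    return result;
  };
}
```

Calling `createRateLimiter(2, 1000)` and feeding it the timestamps from the example table yields accept, accept, deny, accept, deny, accept.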