Why use observability in async Rust? #

In synchronous systems, it’s often easy to understand the flow of execution by looking at log messages. For example, if a thread walking through a single function ends up calling a whole host of other functions, and they all emit log messages, you can often piece together what happened by looking at the log messages in order.

However, in asynchronous systems, this is challenging. When using Tokio, for example, different threads might be executing the same task, as it goes from being parked, to being woken up, to being parked again. Both temporality (when a log event happened) and causality (what caused the event) get muddled. This is where observability comes in, provided by Tokio tracing crate and OpenTelemetry (OTel) crates.

The tracing crate expands upon logging-style diagnostics by allowing libraries and applications to record structured events with additional information about temporality and causality. Unlike a log message, a Span in tracing has a beginning and end time, may be entered and exited by the flow of execution, and may exist within a nested tree of similar spans.

For representing things that occur at a single moment in time, tracing provides the complementary concept of events. Both Spans and Events are structured, with the ability to record typed data as well as textual messages.

Tokio tracing usage #

Code:

Here’s an example of using the tracing crate. Some key symbols to note are::

  • #[instrument] attribute is used to create a span.
  • Span::current().record() is used to add fields to the span (when the function is running, and this information is not known statically beforehand).
  • info!, error!, etc are used to emit log or tracing events. However, these are not used to create spans; they are used to emit events within a span.
  • #[tokio::main] is used to run the async main function.
use tracing::{info, instrument, Span};

#[tokio::main]
async fn main() {
    // Set up the tracing subscriber, so you can see the output of log events in stdout.
    // https://docs.rs/tracing-subscriber/latest/tracing_subscriber/fmt/fn.fmt.html
    tracing_subscriber::fmt()
        .with_test_writer()
        .with_env_filter("info")
        .init();

    // Call the entry point function.
    client_task::entry_point(1234).await;
}

mod client_task {
    #[instrument(name = "caller", skip_all, fields(?client_id))]
    pub async fn entry_point(client_id: usize) {
        info!("entry point");
        more_context("bar").await;
        handle_message(client_id, "foo").await;
        no_instrument("baz").await;
    }

    #[instrument(name = "callee", skip_all, fields(%message))]
    pub async fn handle_message(client_id: usize, message: String) {
        info!("handling message");
    }

    #[instrument(fields(extra))]
    pub async fn more_context(extra: &str) {
        CurrentSpan::current().record("extra", &extra);
        info!("more context");
    }

    pub async fn no_instrument(arg: &str) {
        info!("no instrument fn");
    }

}

Here are some key points to remember when using tracing from the code above:

  • You have to be careful about recording the same field multiple times, in an async call chain. In the example above, client_task::entry_point() is the entry point, and is the only function that should log the ?client_id; ? means debug. And not any other functions that it calls, like handle_message().

  • When you call entry_point(), it will call handle_message(), and the span that is generated by handle_message() will have the client_id field added to it, because of the call chain. So the output of info!("handling message") will have the client_id included in it (for free). It will also have the %message field in it; % means display. You don’t have to explicitly add either of these fields to the info!() call 🎉.

  • If you use the client_id field in multiple #[instrument..] attributes in functions (that are in the call chain), then this will show up multiple times in the log output (when using info!, debug!, etc) of the leaf function in the call chain. So when you see the same fields showing up multiple times in the output from info!, debug!, etc, then you know that you have to remove that field from the #[instrument..] attribute somewhere in the call chain (that the span covers).

  • You have to be careful about how to use [#instrument] attribute with tracing::Span::record. You have to call tracing::Span::current().record("foo","bar") in the same function where the #[instrument(fields(foo))] attribute is used.

  • When a function is called that isn’t instrumented, by another one, which is, any log events generated in the un-instrumented function will be associated with the span of the instrumented function. In the no_instrument function’s log output, you will see addition context from the entry_point function that looks something like INFO caller{client_id=1234}: no instrument fn.

Here are some helpful links to learn more about this topic:

Video of this in action in the real world #

This blog post only has a short example to illustrate how to use Rust tracing and OTel with Jaeger. To see how these ideas can be used in production code, with real-world examples, please watch the following video on the developerlife.com YouTube channel.

Here’s the code for this real world example:

Short example to illustrate the use of tracing and OTel in Rust #

Let’s look a single example (that fits in one file) that illustrates the use of tracing in Rust. You can run cargo new --lib tracing-otel to create a new library crate, and then run the following:

cargo add miette --features fancy
cargo add tracing tracing-subscriber
cargo add tokio --features full

Then you can add the following code to the src/main.rs file.

use miette::IntoDiagnostic;
use tracing::Span;

#[tokio::main]
async fn main() -> miette::Result<()> {
    let subscriber = tracing_subscriber::fmt()
        .without_time()
        .pretty()
        .with_max_level(tracing::Level::DEBUG)
        .finish();

    tracing::subscriber::set_global_default(subscriber).into_diagnostic()?;

    print_message("foo").await;

    Ok(())
}

The first part of the code sets up the tracing subscriber. In this case we are using a formatting subscriber that prints logs to the console. This subscriber is configured to not print the time of the log message, to pretty print the logs, and to print logs at the DEBUG level or higher.

When you use #[attribute] along with info!, debug!, etc, Tokio will emit log events that are associated with a span. This is the “emitter” side of the process. The other side is the “subscriber” side, which is where the logs are actually printed to the console, or sent to a file, or sent to an OTel collector service like Jaeger (using OTLP protocol over gRPC).

Tokio tracing allows us to use this simple default subscriber, or create our own custom subscribers. It even allows a subscriber to be composed from layers. We can create our own custom layers, or use some default ones (like the level filter layer).

OTel is itself a tracing layer. In the video & tcp-api-server repo, you will see how to use OTel with Jaeger, and how to configure the OTel layer with a custom layer.

Next you can add the following code to the src/main.rs file.

#[tracing::instrument(fields(arg = ?arg, client_id), ret)]
async fn print_message(arg: &str) {
    tracing::info!("log message one");
    println!("{}", prepare_message().await);

    Span::current().record("client_id", 1234);

    tracing::warn!("log message two");
}

#[tracing::instrument(ret)]
async fn prepare_message() -> String {
    tracing::debug!("preparing message");
    let it = "Hello, world!".to_string();
    tracing::debug!("message prepared");
    it
}

The print_message function is annotated with the #[tracing::instrument] attribute. This attribute creates a span for the function, and adds the arg field to the span along with the client_id field. In all the log events are emitted within the span, the arg and client_id field will be included in the log output. This additional context is provided by a span. And you don’t have to write any code to the info!, warn!, etc calls to include these fields in the log output.

You can run the code using cargo run. The code will produce the following output.

   INFO tracing_otel: log message one
    at src/main.rs:38
    in tracing_otel::print_message with arg: "foo"

  DEBUG tracing_otel: preparing message
    at src/main.rs:48
    in tracing_otel::prepare_message
    in tracing_otel::print_message with arg: "foo"

  DEBUG tracing_otel: message prepared
    at src/main.rs:50
    in tracing_otel::prepare_message
    in tracing_otel::print_message with arg: "foo"

   INFO tracing_otel: return: "Hello, world!"
    at src/main.rs:46
    in tracing_otel::prepare_message
    in tracing_otel::print_message with arg: "foo"

Hello, world!

   WARN tracing_otel: log message two
    at src/main.rs:43
    in tracing_otel::print_message with arg: "foo", client_id: 1234

   INFO tracing_otel: return: ()
    at src/main.rs:36
    in tracing_otel::print_message with arg: "foo", client_id: 1234

Beyond this simple example, to dive deeper, please check out the video and the tcp-api-server repo to get a sense of how this can all be used in a real world example that has lots of moving parts and pieces. Observability here can tell the story of what happened in the system, so it can be another way of getting an understanding of the system’s behavior.

Build with Naz video series on developerlife.com YouTube channel #

You can watch a video series on building this crate with Naz on the developerlife.com YouTube channel.

👀 Watch Rust 🦀 live coding videos on our YouTube Channel.



📦 Install our useful Rust command line apps using cargo install r3bl-cmdr (they are from the r3bl-open-core project):
  • 🐱giti: run interactive git commands with confidence in your terminal
  • 🦜edi: edit Markdown with style in your terminal

giti in action

edi in action

Related Posts