Deep Dive #2: Structured Text Logging in Rust

TL;DR: This is a Rust crate that enables you to log text in Rust hundreds of times faster than traditional text logging while avoiding the need for parsing to recover values from your text log. This is not a new idea, but since we couldn't find anything similar in the Rust ecosystem, we built one for everyone, including simple Python bindings for data processing.

So, it is pretty common in your codebase to need to add some “debug” log or text log. Usually you need to do that to keep track of something, debug or simply understand what was the behavior of your codebase after the fact.

    pub fn end_of_processing(&mut self, culistid: u32) {

        // LIKE HERE
        debug!("End of processing for CL #{}", culistid);
        // [...]
        
        self.copper_lists_manager.iter_mut().for_each(|cl| {
            // [...]
            if is_top && cl.get_state() == CopperListState::DoneProcessing {
                cl.change_state(CopperListState::BeingSerialized);
                
                // OR HERE
                debug!("Logging CL #{}", cl.id);
                
                // [...]
            } 
        });
        for _ in 0..nb_done {
            let cl = self.copper_lists_manager.pop();
            // OR HERE
            debug!("Popped CL #{}", cl.unwrap().id);
        }
    }

But this can actually become an issue on runtime critical paths, and also a pain to parse after the fact, let’s deep dive!

Effect of textual logging on runtime performance

Imagine the runtime needs to interpret this in the codebase:

info!("This is the logline {} associated with the log logging and some more = {}, {}", i , i+2, i+3);

To produce this line in the textual logfile:

16:07:54 [INFO] This is the logline 0 associated with the log logging and some more = 2, 3
  1. It will need to reserve some space in a destination buffer.

  2. Take the timestamp in u64, convert it to readable text. Originally: 8 bytes for the timestamp becomes 9 bytes.

  3. then the log level: 1 byte becomes 7

  4. then it needs to copy the string “This is the logline “ to this buffer

  5. then convert the int into chrs 8 bytes will take from 1 byte to 20 bytes!

  6. etc…

So, now your robot, on its critical path is using compute cycles and memory bandwidth just for concatenating strings for the pleasure of an hypothetical human that might read it someday.

A lesson we learned the hard way at Copper: if something is done in real time by the robot that is not immediately useful for the robot, it should never have been done in the first place!

Effects of textual logging on developer’s mental health

And it doesn’t stop there, imagine your boss taps on your shoulder and now needs the statistics for all the values of i … >_<

echo "16:07:54 [INFO] This is the logline 0 associated with the log logging and some more = 2, 3" | sed -n 's/.*logline \([0-9]\+\).*=\s*\([0-9]\+\),\s*\([0-9]\+\).*/\1 \2 \3/p'
0 2 3

ok looks like it works... now for a file ...

sed -n 's/.*logline \([0-9]\+\).*=\s*\([0-9]\+\),\s*\([0-9]\+\).*/\1 \2 \3/p' logfile.txt
[...]

You proudly deliver the damned CSV and your boss will of course answer: “oh and btw can you get me this other one too?!” (┛◉Д◉)┛彡┻━┻

Those are the 2 issues structured logging is addressing.

How does structured logging address that?

We do provide a way build at compile time an index of all the strings your rust code is using in debug!() statements!

When you do…

cargo build

… on a copper project, you see the build system recording at compile time all the strings it encounters on debug statements like

info!("This is the logline {} associated with the log logging and some more = {}, {}", i , i+2, i+3);

And it stores them in a database in the output of your Rust project, here as mentioned in the compile message: target/debug/cu29_log_index

At runtime it won’t record the string but the index you see on the left! So for this line

info!("This is the logline {} associated with the log logging and some more = {}, {}", i , i+2, i+3);
  1. It will record the time of the call in u64 in ns

  2. then the index of the string ie. 4

  3. then i

  4. then i + 2

  5. then i + 3

And that’s it !

And something even better, it is using the bincode encoding: one of the densest and fastest encoding.

for example if i is less than 253, it will be encoded on 1 byte even if it is a u64 !

Is it really faster? Let’s compare

If we loop 1 million times over that logging line with the already pretty awesomely optimized simplelog (this is on a AMD Ryzen 9 7950X3D but the numbers looks similar on something like an ARMv8):

Now the structured logging…

Please not the change of units all of the sudden :)

Total time (ms)   Logged size   Wall Clock per statement
Normal 36198 101MB 36µs
Structured 64 28MB 64ns
Improvement 565x 3.6x 565x

So it is roughly 500x faster and takes 3x less space. Your mileage will vary of course depending on how much text vs structured data you will log.

Ok, Ok, How can I use your thing in my own Rust project??

In your Cargo.toml just add the dependencies:

[dependencies]
cu29-traits = "*" # please pin the revision :)
cu29-log = "*"  # The common logging interfaces
cu29-log-derive = "*"  # This is where the magic happens at compile time
cu29-log-runtime = "*"  # this is what will run in your rust program
cu29-clock = "*"  # This will give a high precision timer for your log lines

Then you will need a small hook in a build.rs to tell the compiler where to put the index at compile time, we will just put it in the out dir:

use std::env;
fn main() {
    let out_dir = env::var("OUT_DIR").unwrap();
    println!("cargo:rustc-env=OUT_DIR={}", out_dir);
}

Then that's it, you can use your new structured logger:

use std::path::PathBuf;
use cu29_clock::{CuTime, RobotClock};
use cu29_log_derive::debug;
use cu29_log_runtime::SimpleFileWriter;
use cu29_log_runtime::LoggerRuntime;

// This is where you want to put your structured log
// note this is NOT a text file
const LOG_FILE: &str = "./logfile.bin";

fn main() {
    // The reference clock for the log
    let clock = RobotClock::new();
    let writer  = SimpleFileWriter::new(&PathBuf::from(LOG_FILE)).unwrap();
    let _log_runtime = LoggerRuntime::init(clock.clone(), writer, None); // you simply give it the clock and the path
    for i in 0..1_000_000 {
        debug!("This is the logline {} associated with the log Logging and some more = {}, {}", i , i+2, i+3);
    }
    // And that's it! Note that when _log_runtime will drop it will do a final flush and close the file
}

You can run your program and it should record all those log statements in logfile.bin.

And we now have python bindings for you to directly do your data analysis, check out this example, it is trivial to reread the structured log and rebuild the strings, add that to some cloud system etc…

import cu29_export

log_file_path = "logfile.bin" # Your log file
index_file_path = target_dir / "cu29_log_index" # the index generated at compile time

log_iterator, all_strings = cu29_export.struct_log_iterator_bare(log_file_path, str(index_file_path))


for log_entry in log_iterator:
    message = all_strings[log_entry.msg_index()]
    parameters = log_entry.params()

    try:
        formatted_message = message.format(*parameters)
    except IndexError as e:
        formatted_message = f"Error formatting message: {e}"

    print(f"{log_entry.ts()}: {formatted_message}")

Running that should give you back the equivalent of cat logfile.txt

This was cool as a standalone, but how does it work integrated with Copper the robot engine?

We do have a destination attached to the unified logger that stripes this data stream into the main logger. On the python side you just have to use cu29_export.struct_log_iterator_unified instead of bare.

See here the end to end deployment of copper to follow along, we do also generate on Copper a log reader in rust directly with your project so it is mainly hidden from you except if you need to customize it.

We hope you enjoyed the deep dive and if you have questions, feel free to jump in our chatroom!

Previous
Previous

Mac OS preliminary support + how to cross compile your Copper project.

Next
Next

Deep Dive #1: Data-Oriented Design for a Robot Engine