
Reading logger configuration from appsettings.json


ASP.NET Core has a completely new configuration subsystem that adds JSON, environment variables and other sources in addition to the classic XML format we’ve had since .NET 1.0. The appeal of Microsoft.Extensions.Configuration for me is that it’s easier to work with than the System.Configuration APIs.

Having the flexible JSON format is also a big plus, and it makes Serilog configuration on .NET Core quite a bit more natural than the XML <appSettings> support that ships in 1.5.

This post is about Serilog 2.0, currently in beta. Unless you’re targeting ASP.NET Core, you probably need the Serilog 1.5 package. But, all of this stuff will land later this year, so read on!

Here’s the Serilog configuration for colored console logging:

Log.Logger = new LoggerConfiguration()
  .WriteTo.ColoredConsole()
  .CreateLogger();

With the new Serilog.Settings.Configuration package, ReadFrom.Configuration() lets us pull the same information from appsettings.json.

var configuration = new ConfigurationBuilder()
  .AddJsonFile("appsettings.json")
  .Build();

Log.Logger = new LoggerConfiguration()
  .ReadFrom.Configuration(configuration)
  .CreateLogger();

Log.Information("Hello, world!");

In the JSON file itself, the relevant section is called "Serilog":

{
  "Serilog": {
    "WriteTo": ["ColoredConsole"]
  }
}

Simple, hey? As long as the sink package (in this case Serilog.Sinks.ColoredConsole) is installed, the right assemblies will be found. The names of sinks (and the arguments they might accept) are the same as the method names in the C# API, because behind the scenes the JSON configuration is translated into exactly the same method calls as you’d make directly in C#.

Let’s look at a more complete example:

{
  "Serilog": {
    "MinimumLevel": "Debug",
    "WriteTo": [
      { "Name": "ColoredConsole" },
      { "Name": "Seq", "Args": { "serverUrl": "http://localhost:5341" } }
    ],
    "Enrich": ["FromLogContext", "WithMachineName", "WithThreadId"],
    "Properties": {
        "Application": "Sample",
        "Environment": "Test"
    }
  }
}

This one specifies a minimum level, and shows how "WriteTo" works to specify more complex sink configuration. Instead of just the name of the sink some arguments are included too. Seq is one of the network-based sinks for Serilog, and WriteTo.Seq() accepts a serverUrl argument in the C# API, so that’s passed under "Args" in the JSON version.

"Enrich" works the same way as "WriteTo": the options here are the same ones that you’ll see on the Enrich configuration API in C#, and arguments can be passed in the same way if they’re needed.

As a syntactic shortcut, Enrich.WithProperty("Application", "Sample") in C# is specified using the JSON "Properties" collection. (This has the side-effect of meshing nicely with environment variable configuration support, which we get “for free” by supporting the IConfiguration interface. More on that another day…)
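
For comparison, the JSON example above corresponds roughly to the following C# configuration (assuming the console and Seq sink packages plus the standard enricher packages are installed):

Log.Logger = new LoggerConfiguration()
  .MinimumLevel.Debug()
  .WriteTo.ColoredConsole()
  .WriteTo.Seq("http://localhost:5341")
  .Enrich.FromLogContext()
  .Enrich.WithMachineName()
  .Enrich.WithThreadId()
  .Enrich.WithProperty("Application", "Sample")
  .Enrich.WithProperty("Environment", "Test")
  .CreateLogger();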

What do you think? Let me know in the comments here, or join the Gitter chat we’ve just opened up for the Serilog project.


Exploring Rust


For the past few years I’ve followed the evolution of Rust with interest. Rust is a new systems programming language (like C or C++) with greater emphasis on safety. I’ll spare you my reasons and opinions on its relevance since the Internet is replete with those :-)

Instead, I thought I’d write this brief post to share my perspective on learning Rust as a developer who’s otherwise mostly writing C#. The post assumes you’ll jump over and check out the Rust documentation where topics are necessarily summarised here.

No, this isn’t one of those “I’m leaving .NET! …for Rust!” kinds of posts. Rust and C# are different enough to be complementary, in much the same way C# and JavaScript are. Learning Rust also involves digesting concepts that have (so far) escaped C#’s all-consuming multi-paradigm expansion. Perhaps there are some ideas to bring back and apply in .NET, too?

So, you’re comfortable with C# – how does the knowledge you already have map onto this new thing? Here are my notes after spending a few days writing some Rust.

Projects, builds and compilation

The first thing you’ll probably write in Rust is not in the language itself, but at the command-line:

> cargo new --bin hello
...
> cd hello
> cargo run
...
Hello, world!

Cargo is Rust’s package manager, and combined with the repository at crates.io, it plays the role of NuGet in the Rust ecosystem.

Unlike C# and .NET, Rust came of age in the post-NPM world. The Rust standard library is smaller than the original .NET one, and feels a lot more like the upcoming .NET Core. HTTP clients, web frameworks, async I/O and JSON serialization live out in the crates.io ecosystem where they can evolve and improve through Darwinian pressures.

You’ll also probably note that NuGet.exe doesn’t have a run command like the one above. This is another parallel between Rust and .NET Core: the cargo command-line (like dotnet) provides a standardised interface to the whole toolchain including dependency management, builds, a test runner and so-on.

Testing

In C#, adding unit tests to a project is a decision point. In Rust, the decision is made for you, and every (Cargo) project gets testing built in. Tests sit right alongside the code being tested, often in the same source file.

Here’s an example:

fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[test]
fn two_plus_two_makes_five() {
    let result = add(2, 2);
    assert_eq!(result, 5);
}

Side-by-side tests are not just a feel-good novelty: including tests in the same compilation unit as the code-under-test means there are no visibility issues to worry about. No need for internal and [InternalsVisibleTo] just to drive unit tests!

Code and its tests also always move together – no more parallel folder hierarchies in test projects to keep in sync. I like this enough to miss it in C# projects already. Conditional compilation in Rust means there’s no issue with code or dependency bloat, something that might be trickier to handle cleanly in the .NET world.

The baked-in assertion macros are very basic – I’m sure there are alternatives but in my limited exploration so far I haven’t felt the need for anything more.

Syntax

Rust has an expression-oriented curly-braces syntax. If you stare at C# (or C/C++, or Java, or JavaScript) all day, Rust isn’t that surprising. It’s not a language you’d immediately recognize by its syntax, like Python, Lisp or Smalltalk.

Coming at Rust from C# you might find the name: Type declaration order, -> Type return type declarations and implicit “return” unusual, but the tiny add() function above uses all of these and the meaning is pretty clear.

Approaching the Rust syntax is very much like approaching the syntax of any new language – it takes some time. So far nothing has left me scratching my head for too long.

Move semantics, lifetimes and borrowing

Here’s where Rust begins to diverge from “nice, generic, modern language” to strike out on its own. Two of Rust’s big selling points – memory safety without garbage collection, and freedom from data races – rely on statically determining the “owner” of an object as well as possible references to it (aliases).

let c1 = vec![1, 2, 3];
let c2 = c1;

// Fails to compile - the value has been moved out of c1 into c2
println!("Count: {}", c1.len());

You get the general idea in the example above: assignments in Rust are moves by default. This takes a bit of getting used to: in C#, assignments are always copies, either of a simple value or of a reference to an object, so after an assignment both the source and destination variables are still usable.
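
For contrast, here is a minimal C# sketch of the equivalent assignment; it compiles and runs happily because only a reference is copied:

var c1 = new List<int> { 1, 2, 3 };
var c2 = c1;

// Both variables refer to the same list, so this is fine in C#
Console.WriteLine("Count: {0}", c1.Count);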

For memory safety, the advantage of move semantics is that the compiler knows exactly when a value goes “out of scope” and can insert code to deallocate it at precisely the right moment, without needing a GC to scan for other references.

Now, this is obviously a pretty limited model, so Rust has a clever system of lifetimes and borrowing to make more complicated patterns expressible. A proper treatment of the idea is well beyond what I could attempt in this post (already starting to spill beyond the initial goal of “brief”), but at the risk of inadvertently bungling the analogy I’d like to try to translate the concept of a “lifetime” into C# terms, since I haven’t seen this attempted elsewhere.

Ready? Read the ownership, borrowing, and lifetimes chapters of the Rust book? Ok, here it is.

Dispose-safety in C#

Let’s imagine a typical use of the using statement in C#:

using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    using (var index = archive.OpenFile("index.json"))
    {
        var contents = ReadBackup(archive, index);
        // Read files from contents
    }
}

It’s a bit contrived, but you can imagine the object returned from ReadBackup(archive, index) may have a reference to either archive or index.

That’s fine in the example above, since we know at the time we // Read files from contents that nothing has been disposed yet, so all is well.

How about this code?

using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;
    using (var index = archive.OpenFile("index.json"))
    {
        contents = ReadBackup(archive, index);
    }
    // Read files from contents
}

Here, we might have a problem. The lifetime represented by the second using statement has finished, and index has been disposed. If the BackupReader holds a reference to index then we’ll have trouble, probably in the form of an ObjectDisposedException when we try to use it.

How can we tell if this code is safe? In C#, we can’t – statically – without examining the code of ReadBackup() and potentially any code it calls.

The analogy is not precise, since in Rust it’s deallocating the memory rather than calling Dispose() that we’re concerned with, but otherwise, this is the safety issue Rust lifetimes solve. The Rust compiler must know, when examining this code, whether the second usage of contents is valid, because allowing it otherwise could lead to reading from memory that’s already been freed.

Before translating the example into Rust, let’s invent a hypothetical dispose-safety checker for C#, where we can represent these kinds of things in method signatures using attributes. Here’s the original:

static BackupReader ReadBackup(Archive archive, Index index)
{
    var manifestFilename = index.GetManifestFilename();
    return new BackupReader(archive, manifestFilename);
}

Ah – so the index argument is only used to determine a filename, and the result only depends on archive. Here’s the “dispose-safe C#” that expresses this:

[return: In("a")]
static BackupReader ReadBackup([In("a")] Archive archive, [In("b")] Index index)
{
    var manifestFilename = index.GetManifestFilename();
    return new BackupReader(archive, manifestFilename);
}

We’ve added some labels like [In("a")] to the arguments. These represent which using block the value came from. “Dispose-safe C#” requires these whenever a method accepts an IDisposable argument.

Annotating the return value with [In("a")] means that the result is valid in the same scope that the archive parameter is valid in – both are tagged with a.

The return value doesn’t have any relationship at all to index, so the b tag doesn’t appear anywhere else.

Back at the call-site, the checker does the mapping of labels to using blocks for us implicitly:

// 1
using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;

    // 2
    using (var index = archive.OpenIndex())
    {
        // Infers that "a" is block 1 and "b" is block 2
        contents = ReadBackup(archive, index);
    }

    // Thus, contents depends only on block 1, so this is fine
    // Read files from contents
}

Now, what would have happened if ReadBackup() had a different implementation?

[return: In("b")]
static BackupReader ReadBackup([In("a")] Archive archive, [In("b")] Index index)
{
    var files = archive.LoadAllFiles();
    return new BackupReader(files, index);
}

In this version of the method, the returned BackupReader is declared to depend on b, the lifetime of the index argument. The mapping of the using statements is the same, but the code is no longer valid and the checker will reject it:

// 1
using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;

    // 2
    using (var index = archive.OpenIndex())
    {
        // Infers that a is block 1 and b is block 2
        contents = ReadBackup(archive, index);
    }

    // Error: contents depends on block 2, which has ended
    // Read files from contents
}

By giving the checker a bit more information about the intentions of the code, it’s helpfully saved us from an ObjectDisposedException at runtime — awesome!

Lifetime variables in Rust

I hope this little “dispose-safe C#” thought experiment sheds some light now on the corresponding (memory-safe) Rust:

fn read_backup<'a,'b>(archive: &'a Archive, index: &'b Index) -> BackupReader<'a> {
    let manifest_filename = index.manifest_filename();
    BackupReader::new(archive, manifest_filename)
}

Don’t let the reuse of the angle-brackets “generic” syntax throw you. The little 'a and 'b annotations in there are lifetime variables, exactly the same as the [In("a")] and [In("b")] examples in C#.

At the call-site, there’s no special using-like statement to delineate lifetimes in Rust, which are implied by the natural boundaries of blocks:

// 1
let archive = open_zip_archive("C:\\backup.zip");
let contents: BackupReader;

// 2
{
    let index = archive.open_index();
    contents = read_backup(&archive, &index);
}

// Read files from contents

When the compatible implementation of read_backup() is used, this compiles fine; when the second implementation of read_backup() is used here, the Rust compiler rejects the code with error: index does not live long enough.

I hope my analogy hasn’t stretched too far and led you astray, as my own understanding of Rust is far from perfect. My hope is that it makes the whole lifetime concept a bit more approachable. Ownership, borrowing and lifetimes are interesting concepts that are worth spending some time on.

The stack and the heap

There’s one more place I’ve spotted where C# and Rust differ fundamentally, and that’s how memory is allocated between the stack and the heap.

C# has two families of data structures that determine allocation behaviour: structs and classes. If a type is a struct then values of that type will be allocated on the stack:

// Here, now is the actual DateTime value
var now = new DateTime(2016, 03, 23);

If a type is a class, then values of that type are allocated on the heap, and we work with it through references:

// Here, user is a pointer to some memory holding a User
var user = new User("nblumhardt");

C# also has an “escape hatch” for moving structs to the heap, in the form of ‘boxing’:

// Here, now is a pointer to memory holding the DateTime
var now = (object)new DateTime(2016, 03, 23);

Boxed structs in C# however are a bit of a second-class citizen – there’s no static type to represent them, so they end up as objects.

After the last section, you might feel some relief to hear that Rust is simpler on this point. In Rust, all values are essentially C# structs:

// Here, user is the actual User value
let user = User::new("nblumhardt");

To move a value onto the heap, as in C#, the value is boxed:

// Here, user is a Box, pointing to memory on the heap
let user = Box::new(User::new("nblumhardt"));

Boxed values are first-class in Rust, so in practice they’re more like C#’s classes in usage, enjoying the benefits of static typing for example.

The declaration and construction syntax for structs in Rust is quite a bit different from the C# equivalent, but I’ve found that getting clear about the differences in the allocation model has made the biggest difference to my understanding of Rust code.

Language machinery

Rust is a modern language. Its designers have drawn widely from existing languages and experience to exclude the worst footguns (nullable references, mutability-by-default) and include the best available tools for abstraction (closures, generics, traits, pattern matching, algebraic data types and more).

One of the refreshing aspects of Rust is that like C#, it’s a multi-paradigm language with functional features. Many of the new languages I’ve encountered in recent years are presented as functional languages, which is nice enough, but there’s a place in the future for variety.

I’d love to continue poking at different corners of Rust with this post, but writing it’s already pushing at the limits of my attention, so before I lose your interest too I’d better wrap up.

In closing

I’m having a great time exploring Rust. If you’re working in one of the currently popular industrial languages like C#, and want to learn something that’s different in more than just syntax, Rust could be what you’re looking for. Have fun!

Serilog Timings


TL;DR: I’ve just kicked off development of Serilog Timings, a tiny package that adds streamlined operation timings on top of Serilog.

Though there are a million ways to collect timing data from applications, it’s often very useful to include it in regular logs:

  • Logs are visible all the time, even in development, while there isn’t always a metrics server handy
  • Logs make it easy to trace individual timings back to the conditions and processes that caused them
  • Structured log events already carry the contextual properties around which the timings can be sliced and diced

It will come as no surprise that there are already a few options for recording timing information through Serilog. At its most basic, System.Diagnostics.Stopwatch and a handwritten log event can suffice. I’ve used SerilogMetrics, which includes timings along with counters and gauges, and I’ve also written several hand-rolled implementations for various apps.
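
A hand-rolled version typically looks something like the sketch below (not a prescribed pattern, just an illustration):

var stopwatch = Stopwatch.StartNew();

// Timed code goes here...

stopwatch.Stop();
Log.Information("Submitting payment for {ClaimId} completed in {Elapsed} ms",
    claim.Id, stopwatch.ElapsedMilliseconds);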

As I’ve done this, the picture of what I want from operation timings has become a lot clearer:

Limited noise – when timings are recorded, it’s easy to end up with a crowd of log events: one declaring the operation started and assigning it an identifier, one describing the operation with a bunch of app-specific data, and finally the event completing the operation with the elapsed time. Serilog Timings was built with the goal of compressing all of this into a single event.

Natural messages – related to the first goal, I don’t really want generic “operation started” and “operation ended” messages in the log. Logs need to be optimized for human pattern recognition – the message printed by one operation should be visually distinct from any other so that they’re easy to pick out from a large body of text.

Stable event types – furthermore, the event type (essentially a hash of the message template) should be stable for a particular operation, regardless of whether the operation succeeded, failed, or completed with a custom result. Stable (and unique) event types make it much easier to find all instances of an operation, even though “successful” ones may vastly outnumber and hence obscure the unsuccessful ones.

A compact syntax, .NET Core support, LogContext integration, custom results, … okay, I’m particular in the extreme … you kinda get the picture 🙂

I won’t promise it’s the last attempt I’ll ever make to cover all of this, but Serilog Timings ticks all of the boxes right now. Here’s how it’s used.

Once you’ve configured Serilog and installed the NuGet Package, wrap the code you want to time with a using block and describe the Operation:

using (Operation.Time("Submitting payment for {ClaimId}", claim.Id))
{
    // Timed code goes here...
}

When the using block is completed, an event will be written to the log that looks like this one:

11:20:01 [INF] Submitting payment for claim-41546 completed in 11.2 ms

The description of the operation is a regular Serilog message template, which Serilog Timings extends with " {Outcome} in {Elapsed:0.0} ms". The properties from the description, plus the Outcome and Elapsed properties, are all included on the event if it’s sent to a structured log collector.

The completion event and any other events written within the block will be tagged with an OperationId property as well, if the Serilog LogContext class is enabled.
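
Enabling the LogContext is the usual one-liner at configuration time; assuming the console sink from the earlier examples, the setup looks something like:

Log.Logger = new LoggerConfiguration()
    .Enrich.FromLogContext()   // required for OperationId tagging on nested events
    .WriteTo.ColoredConsole()
    .CreateLogger();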

Operation.Time() assumes the operation always succeeds, so the message will be identical regardless of whether or not an exception is thrown.

Operations that need to distinguish between timings for successful completion vs failures use Operation.Begin() and Complete():

using (var op = Operation.Begin("Saving invoice {Filename} to blob storage", filename))
{
    // Timed code goes here...
   
    op.Complete();
}

Successfully completing the operation will produce an event exactly like the one in the first example. If Complete() is not called on an operation started with Begin(), perhaps because an exception was thrown, then it will be logged as abandoned:

11:29:37 [WRN] Saving claim-41456 invoice INF_CX7890.pdf to blob storage abandoned in 30978.8 ms

Notice that although the Outcome property is "abandoned" instead of "completed", and the level is warning, the message template (and thus the event type) is identical to the one from the success case. This means that while completed and abandoned timings can still be differentiated, given one of them it’s always simple to find the other.

To bail out of an operation altogether and suppress the warning message, an operation can be Cancel()ed.

Complete() can optionally accept a result property name and value to be added to the event: if you’re timing a network request, for example, you can tag the event with BytesReceived; if you’re running a SQL query, you might tag it with Rows, and so on. More detailed instructions and documentation are in the project README.
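
For example, timing a query might look something like the sketch below; LoadOrders() is a hypothetical method standing in for the real work, and the exact Complete() overloads are documented in the README:

using (var op = Operation.Begin("Loading orders for {CustomerId}", customerId))
{
    var orders = LoadOrders(customerId);   // hypothetical data access call

    // Attaches a Rows property to the completion event, alongside Outcome and Elapsed
    op.Complete("Rows", orders.Count);
}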

Events generated by Serilog Timings display nicely in log files or the console. If you collect logs using SQL Server or another database you might elevate the Elapsed property to column-level for some interesting analysis opportunities (the Serilog sink for SQL Server recently acquired property-to-column mapping capability).

This screenshot from Seq shows a timing event from my test app with all of its attached information:

[Image: Data]

The stable event type (B7B24E3B shown as Type in the screenshot above) lets us drill down to just the timings of the operation we’re interested in:

[Image: ByType]

And once we’ve pushed this over into a signal we can run more interesting queries to follow the Elapsed value over time:

[Image: Aggregate]

(Seq 3.1 just added the timeseries charting shown here, so my current obsession is trawling through historical log data to see what kinds of trends I can detect 😉 .)

What do you think? I’d love to hear from you if you give Serilog Timings a try, and issue reports or PRs to the GitHub project are welcome. Have fun!

Which logging abstraction should I use?


If you’re writing .NET apps or libraries for your own use, well, you can choose and use whatever logging facility you like.

If you want to write packages for others to consume, your best bet is to avoid API designs that call for logging at all. Strong invariants, exceptions and unsurprising behavior will inform users a lot more than log messages, which they may not always collect or see.

Some kinds of libraries, especially framework-style ones that call user code rather than the other way around, can’t make this simplification. If you’re writing the next NancyFx, Topshelf, or MassTransit, you’ll face the problem of how to write log events for developers to consume.

Applications get to choose the logging subsystem, and your library has to play well with that.

[Image: LibLog1]

In the past, we’ve dealt with this through abstractions like Common.Logging. The abstraction defines an interface, and a framework-specific adapter implements this in terms of whatever logging library is in use:

[Image: LibLog2]

While logging abstractions have been a great asset to the ecosystem, they suffer some serious downsides.

First, abstractions target the lowest common denominator. Why do you choose one logging library over another? Often it’s because you value some power that its API affords. Abstractions usually have to wipe away these advantages in order to fit a variety of implementations.

Second, abstractions bring dependencies. A single-assembly or single-package library that supports a shared abstraction now becomes three or more dependencies in actual usage: the library, the abstraction library, the binding between the abstraction and the logging library, and the logging library itself. All of these need to be maintained and versioned.

Third, abstractions proliferate. Because of the first problem, someone’s always looking to create a better abstraction, and because of the second problem, this leaves applications peppered with implementations of multiple similar abstractions.

Given this unsatisfactory situation, most popular libraries end up defining an interface that they expose for the user to plug logging in. The library can define the abstraction at the level it considers “ideal”, and since the abstraction is part of the library at least some dependency proliferation is avoided.

Still, this isn’t a great place for applications to be. Libraries get to stay relatively “clean”, but now since every library potentially defines its own logging interface, the job of binding all of these and wiring the result up is a chore. Making matters worse, IntelliSense for `ILog…` in Visual Studio now presents a dozen suggested types to choose from. Every time. Ick.

Enter LibLog.

LibLog is not Serilog, so why am I writing about it? Well, it’s a bit cynical of you to ask 😉 but since you did: LibLog is different. Serilog is for apps. LibLog is built specifically for libraries, to address the pain points I mentioned above. I’m an admirer because we’ve lived with this problem for so long, I’d given up on an elegant solution ever emerging.

When you install the LibLog package, a source file is embedded directly into your project. There’s no third-party assembly or runtime dependency at all. The source file defines an ILog that is internal to your project: there’s nothing polluting your public API surface. The source file also includes some reflection-based code to sniff out which concrete logging library is actually in use, and some Expression-based code generation to efficiently connect to it.

[Image: LibLog3]

All of this happens automatically, behind the scenes, without a user of your library having to know or care how things are wired up. It’s like magic! It’s the future.

LibLog is also smart: you can use message templates with it to do structured logging a-la Serilog, Microsoft.Framework.Logging, Logary 4, and maybe even a future version of NLog.

using ClassLibrary1.Logging;

namespace ClassLibrary1
{
    public class Class1
    {
        public static void SayHello(int number)
        {
            var log = LogProvider.For<Class1>();
            log.InfoFormat("Hello, the number is {Number}", number);
        }
    }
}

Used in an application with Serilog, this event has the Number property and its value attached. If the underlying logger doesn’t support it, the message is translated into a regular .NET format string and passes through with full fidelity.

If you want something more from your logging interface – say, you want to be able to pass an integer event type with every message to cater to some new framework – it’s just a matter of tweaking the code file that’s already part of your app. Admittedly there’s got to be a common denominator, but you get to choose what it is, and it doesn’t have to be the lowest.

What are the downsides of using LibLog? Your library gains a few kilobytes of adapter code internally that needs to be carried around. The weight of the code is comparable to what an external abstraction would cost, so it’s zero-sum in reality.

Why is LibLog the future? From where we stand right now, amidst a mature ecosystem of .NET software, no one – not Microsoft, nor you, nor me – can define the one logging abstraction to rule them all. We can’t put the genie back into the bottle! We have so much code between us in libraries and applications that all of the logging libraries we have, the abstractions, adapters and other gunk, will live on practically forever. Any new abstraction, no matter how clean, will just add one more to the long list we already drag along with us.

So, how about letting your library’s users off the treadmill? LibLog for libraries, Serilog (if you like it) for apps. Couldn’t have said it better myself :-).

To get started with LibLog check out the wiki or this brief introduction from Christos Matskas.

Aggregate Queries in Seq Part 5: Execution


Part 5 was very nearly the stalling point in this blog series. I’ve got enough of the implementation done that I can see the finish line, and I’m eager to get that build out, but to really finish the story I need to fill in this installment. If this post is a little brief, please read it as a “status report” this time around 🙂

I’ve also had a bit of time now to revisit decisions made in the earlier stages of building this feature. I had some honest and valuable feedback from Michael Chandler on Twitter regarding the “SQL-like” nature of the syntax.

Upon reflection, I think it will be easier to explain how to use aggregate queries if the language simply is SQL, or a dialect thereof, anyway. So, I’ve been back to rework some of the parser, and now, in addition to the C#-style expression syntax, typical SQL operators such as =, and, or, like and not, as well as single-quoted strings, are available:

select count(*)
where Environment = 'production' and not has(@Exception)
group by time(7d), Application

There are still some questions to answer around how much this flows back the other way into typical filter expressions. On the one hand, it’d be nice if the filter syntax and the where clause syntax were identical so that translating between queries and filters is trivial. On the other hand, keeping the languages a bit tighter seems wise. For now, the syntaxes are the same; I’m going to spend some time using the SQL syntax in filters and see how it goes in practice.

Anyway, back to the topic at hand. Now we’re getting somewhere! The aggregate query parser handles the syntax, the planner can produce a query plan, and we need to turn that into a result set.

This post considers three questions:

  • What inputs are fed into the executor?
  • What does the result set look like?
  • How is the result computed?

The first turns out to be predictably easy, given the efforts expended so far to generate a plan, and the existing event storage infrastructure. Keeping things as simple as possible:

static class QueryExecutor
{
    public static QueryResult Execute(
        QueryPlan plan,
        IEventStore store,
        DateTime rangeStartUtc,
        DateTime rangeEndUtc)

Here plan is the output of the last step, store is a high-level interface to Seq’s time-ordered disk/RAM event storage, and the two range parameters specify the time slice to search. (The implementation as it stands includes a little more detail, but nothing significant.)

A QueryResult lists the column names produced, and either a list of rows, or a list of time slices that each carry a list of rows:

[Image: Results]
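
In rough C# terms (the names here are illustrative rather than the actual Seq types), the shape is something like:

class QueryResult
{
    // Column names produced by the select clause
    public string[] Columns;

    // Populated for queries without a time() grouping
    public List<object[]> Rows;

    // Populated when the query groups by time(); each slice carries its own rows
    public List<TimeSlice> Slices;
}

class TimeSlice
{
    public DateTime Time;
    public List<object[]> Rows;
}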

I decided for now to keep the concept of “time slice” or sample separate (time could simply have been another column in the rowset) because it makes for a friendlier API. I’m not sure if this decision will stick, since tabular result sets are ubiquitously popular, but when “series” are added as a first-class concept it is likely they’ll have their own more optimal representation too.

In between these two things – an input query plan and an output result – magic happens. No, just kidding actually. It’s funny how, once you start implementing something, the magic is stripped away and things that initially seem impenetrably complex are made up of simple components.

The core of the query executor inspects events one by one and feeds the matching ones into a data structure carrying the state of the computation:

[Image: AggregationState]

First, the group that the event belongs to is determined by calculating each grouping expression and creating a group key. Against this, state is stored for each aggregate column being computed. The subclasses of Aggregation are themselves quite simple, like count():

class CountAggregation : Aggregation
{
    long _count;

    public override void Update(object value)
    {
        if (value == null)
            return;

        ++_count;
    }

    public override object Calculate()
    {
        return (decimal)_count;
    }
}

The value to be aggregated is passed to the Update() method (in the case of count(*) the * evaluates to a non-null constant) and the aggregation adds this to the internal state.

Once all of the events have been processed for a time range, Calculate() is used to determine the final value of the column. It’s not hard to map count(), sum(), min(), max() and so-on to this model.
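
For example, a hypothetical sum() aggregation following the same pattern might look like:

class SumAggregation : Aggregation
{
    decimal _sum;

    public override void Update(object value)
    {
        if (value == null || !(value is decimal))
            return;

        _sum += (decimal)value;
    }

    public override object Calculate()
    {
        return _sum;
    }
}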

Things are a little trickier for aggregates that themselves produce a rowset, like distinct(), but the basic approach is the same. (Considering all aggregate operators as producing a rowset would also work and produce a more general internal API, but the number of object[] allocations gets a little out of hand.)

Once results have been computed for a time slice, it’s possible to iterate over the groups that were created and output rows in the shape of the QueryResult structure shown earlier.

There’s obviously a lot of room for optimisation, but the goals of a feature spike are to “make it work” ahead of making it work fast, so this is where things will sit while I move on towards the UI.

One more thing is nagging at me here. How do we prevent an over-eager query from swamping the server? Eventually I’d like to be able to reject “silly” queries at the planning stage, but for now it’s got to be the job of the query executor to provide timeouts and cancellation. These are sitting in a Trello card waiting for attention…

In the next post (Part 6!) we’ll look more closely at Seq’s API and finally see some queries in action. Until then, happy logging!

Read Part 6: Serving Data

Serilog 2.0 Release Plans


.NET Core and ASP.NET Core are expected to release-to-web by the end of June 2016. There have been a few related announcements recently, surrounding the split of the runtime release (end of June) from the tooling release and return to MSBuild (tied to a later Visual Studio release).

This post is just a quick note to clarify what’s happening with Serilog.

What’s in 2.0?

As many of you noticed, we made an early leap into the DNX tooling and related libraries. Version 1.5 of Serilog, the most recent master release, has been sitting stable for quite a while now, while our long-lived dev branch has been tracking .NET Core for the last year, collecting changes and improvements in readiness for the new .NET.

Here’s what we’ve been focusing on in Serilog 2.0.

  • Packaging — the core Serilog package is now a lightweight, low-dependency package with core types like Log, ILogger and LoggerConfiguration; sinks, enrichers, and facilities like <appSettings> support have moved out.
  • Portability — much more of Serilog now works across the board; for example, we can support LogContext on all target platforms; there’s no more Serilog.FullNetFx.dll.
  • Usability — a few rough spots in the API, especially concerning logger shut-down & disposal, have been cleaned up.
  • Performance — for example Serilog.Sinks.PeriodicBatching now accepts and enqueues writes without lock contention.
  • Integration — Serilog.Extensions.Logging is a high-fidelity logging provider for ASP.NET Core that gets structured log data from your apps and the platform out to a huge variety of mature and supported sinks of all kinds – databases, log servers, the cloud, rolling files, … (see the sketch below)
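
Hooking the provider into an ASP.NET Core application is a one-liner; a minimal sketch, assuming the Serilog.Extensions.Logging package and the AddSerilog() extension method it provides:

public void Configure(IApplicationBuilder app, ILoggerFactory loggerFactory)
{
    // Forward framework and application log events to the configured Serilog pipeline
    loggerFactory.AddSerilog();

    // ...
}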

There are plenty more goodies in there. (One of my favourites is this PR which landed support for fixed-length level names. In 2.0 you can use {Level,3:u} in output templates to get output like DBG, INF, WRN and so-on.) When we flick the switch on the final version we’ll publish full release notes.

What do the recent announcements mean for Serilog?

By far the biggest change has been the packaging work. Serilog used to include its own file, console and trace sinks, for example. These now live in their own repositories and packages: Serilog.Sinks.RollingFile lives here on GitHub, for example. This doesn’t just reduce dependency bloat – it’s also an important milestone in scaling out the Serilog project to support growth and community contributions (like this excellent alternative rolling file package which can now stand on an equal footing with the original).

Re-packaging Serilog has meant a lot of grappling with the in-progress tooling. DNX + DNU (soon dotnet) and project.json have brought game-changing improvements for us – especially the package building capabilities that mean a single build can compile (and then test!) for multiple target platforms. The earlier build system we’ve used to emulate this in Serilog 1.5 has been miserable. I’m sad to see project.json go, but happy that the major benefits are being carried through to the final version (even if we’re stuck with XML). I’m not thrilled about going through another round of churn, but I trust the .NET and ASP.NET engineers in their commitment to deliver a fully-baked, high-quality tooling release so however we have to get there – we’ll get there!

Roadmap to 2.0

Anyway, I digress 🙂 . Here’s how things will go for Serilog:

  1. We’re going all-in on RC2. It’s going to be a close, if not identical, reflection of the RTM runtime, so once it drops the Serilog core and as many sinks as possible will be ported. This also means nailing down the target framework monikers that the packages support, and paring back our declared dependencies to the bare minimum. The Serilog package will move to 2.0.0-rc-* versioning to reflect this.
  2. Serilog 2.0 will RTW as soon as .NET Core does, in June. The Serilog package will drop first (ASAP) followed by the supported sinks in roughly the order of usage.
  3. We’ll continue building/shipping Serilog with the RC2 version and later previews of the project tooling until the Visual Studio release when we can move forwards with it. From RC2, we’ll treat the tooling as if it’s RTM, and make sure Serilog and its sink/enricher/configuration packages work flawlessly with it, as well as the released Visual Studio 2013 and 2015 tools.

The Serilog contributors and community have put in a lot of work to get us to where we are today. Serilog’s the most mature and full-featured logging solution you’ll find on .NET Core, thanks to a huge effort tracking the platform changes to date. When .NET Core hits RTM, Serilog is going to be there, as we’ve always planned! I can’t wait for June.

When should I upgrade?

RC2. We’ve treated RC1 as a “best-effort” target so far, and supporting Serilog 2.0.0-beta-* has focused mostly on unblocking early adopters of the new runtime. From RC2 I’ll personally consider .NET Core a supported platform and pursue any Serilog issues on it as if they’re in an RTM. I think the promise of a .NET Core/ASP.NET Core RTW shortly afterwards will mean a lot of Serilog users and contributors will be moving forwards, so Serilog 2.0.0-rc-* is going with us.

How can I help?

If your favorite Serilog sink isn’t ported to .NET Core yet, or isn’t ready at RTW, please consider how you can chip some effort in – most porting work is a straightforward project file change and recompile. There are a lot of people in the Serilog community ready to help out; the best way to get started is to jump onto Gitter and mention you’re interested in sending a PR.

Structured logging concepts in .NET Series (1)


Between now and the Serilog 2.0 release that’s due sometime in July, I’ve decided to go back to basics and write some posts on the fundamentals of modern structured logging in .NET.

In the complex, distributed, asynchronous systems being built today, answering questions about behavior – what caused this issue? – is a horribly tedious thing to have to do without the right instrumentation. I hope this series will be useful for anyone wanting to build more manageable .NET apps with the new crop of frameworks and tools.

The examples use Serilog, but most of the same concepts appear in other libraries, so where possible I’ll call out the alternative APIs.

A tentative plan of attack, following this post:

  • Events and levels
  • Message templates and properties
  • Event types
  • Context and correlation
  • Serialized data
  • Collection and storage

But now, let’s get the ball rolling!

What is structured logging?

Approaches to application diagnostic logging can be roughly divided into text-based logging and structured, or event-based, logging.

Text logging, sometimes also called ‘printf debugging’ after the C printf() family of functions, is the first one most programmers encounter. As your code executes, it prints lines of text out to the terminal that describe what’s going on.

12:23:10 [INF] Service "loyalty-card-printer" starting up on host 8fd342hkg22u
12:23:11 [INF] Listening on http://8fd342hkg22u.example.com:1234
12:23:20 [INF] Card replacement request received for customer-109972
12:40:55 [INF] Card replacement request received for customer-98048

Because text logs are written as a narrative in natural language, they’re nice and easy to read.

Text is a very expressive format, and tools including full-text indexing and regular expressions can do a lot with it, but answering anything non-trivial with text log data can still take some painstaking work.

Structured logging is an umbrella term for alternatives focusing on easier searching and analysis at scale. A structured log event, rather than being a line of text, is conceptually a timestamped set of name/value properties.

Structured logging systems separate the way the event is captured (the logging API) from the way that it is transported (perhaps JSON) and how it is rendered in a user interface or file.

A rough translation of the sequence above to structured events might be conceptually like:

time=12:23:10, level=INF, service=loyalty-card-printer, host=8fd342hkg22u, activity=started
time=12:23:11, level=INF, endpoint=http://8fd342hkg22u.example.com:1234, activity=listening
time=12:23:20, level=INF, customer=customer-109972, activity=replacement
time=12:40:55, level=INF, customer=customer-98048, activity=replacement

Structured logs are easier to navigate because the properties in the event are first-class: a log server can return results for a query like activity == "replacement" && customer == "customer-98048" without any configuration to describe how the events should be parsed.

The line between structured and text logs is blurry:

  • Text logs can provide some structure by encoding well-known fields like the timestamp, logging level and thread id at the start of each line, and by including name=value formatted fields in the message itself
  • Structured logs sometimes look a lot like text logs when formatted out to a flat file or the terminal, and often include a message field that is itself free text

But, these similarities are skin deep. Structured logging and text logging require different mindsets in order to use each well.

Historical barriers to adoption

Structured logging isn’t always 100% rainbows-and-unicorns-awesome. My guess is that two warts have kept adoption of structured logging in second place.

First, structured logging implementations can be more labor-intensive at development time than text-based alternatives.

In .NET, the EventSource API illustrates this well. Writing a structured log event with EventSource requires an event definition in the form of a C# method in a separate class, along with the line of code to actually record the event:

[Event(123, Message = "Card replacement request received for: {0}")]
internal void CardReplacementRequested(string customerId)
{
  if (this.IsEnabled()) this.WriteEvent(123, customerId);
}

// Elsewhere:
// eventSource.CardReplacementRequested(customerId);

Switching between files and creating an annotated method just to record a single log event is a tough cost to bear, in comparison with the single line of code to record a typical text log event.

Second, many structured logging implementations produce output that is harder to visually scan. This phenomenon is exaggerated in the earlier example – there are nicer ways to format the same structured events – but it’s still very hard to compete with natural language for readability.

The topic of this series is the most recent structured logging approach for .NET, specifically designed to avoid these issues.

Logging with message templates

Since the technique was introduced by Serilog a few years ago, many .NET applications have adopted a style of logging that captures a structured event, as well as a human-friendly text representation of it, in the same line of code.

// var customerId = "customer-109972";
Log.Information("Card replacement request received for {CustomerId}", customerId);

The description of the event, called the message template, uses named ‘holes’ instead of the anonymous {0} and {1} numeric placeholders found in .NET format strings.

Under the hood, instead of immediately substituting the value of customerId into the template to create a text event, the logging library captures the template and the argument values individually. By matching the named holes in the template with the positional arguments, the customerId value is associated with a property called CustomerId on the event.

You can imagine the resulting event in JSON format might look like:

{
    "time": "2016-06-01T02:23:10Z",
    "level": "INF",
    "template": " Card replacement request received for {CustomerId}",
    "properties": {
        "customerId": " customer-109972"
    }
}

The event can be rendered as text using the template and properties, or stored and searched as structured data, achieving the best of both worlds: events that are optimally human-friendly and optimally machine-readable.

At the time of writing, you can use message templates in .NET with Serilog, Microsoft.Extensions.Logging (ASP.NET Core) and Logary.

(I actually heard the term “modern” used to describe this style of structured logging via Jason Roberts’ Pluralsight course on Serilog and Seq, which is a couple of versions behind but still well worth checking out if you have a subscription.)

It sounds like I’ve told the whole story of modern structured logging already with this post! One line of code is enough to convey the basic idea, but there really is more to know if you want to use it effectively. I’m aiming to fill two installments of this series each week, so if all goes well you won’t have to wait too long for the next post on events and levels….


Events and levels – structured logging concepts in .NET (2)


Introductions are tough, I’m glad that’s out of the way. This post is going to dig straight into events and levels. If concepts like events, timestamps and logging levels are old news for you, this might be one post in this series to skip. Saying a few words about these things up front will make it easier to dig into the nitty-gritty later though, with everyone on the same page.

We instrument complex applications with logging so that when things go wrong (or when we need to verify that things are going right!) there’s a trail of events that have taken place in the system, for us to inspect.

Thinking about system events as, well, events is fairly natural: something happened in the system at a particular time and needs to be recorded. But, if you’re coming from a purely text-based logging mechanism (perhaps just writing logs to System.Console) then it’s a good time now to pause and consider that events are truly a first-class concept in structured logging.

At a minimum, an event is a timestamp and some data, usually comprising a list of named properties.

[Image: EventsTimeline]

The timestamp is recorded automatically when a log event is captured, so despite being one of the most important elements of an event, it doesn’t show up much in logging code.

To make real-world logging systems convenient to work with there are other special properties, common to all events, that are given additional meaning by the system.

[Image: SpecialProperties]

Among these special properties you’ll usually encounter a message of some kind, but how exactly that works in Serilog and similar libraries is the topic of the next post, so we’ll park that here and just consider messages to be simple strings.

Beside the timestamp and message, the level (or ‘severity’) of an event is the next thing you’ll find. This is so familiar and ubiquitous that it’s worth discussing what levels are, and are not, in structured logging systems.

Application logging is a balancing act, constrained by the performance cost of producing events, the infrastructure cost of transmitting and storing them, and the attention cost of someone reading or being interrupted by them.

In traditional logging, where the level is one of the few first-class concepts apart from the message, it’s the basis for controlling most of these cost trade-offs. When events are recorded, they’re given a level like debug, warning, or error.

Log.Debug("Checking for new mail");
Log.Information("Setting new pricing");
Log.Warning("Incoming projectile detected");
Log.Error("Could not connect to configuration server");

If you want to follow along and experiment in Visual Studio, now’s a great time to follow the Getting Started instructions on the Serilog wiki to create a basic console application with Serilog configured.

Levels are arranged in a hierarchy from least to most important, and the application is configured to generate events at a certain level or higher. Logging statements below the minimum level are no-ops.

This role – maintaining application performance by reducing log volume under normal operation – is an important function of the level even in apps that use structured logging.

In Serilog, the default level is Information, so the first event above won’t be emitted. The MinimumLevel option can be used to override this:

Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Debug()
    .WriteTo.LiterateConsole()
    .CreateLogger();

Besides using levels like debug to control verbosity, the error level may flag something that should be fixed, and the warning level may flag something to investigate, but operationally, levels take a back seat once structured data comes into the picture. Levels might be considered for log retention, and also when generating alerts and notifications, but other application-specific properties often play as much or more of a role in these scenarios.

My rule of thumb, when in doubt, is that if it’s worth recording in production an event is Information. The Serilog wiki has a little more information on the levels provided there.

Custom levels, a common feature of older libraries like the log4* family, exist because without rich structured data to examine, the level field has to take on a lot of responsibility. Instead of having a less-important-error level for exceptions in a non-critical subsystem, and a more-important-error level for another code path, properties of a structured log event describe the context of an error and can be used to make decisions about what to do with it.

With the basic terminology covered, I’m looking forward to discussing message templates and properties in the next post, where modern structured logging starts to diverge from what most of us are familiar with.

Message templates and properties – structured logging concepts in .NET (3)

This post is the third in a series; start here.

Trojan eventing

A long time ago I encountered a library called Bamboo.Prevalence, by the amazing Rodrigo “Bamboo” de Oliveira. Following Java’s Prevayler, the idea underpinning object prevalence is that rather than persisting your application’s object model as a point-in-time snapshot, a log of commands affecting the model (Add item 5 to order 1!) is written instead. Later, to recreate the model at a point in time, the application can just replay those commands from the beginning to arrive at the desired state.

I was fascinated by the idea of object prevalence. It turned the whole notion of application state from a static, opaque thing into a dynamic and detailed view of the application’s history.

Prevalence never really took off in .NET because of the practical challenges associated with data migration, reporting, integration points and so-on. Existing infrastructure expected a SQL database, and moving away from that model would have necessitated “rewriting the world”.

Still, a little later on I encountered this idea again in the guise of event sourcing, an architectural pattern based on a similar application-state-as-transaction-log model.

As with object prevalence before, I found event sourcing very interesting, but as with object prevalence again, the divide between the architecture and the status quo in typical enterprise software environments kept it out of most projects I was involved with.

Events are exciting because they capture a whole dimension that’s otherwise invisible, but equally if not more important than the static view that the current system state describes. In .NET, messaging architectures sometimes capture an event-oriented view of an application, and Reactive Extensions (Rx) has had some success with getting a first class treatment of events into .NET apps, but events are still usually a secondary, transient consideration and hence a lost opportunity for diagnostics and analysis.

Eventually, down the line, I stumbled across what you might call Trojan eventing, a sneaky way of grafting a fully-structured, persistent, queryable event stream onto an application, without a change of architecture and in most cases, without many of your teammates even noticing! Hey, your app right now might be producing a first-class event stream! To help you recognize it, this post describes what it looks like 😉

Capturing vs. formatting log event data

The classic workflow associated with text logging goes something like this:

  1. Take the description of an event and associated parameters
  2. Format it into human-readable text
  3. Collect the text
  4. Parse the text to recover individual parameter values
  5. Filter based on parameters
  6. View matching events

If you break this down, the formatting step (2) actually achieves two things. The event is made into something human readable (the formatting part), but by serializing the whole thing as text, the associated parameters are also being captured in a form that can survive outside the application.

Step 4, where we need to expend effort to recover information from the event for machine processing, comes about because we threw all of the structural information away at step 2. This is painful because practically every interesting thing we can do with a log event revolves around the specific parameter values associated with the event. Writing regular expressions to extract those values gets tiring quickly.

To fix this, structured logging approaches data capture and formatting separately. In the structured logging workflow, capturing happens first while formatting an event into human-readable text is deferred until it’s needed later on, after any property-based filtering has been applied:

  1. Take the description of an event and associated parameters
  2. Capture the description and parameter values
  3. Collect a serialized representation like JSON
  4. Filter based on parameters
  5. Format into human-readable text*
  6. View matching events

*Formatting can be deferred, but it doesn’t necessarily have to be – the collected JSON can include the human-readable representation alongside the template and parameters, if it makes processing easier in subsequent steps.

What this all amounts to is that the log statements already in your application:

Log.Information("Checked out order {0} for ${1:C}", order.Id, order.Total);
// -> Checked out order o12345 for $15.30

Can be converted into fully-structured, first-class events with almost insignificant effort:

Log.Information("Checked out order {OrderId} for ${Total:C}", order.Id, order.Total);
// -> Checked out order o12345 for $15.30
// -> { … "OrderId": "o12345", "Total":"15.3" … }

The message template "Checked out order {OrderId} for ${Total:C}" names the parameters of the log event (and in some cases describes how they are captured) so that they can be represented as properties in a structured format like JSON, and easily manipulated without the need for free-text parsing.

Message template basics

Message templates are a superset of .NET format strings. That means every valid .NET format string is also a valid message template.

Log.Information("Checked out order {0} for ${1:C}", order.Id, order.Total);

The property names captured in the example above will be 0 and 1.

While numeric property names are a useful compatibility feature when moving code from a format-based logging implementation like log4net, descriptive names are much more pleasant to work with. Message templates extend format strings by allowing textual names to be used instead of numeric placeholders.

In a message template with named placeholders, the parameters specified alongside the template are always matched from left to right, that is, the two properties First and Second will have the values 1 and 2:

Log.Information("Capturing {First} and {Second}", 1, 2);

When writing a message template, either numeric or named placeholders must be used. While libraries like Serilog will try to do something reasonable when confronted with a mix of named and numeric placeholders, doing so doesn’t make much sense.

This is because when capturing, named placeholders are matched strictly left-to-right with the arguments, while numeric placeholders in .NET format strings allow out-of-order capturing and repeated substitutions of the same index.
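
To illustrate the difference (the variables here are hypothetical):

// Numeric placeholders behave like string.Format(): the index selects the
// argument, so the same index can repeat and appear out of order
Log.Information("Copying {0} to {1}; {0} is unchanged", sourcePath, destinationPath);

// Named placeholders are matched strictly left-to-right with the arguments
Log.Information("Copying {Source} to {Destination}", sourcePath, destinationPath);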

Because of their relationship to standard .NET format strings, width and format specifiers are also supported in message templates.

Log.Information("Capturing {First,8} and {Second:x8}", 1, 2);
// -> Capturing        1 and 00000002

(To fully support deferred rendering, both the raw property value and its formatted representation need to be captured up front in this scenario.)

Likewise, braces in the message itself can be escaped by doubling them, as in {{ and }}.
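
For example, to include a literal brace in the rendered message:

Log.Information("The payload is wrapped in {{braces}}");
// -> The payload is wrapped in {braces}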

Validating message template usage

SerilogAnalyzer is a Roslyn-based Visual Studio plug-in by Robin Sue that validates message template syntax and checks binding to parameters.

Usage issues result in a red squiggly, and an error or warning in the Visual Studio error list.

[Image: SerilogAnalyzer]

You can download and install the Serilog Analyzer VSIX package from the GitHub project page.

Capturing structure

The other extension that message templates add to typical format strings is control for how parameters are captured.

By default, a parameter logged alongside a message template will be converted into a simple scalar value like string or int. This matches our intuition for how format strings are supposed to work: if you provide a complex object, say, a BlogPost instance, it’ll be logged via its ToString() representation:

var post = new BlogPost("Structured Logging Concepts")
{
    Tags = { "logging", "diagnostics", "analytics" }
};
Log.Information("Publishing {Post}", post);
// -> Publishing Structured Logging Concepts

This is very limiting: later when searching the log, it would be useful to be able to find, say, all published posts that are tagged “diagnostics”.

By prefixing the property name with the @ sigil, the post object will be captured in a structured representation instead.

Log.Information("Publishing {@Post}", post);
// -> Publishing BlogPost { Name: "Structured Logging Concepts", Tags: "logging", … }

If the destination format is JSON, Post will be serialized as an object with detailed Name and Tags properties. (Serilog is short for “serializing logger” – so now you know 😉 )
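
As a rough sketch (the exact JSON layout varies between sinks and formatters), the captured Post property might appear as:

"Post": {
  "Name": "Structured Logging Concepts",
  "Tags": ["logging", "diagnostics", "analytics"]
}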

Up next…

We’ll see how complex data is captured in more detail in the upcoming post on serialization. Up next we’ll consider event types, the final detail necessary before we can call the discussion of message templates complete.

Exploring Rust

For the past few years I’ve followed the evolution of Rust with interest. Rust is a new systems programming language (like C or C++) with greater emphasis on safety. I’ll spare you my reasons and opinions on its relevance since the Internet is replete with those :-)

Instead, I thought I’d write this brief post to share my perspective on learning Rust as a developer who’s otherwise mostly writing C#. The post assumes you’ll jump over and check out the Rust documentation where topics are necessarily summarised here.

No, this isn’t one of those “I’m leaving .NET! …for Rust!” kinds of posts. Rust and C# are different enough to be complementary, in much the same way C# and JavaScript are. Learning Rust also involves digesting concepts that have (so far) escaped C#’s all-consuming multi-paradigm expansion. Perhaps there are some ideas to bring back and apply in .NET, too?

So, you’re comfortable with C# - how does the knowledge you already have map onto this new thing? Here are my notes after spending a few days writing some Rust.

Projects, builds and compilation

The first thing you’ll probably write in Rust is not in the language itself, but at the command-line:

> cargo new --bin hello
...
> cd hello
> cargo run
...
Hello, world!

Cargo is Rust’s package manager, and combined with the repository at crates.io, it plays the role of NuGet in the Rust ecosystem.

Unlike C# and .NET, Rust came of age in the post-NPM world. The Rust standard library is smaller than the original .NET one, and feels a lot more like the upcoming .NET Core. HTTP clients, web frameworks, async I/O and JSON serialization live out in the crates.io ecosystem where they can evolve and improve through Darwinian pressures.

You’ll also probably note that NuGet.exe doesn’t have a run command like the one above. This is another parallel between Rust and .NET Core: the cargo command-line (like dotnet) provides a standardised interface to the whole toolchain including dependency management, builds, a test runner and so-on.
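
For example, pulling a package from crates.io is just a matter of declaring it in Cargo.toml and building; the crate name and version below are purely illustrative:

[dependencies]
# illustrative only - any crates.io package and version can go here
rustc-serialize = "0.3"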

Testing

In C#, adding unit tests to a project is a decision point. In Rust, the decision is made for you, and every (Cargo) project gets testing built in. Tests sit right alongside the code being tested, often in the same source file.

Here’s an example:

fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[test]
fn two_plus_two_makes_five() {
    let result = add(2, 2);
    assert_eq!(result, 5);
}

Side-by-side tests are not just a feel-good novelty: including tests in the same compilation unit as the code-under-test means there are no visibility issues to worry about. No need for internal and [InternalsVisibleTo] just to drive unit tests!

Code and its tests also always move together - no more parallel folder hierarchies in test projects to keep in sync. I like this enough to miss it in C# projects already. Conditional compilation in Rust means there’s no issue with code or dependency bloat, something that might be trickier to handle cleanly in the .NET world.

The baked-in assertion macros are very basic - I’m sure there are alternatives but in my limited exploration so far I haven’t felt the need for anything more.
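
The conditional compilation mentioned above usually takes the form of a #[cfg(test)] module sitting next to the code it tests; a minimal sketch:

fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    // Compiled and run only by `cargo test`, excluded from release builds
    #[test]
    fn two_plus_two_makes_four() {
        assert_eq!(add(2, 2), 4);
    }
}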

Syntax

Rust has an expression-oriented curly-braces syntax. If you stare at C# (or C/C++, or Java, or JavaScript) all day, Rust isn’t that surprising. It’s not a language you’d immediately recognize by its syntax, like Python, Lisp or Smalltalk.

Coming at Rust from C# you might find the name: Type declaration order, -> Type return type declarations and implicit “return” unusual, but the tiny add() function above uses all of these and the meaning is pretty clear.

Approaching the Rust syntax is very much like approaching the syntax of any new language - it takes some time. So far nothing has left me scratching my head for too long.

Move semantics, lifetimes and borrowing

Here’s where Rust begins to diverge from “nice, generic, modern language” to strike out on its own. Two of Rust’s big selling points - memory safety without garbage collection, and freedom from data races - rely on statically determining the “owner” of an object as well as possible references to it (aliases).

let c1 = vec![1, 2, 3];
let c2 = c1;
// Fails to compile - the value has been moved out of c1 into c2
println!("Count: {}", c1.len());

You get the general idea in the example above: assignments in Rust are moves by default. This takes a bit of getting used to: in C#, assignments are always copies, either of a simple value or of a reference to an object, so after an assignment in C# both the source and destination variables are still usable.

For memory safety, the advantage of move semantics is that the compiler knows exactly when a value goes “out of scope” and can insert code to deallocate it at precisely the right moment, without needing a GC to scan for other references.
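
For contrast, a sketch of the same example using a borrow instead of a move; this compiles because c1 retains ownership:

fn main() {
    let c1 = vec![1, 2, 3];
    let c2 = &c1;                       // c2 borrows c1 rather than taking ownership
    println!("Count: {}", c1.len());    // fine: c1 still owns the vector
    println!("Borrowed: {}", c2.len());
}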

Now, this is obviously a pretty limited model, so Rust has a clever system of lifetimes and borrowing to make more complicated patterns expressible. A proper treatment of the idea is well beyond what I could attempt in this post (already starting to spill beyond the initial goal of “brief”), but with the risk of inadvertently bungling the analogy I’d like to try to translate the concept of a “lifetime” into C# terms, since I haven’t seen this attempted elsewhere.

Ready? Read the ownership, borrowing, and lifetimes chapters of the Rust book? Ok, here it is.

Dispose-safety in C#

Let’s imagine a typical use of the using statement in C#:

using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    using (var index = archive.OpenFile("index.json"))
    {
        var contents = ReadBackup(archive, index);
        // Read files from contents
    }
}

It’s a bit contrived, but you can imagine the object returned from ReadBackup(archive, index) may have a reference to either archive or index.

That’s fine in the example above, since we know at the time we // Read files from contents that nothing has been disposed yet, so all is well.

How about this code?

using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;
    using (var index = archive.OpenFile("index.json"))
    {
        contents = ReadBackup(archive, index);
    }
    // Read files from contents
}

Here, we might have a problem. The lifetime represented by the second using statement has finished, and index has been disposed. If the BackupReader holds a reference to index then we’ll have trouble, probably in the form of an ObjectDisposedException when we try to use it.

How can we tell if this code is safe? In C#, we can’t - statically - without examining the code of ReadBackup() and potentially any code it calls.

The analogy is not precise, since in Rust it’s deallocating the memory rather than calling Dispose() that we’re concerned with, but otherwise, this is the safety issue Rust lifetimes solve. The Rust compiler must know, when examining this code, whether the second usage of contents is valid, because allowing it otherwise could lead to reading from memory that’s already been freed.

Before translating the example into Rust, let’s invent a hypothetical dispose-safety checker for C#, where we can represent these kinds of things in method signatures using attributes. Here’s the original:

static BackupReader ReadBackup(Archive archive, Index index)
{
    var manifestFilename = index.GetManifestFilename();
    return new BackupReader(archive, manifestFilename);
}

Ah - so the index argument is only used to determine a filename, and the result only depends on archive. Here’s the “dispose-safe C#” that expresses this:

[return: In("a")]
static BackupReader ReadBackup([In("a")] Archive archive, [In("b")] Index index)
{
    var manifestFilename = index.GetManifestFilename();
    return new BackupReader(archive, manifestFilename);
}

We’ve added some labels like [In("a")] to the arguments. These represent which using block the value came from. “Dispose-safe C#” requires these whenever a method accepts an IDisposable argument.

Annotating the return value with [In("a")] means that the result is valid in the same scope that the archive parameter is valid in - both are tagged with a.

The return value doesn’t have any relationship at all to index, so the b tag doesn’t appear anywhere else.

Back at the call-site, the checker does the mapping of labels to using blocks for us implicitly:

// 1
using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;
    // 2
    using (var index = archive.OpenIndex())
    {
        // Infers that "a" is block 1 and "b" is block 2
        contents = ReadBackup(archive, index);
    }
    // Thus, contents depends only on block 1, so this is fine
    // Read files from contents
}

Now, what would have happened if ReadBackup() had a different implementation?

[return: In("b")]
static BackupReader ReadBackup([In("a")] Archive archive, [In("b")] Index index)
{
    var files = archive.LoadAllFiles();
    return new BackupReader(files, index);
}

In this version of the method, the returned BackupReader is declared to depend on b, the lifetime of the index argument. The mapping of the using statements is the same, but the code is no longer valid and the checker will reject it:

// 1
using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;
    // 2
    using (var index = archive.OpenIndex())
    {
        // Infers that "a" is block 1 and "b" is block 2
        contents = ReadBackup(archive, index);
    }
    // Error: contents depends on block 2, which has ended
    // Read files from contents
}

By giving the checker a bit more information about the intentions of the code, it’s helpfully saved us from an ObjectDisposedException at runtime — awesome!

Lifetime variables in Rust

I hope this little “dispose-safe C#” thought experiment sheds some light now on the corresponding (memory-safe) Rust:

fn read_backup<'a,'b>(archive: &'a Archive, index: &'b Index) -> BackupReader<'a> {
    let manifest_filename = index.manifest_filename();
    BackupReader::new(archive, manifest_filename)
}

Don’t let the reuse of the angle-brackets “generic” syntax throw you. The little 'a and 'b annotations in there are lifetime variables, exactly the same as the [In("a")] and [In("b")] examples in C#.

At the call-site, there’s no special using-like statement to delineate lifetimes in Rust, which are implied by the natural boundaries of blocks:

// 1
let archive = open_zip_archive("C:\\backup.zip");
let contents: BackupReader;
// 2
{
    let index = archive.open_index();
    contents = read_backup(&archive, &index);
}
// Read files from contents

When the compatible implementation of read_backup() is used, this compiles fine; when the second implementation of read_backup() is used here, the Rust compiler rejects the code with error: index does not live long enough.
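
The incompatible second implementation isn’t shown in Rust above; a sketch of what it might look like, reusing the hypothetical types from the example, ties the result to the index’s lifetime instead:

fn read_backup<'a, 'b>(archive: &'a Archive, index: &'b Index) -> BackupReader<'b> {
    let files = archive.load_all_files();
    BackupReader::new(files, index)
}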

I hope my analogy hasn’t stretched too far and led you astray, as my own understanding of Rust is far from perfect. My hope is that it makes the whole lifetime concept a bit more approachable. Ownership, borrowing and lifetimes are interesting concepts that are worth spending some time on.

The stack and the heap

There’s one more place I’ve spotted where C# and Rust differ fundamentally, and that’s how memory is allocated between the stack and the heap.

C# has two families of data structures that determine allocation behaviour: structs and classes. If a type is a struct then values of that type will be allocated on the stack:

// Here, now is the actual DateTime value
var now = new DateTime(2016, 03, 23);

If a type is a class, then values of that type are allocated on the heap, and we work with it through references:

// Here, user is a pointer to some memory holding a User
var user = new User("nblumhardt");

C# also has an “escape hatch” for moving structs to the heap as well, in the form of ‘boxing’:

// Here, now is a pointer to memory holding the DateTime
var now = (object)new DateTime(2016, 03, 23);

Boxed structs in C# however are a bit of a second-class citizen - there’s no static type to represent them, so they end up as objects.

After the last section, you might feel some relief to hear that Rust is simpler on this point. In Rust, all values are essentially C# structs:

// Here, user is the actual User value
let user = User::new("nblumhardt");

To move a value onto the heap, as in C#, the value is boxed:

// Here, user is a Box<User>, pointing to memory on the heap
let user = Box::new(User::new("nblumhardt"));

Boxed values are first-class in Rust, so in practice they’re more like C#’s classes in usage, enjoying the benefits of static typing for example.

The declaration and construction syntax for structs in Rust is quite a bit different from the C# equivalent, but I’ve found that getting clear about the differences in the allocation model has made the biggest difference to my understanding of Rust code.
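
For reference, a minimal sketch of declaring and constructing a struct; the User type is hypothetical, mirroring the snippets above:

struct User {
    name: String,
}

impl User {
    fn new(name: &str) -> User {
        User { name: name.to_string() }
    }
}

fn main() {
    // Literal construction and a constructor-style associated function
    let ada = User { name: String::from("ada") };
    let user = User::new("nblumhardt");
    println!("{} and {}", ada.name, user.name);
}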

Language machinery

Rust is a modern language. Its designers have drawn widely from existing languages and experience to exclude the worst footguns (nullable references, mutability-by-default) and include the best available tools for abstraction (closures, generics, traits, pattern matching, algebraic data types and more).

One of the refreshing aspects of Rust is that, like C#, it’s a multi-paradigm language with functional features. Many of the new languages I’ve encountered in recent years are presented as functional languages, which is nice enough, but there’s a place in the future for variety.

I’d love to continue poking at different corners of Rust with this post, but it’s already pushing at the limits of my attention, so before I lose your interest too I’d better wrap up.

In closing

I’m having a great time exploring Rust. If you’re working in one of the currently popular industrial languages like C#, and want to learn something that’s different in more than just syntax, Rust could be what you’re looking for. Have fun!

Serilog Timings

TL;DR: I’ve just kicked off development of Serilog Timings, a tiny package that adds streamlined operation timings on top of Serilog.

Though there are a million ways to collect timing data from applications, it’s often very useful to include it in regular logs:

  • Logs are visible all the time, even in development, while there isn’t always a metrics server handy

  • Logs make it easy to trace individual timings back to the conditions and processes that caused them

  • Structured log events already carry the contextual properties around which the timings can be sliced and diced

It will come as no surprise that there are already a few options for recording timing information through Serilog. At its most basic, System.Diagnostics.Stopwatch and a handwritten log event can suffice. I’ve used SerilogMetrics, which includes timings along with counters and gauges, and I’ve also written several hand-rolled implementations for various apps.

As I’ve done this, the picture of what I want from operation timings has become a lot clearer:

Limited noise - when timings are recorded, it’s easy to end up with a crowd of log events: one declaring the operation started and assigning it an identifier, one describing the operation with a bunch of app-specific data, and finally the event completing the operation with the elapsed time. Serilog Timings was built with the goal of compressing all of this into a single event.

Natural messages - related to the first goal, I don’t really want generic “operation started” and “operation ended” messages in the log. Logs need to be optimized for human pattern recognition - the message printed by one operation should be visually distinct from any other so that they’re easy to pick out from a large body of text.

Stable event types - furthermore, the event type (essentially a hash of the message template) should be stable for a particular operation, regardless of whether the operation succeeded, failed, or completed with a custom result. Stable (and unique) event types make it much easier to find all instances of an operation, even though “successful” ones may vastly outnumber and hence obscure the unsuccessful ones.

A compact syntax, .NET Core support, LogContext integration, custom results, … okay, I’m particular in the extreme … you kinda get the picture :-)

I won’t promise it’s the last attempt I’ll ever make to cover all of this, but Serilog Timings ticks all of the boxes right now. Here’s how it’s used.

Once you’ve configured Serilog and installed the NuGet Package, wrap the code you want to time with a using block and describe the Operation:

using (Operation.Time("Submitting payment for {ClaimId}", claim.Id))
{
    // Timed code goes here...
}

When the using block is completed, an event will be written to the log that looks like this one:

11:20:01 [INF] Submitting payment for claim-41546 completed in 11.2 ms

The description of the operation is a regular Serilog message template, which Serilog Timings extends with " {Outcome} in {Elapsed:0.0} ms". The properties from the description, plus the Outcome and Elapsed properties, are all included on the event if it’s sent to a structured log collector.

The completion event and any other events written within the block will be tagged with an OperationId property as well, if the Serilog LogContext class is enabled.

Operation.Time() assumes the operation always succeeds, so the message will be identical regardless of whether or not an exception is thrown.

Operations that need to distinguish between timings for successful completion vs failures use Operation.Begin() and Complete():

using (var op = Operation.Begin("Saving invoice {Filename} to blob storage", filename))
{
    // Timed code goes here...
    op.Complete();
}

Successfully completing the operation will produce an event exactly like the one in the first example. If Complete() is not called on an operation started with Begin(), perhaps because an exception was thrown, then it will be logged as abandoned:

11:29:37 [WRN] Saving claim-41456 invoice INF_CX7890.pdf to blob storage abandoned in 30978.8 ms 

Notice that although the Outcome property is "abandoned" instead of "completed", and the level is warning, the message template (and thus the event type) is identical to the one from the success case. This means that while completed and abandoned timings can still be differentiated, given one of them it’s always simple to find the other.

To bail out of an operation altogether and suppress the warning message, an operation can be Cancel()ed.

Complete() can optionally accept a result property name and value to be added to the event; so, for example, if you’re timing a network request you can tag the event with BytesReceived, or when running a SQL query you might tag the event with Rows, and so on. More detailed instructions and documentation are in the project README.
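
To illustrate the result-tagging described above (a sketch; the query variable, ExecuteQuery() helper and Rows property name are just examples):

using (var op = Operation.Begin("Running query {Query}", query))
{
    var rows = ExecuteQuery(query);
    op.Complete("Rows", rows.Count);
}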

Events generated by Serilog Timings display nicely in log files or the console. If you collect logs using SQL Server or another database you might elevate the Elapsed property to column-level for some interesting analysis opportunities (the Serilog sink for SQL Server recently acquired property-to-column mapping capability).

This screenshot from Seq shows a timing event from my test app with all of its attached information:

[Image: Data]

The stable event type (B7B24E3B shown as Type in the screenshot above) lets us drill down to just the timings of the operation we’re interested in:

[Image: ByType]

And once we’ve pushed this over into a signal we can run more interesting queries to follow the Elapsed value over time:

[Image: Aggregate]

(Seq 3.1 just added the timeseries charting shown here, so my current obsession is trawling through historical log data to see what kinds of trends I can detect ;-) .)

What do you think? I’d love to hear from you if you give Serilog Timings a try, and issue reports or PRs to the GitHub project are welcome. Have fun!

Which logging abstraction should I use?

If you’re writing .NET apps or libraries for your own use, well, you can choose and use whatever logging facility you like.

If you want to write packages for others to consume, your best bet is to avoid API designs that call for logging at all. Strong invariants, exceptions and unsurprising behavior will inform users a lot more than log messages, which they may not always collect or see.

Some kinds of libraries, especially framework-style ones that call user code rather than the other way around, can’t make this simplification. If you’re writing the next NancyFx, Topshelf, or MassTransit, you’ll face the problem of how to write log events for developers to consume.

Applications get to choose the logging subsystem, and your library has to play well with that.

[Image: LibLog1]

In the past, we’ve dealt with this through abstractions like Common.Logging. The abstraction defines an interface, and a framework-specific adapter implements this in terms of whatever logging library is in use:

[Image: LibLog2]

While logging abstractions have been a great asset to the ecosystem, they suffer some serious downsides.

First, abstractions target the lowest common denominator. Why do you choose one logging library over another? Often it’s because you value some power that its API affords. Abstractions usually have to wipe away these advantages in order to fit a variety of implementations.

Second, abstractions bring dependencies. A single-assembly or single-package library that supports a shared abstraction now becomes three or more dependencies in actual usage: the library, the abstraction library, the binding between the abstraction and the logging library, and the logging library itself. All of these need to be maintained and versioned.

Third, abstractions proliferate. Because of the first problem, someone’s always looking to create a better abstraction, and because of the second problem, this leaves applications peppered with implementations of multiple similar abstractions.

Given this unsatisfactory situation, most popular libraries end up defining an interface that they expose for the user to plug logging in. The library can define the abstraction at the level it considers “ideal”, and since the abstraction is part of the library at least some dependency proliferation is avoided.

Still, this isn’t a great place for applications to be. Libraries get to stay relatively “clean”, but now since every library potentially defines its own logging interface, the job of binding all of these and wiring the result up is a chore. Making matters worse, IntelliSense for ILog… in Visual Studio now presents a dozen suggested types to choose from. Every time. Ick.

Enter LibLog.

@ThomasArdal LibLog within libs. Serilog for apps. For the record, Log4net died with the signing key fiasco

– Adam Ralph (@adamralph) April 25, 2016

LibLog is not Serilog, so why am I writing about it? Well, it’s a bit cynical of you to ask ;-) but since you did: LibLog is different. Serilog is for apps. LibLog is built specifically for libraries, to address the pain points I mentioned above. I’m an admirer because we’ve lived with this problem for so long, I’d given up on an elegant solution ever emerging.

When you install the LibLog package, a source file is embedded directly into your project. There’s no third-party assembly or runtime dependency at all. The source file defines an ILog that is internal to your project: there’s nothing polluting your public API surface. The source file also includes some reflection-based code to sniff out which concrete logging library is actually in use, and some Expression-based code generation to efficiently connect to it.

[Image: LibLog3]

All of this happens automatically, behind the scenes, without a user of your library having to know or care how things are wired up. It’s like magic! It’s the future.

LibLog is also smart: you can use message templates with it to do structured logging a-la Serilog, Microsoft.Framework.Logging, Logary 4, and maybe even a future version of NLog.

using ClassLibrary1.Logging;

namespace ClassLibrary1
{
    public class Class1
    {
        public static void SayHello(int number)
        {
            var log = LogProvider.For<Class1>();
            log.InfoFormat("Hello, the number is {Number}", number);
        }
    }
}

Used in an application with Serilog, this event has the Number property and its value attached. If the underlying logger doesn’t support it, the message is translated into a regular .NET format string and passes through with full fidelity.

If you want something more from your logging interface – say, you want to be able to pass an integer event type with every message to cater to some new framework – it’s just a matter of tweaking the code file that’s already part of your app. Admittedly there’s got to be a common denominator, but you get to choose what it is, and it doesn’t have to be the lowest.

What are the downsides of using LibLog? Your library gains a few kilobytes of adapter code internally that need to be carried around. The weight of the code is comparable to what an external abstraction would cost, so it’s zero-sum in reality.

Why is LibLog the future? From where we stand right now, amidst a mature ecosystem of .NET software, no one – not Microsoft, nor you, nor me – can define the one logging abstraction to rule them all. We can’t put the genie back into the bottle! We have so much code between us in libraries and applications that all of the logging libraries we have, the abstractions, adapters and other gunk, will live on practically forever. Any new abstraction, no matter how clean, will just add one more to the long list we already drag along with us.

So, how about letting your library’s users off the treadmill? LibLog for libraries, Serilog (if you like it) for apps. Couldn’t have said it better myself :-).

To get started with LibLog check out the wiki or this brief introduction from Christos Matskas.

Serilog 2.0 Release Plans

.NET Core and ASP.NET Core are expected to release-to-web by the end of June 2016. There have been a few related announcements recently, surrounding the split of the runtime release (end of June) from the tooling release and return to MSBuild (tied to a later Visual Studio release).

This post is just a quick note to clarify what’s happening with Serilog.

What’s in 2.0?

As many of you noticed, we made an early leap into the DNX tooling and related libraries. Version 1.5 of Serilog, the most recent master release, has been sitting stable for quite a while now, while our long-lived dev branch has been tracking .NET Core for the last year, collecting changes and improvements in readiness for the new .NET.

Here’s what we’ve been focusing on in Serilog 2.0.

  • Packaging – the core Serilog package is now a lightweight, low-dependency package with core types like Log, ILogger and LoggerConfiguration; sinks, enrichers, and facilities like <appSettings> support have moved out.

  • Portability – much more of Serilog now works across the board; for example, we can support LogContext on all target platforms; there’s no more Serilog.FullNetFx.dll.

  • Usability – a few rough spots in the API, especially concerning logger shut-down & disposal, have been cleaned up.

  • Performance – for example Serilog.Sinks.PeriodicBatching now accepts and enqueues writes without lock contention.

  • Integration – Serilog.Extensions.Logging is a high-fidelity logging provider for ASP.NET Core that gets structured log data from your apps and the platform out to a huge variety of mature and supported sinks of all kinds - databases, log servers, the cloud, rolling files, …

There are plenty more goodies in there. (One of my favourites is this PR which landed support for fixed-length level names. In 2.0 you can use {Level,3:u} in output templates to get output like DBG, INF, WRN and so-on.) When we flick the switch on the final version we’ll publish full release notes.
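
As a sketch of the fixed-length level names mentioned above, using that syntax in an output template (the rest of the configuration is only illustrative):

Log.Logger = new LoggerConfiguration()
    .WriteTo.LiterateConsole(
        outputTemplate: "{Timestamp:HH:mm:ss} [{Level,3:u}] {Message}{NewLine}{Exception}")
    .CreateLogger();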

What do the recent announcements mean for Serilog?

By far the biggest change has been the packaging work. Serilog used to include its own file, console and trace sinks, for example. These now live in their own repositories and packages: Serilog.Sinks.RollingFile lives here on GitHub, for example. This doesn’t just reduce dependency bloat – it’s also an important milestone in scaling out the Serilog project to support growth and community contributions (like this excellent alternative rolling file package which can now stand on an equal footing with the original).

Re-packaging Serilog has meant a lot of grappling with the in-progress tooling. DNX + DNU (soon dotnet) and project.json have brought game-changing improvements for us – especially the package building capabilities that mean a single build can compile (and then test!) for multiple target platforms. The earlier build system we’ve used to emulate this in Serilog 1.5 has been miserable. I’m sad to see project.json go, but happy that the major benefits are being carried through to the final version (even if we’re stuck with XML). I’m not thrilled about going through another round of churn, but I trust the .NET and ASP.NET engineers in their commitment to deliver a fully-baked, high-quality tooling release so however we have to get there – we’ll get there!

Roadmap to 2.0

Anyway, I digress :-) . Here’s how things will go for Serilog:

  1. We’re going all-in on RC2. It’s going to be a close, if not identical, reflection of the RTM runtime, so once it drops the Serilog core and as many sinks as possible will be ported. This also means nailing down the target framework monikers that the packages support, and paring back our declared dependencies to the bare minimum. The Serilog package will move to 2.0.0-rc-* versioning to reflect this.

  2. Serilog 2.0 will RTW as soon as .NET Core does, in June. The Serilog package will drop first (ASAP) followed by the supported sinks in roughly the order of usage.

  3. We’ll continue building/shipping Serilog with the RC2 version and later previews of the project tooling until the Visual Studio release when we can move forwards with it. From RC2, we’ll treat the tooling as if it’s RTM, and make sure Serilog and its sink/enricher/configuration packages work flawlessly with it, as well as the released Visual Studio 2013 and 2015 tools.

The Serilog contributors and community have put in a lot of work to get us to where we are today. Serilog’s the most mature and full-featured logging solution you’ll find on .NET Core, thanks to a huge effort tracking the platform changes to date. When .NET Core hits RTM, Serilog is going to be there, as we’ve always planned! I can’t wait for June.

When should I upgrade?

RC2. We’ve treated RC1 as a “best-effort” target so far, and supporting Serilog 2.0.0-beta-* has focused mostly on unblocking early adopters of the new runtime. From RC2 I’ll personally consider .NET Core a supported platform and pursue any Serilog issues on it as if they’re in an RTM. I think the promise of a .NET Core/ASP.NET Core RTW shortly afterwards will mean a lot of Serilog users and contributors will be moving forwards, so Serilog 2.0.0-rc-* is going with us.

How can I help?

If your favorite Serilog sink isn’t ported to .NET Core yet, or isn’t ready at RTW, please consider how you can chip some effort in – most porting work is a straightforward project file change and recompile. There are a lot of people in the Serilog community ready to help out; the best way to get started is to jump onto Gitter and mention you’re interested in sending a PR.


Structured logging concepts in .NET Series (1)

Between now and the Serilog 2.0 release that’s due sometime in July, I’ve decided to go back to basics and write some posts on the fundamentals of modern structured logging in .NET.

In the complex, distributed, asynchronous systems being built today, answering questions about behavior – what caused this issue? – is a horribly tedious thing to have to do without the right instrumentation. I hope this series will be useful for anyone wanting to build more manageable .NET apps with the new crop of frameworks and tools.

The examples use Serilog, but most of the same concepts appear in other libraries, so where possible I’ll call out the alternative APIs.

A tentative plan of attack, following this post:

But now, let’s get the ball rolling!

What is structured logging?

Approaches to application diagnostic logging can be roughly divided into text-based logging and structured, or event-based, logging.

Text logging, sometimes also called ‘printf debugging’ after the C printf() family of functions, is the first one most programmers encounter. As your code executes, it prints lines of text out to the terminal that describe what’s going on.

12:23:10 [INF] Service "loyalty-card-printer" starting up on host 8fd342hkg22u
12:23:11 [INF] Listening on http://8fd342hkg22u.example.com:1234
12:23:20 [INF] Card replacement request received for customer-109972
12:40:55 [INF] Card replacement request received for customer-98048
…

Because text logs are written as a narrative in natural language, they’re nice and easy to read.

Text is a very expressive format, and tools including full-text indexing and regular expressions can do a lot with it, but answering anything non-trivial with text log data can still take some painstaking work.

Structured logging is an umbrella term for alternatives focusing on easier searching and analysis at scale. A structured log event, rather than being a line of text, is conceptually a timestamped set of name/value properties.

Structured logging systems separate the way the event is captured (the logging API) from the way that it is transported (perhaps JSON) and how it is rendered in a user interface or file.

A rough translation of the sequence above to structured events might be conceptually like:

time=12:23:10, level=INF, service=loyalty-card-printer, host=8fd342hkg22u, activity=started
time=12:23:11, level=INF, endpoint=http://8fd342hkg22u.example.com:1234, activity=listening
time=12:23:20, level=INF, customer=customer-109972, activity=replacement
time=12:40:55, level=INF, customer=customer-98048, activity=replacement
…

Structured logs are easier to navigate because the properties in the event are first-class: a log server can return results for a query like activity == "replacement" && customer == "customer-98048" without any configuration to describe how the events should be parsed.

The line between structured and text logs is blurry:

  • Text logs can provide some structure by encoding well-known fields like the timestamp, logging level and thread id at the start of each line, and by including name=value formatted fields in the message itself

  • Structured logs sometimes look a lot like text logs when formatted out to a flat file or the terminal, and often include a message field that is itself free text

But, these similarities are skin deep. Structured logging and text logging require different mindsets in order to use each well.

Historical barriers to adoption

Structured logging isn’t always 100% rainbows-and-unicorns-awesome. My guess is that two warts have kept adoption of structured logging in second place.

First, structured logging implementations can be more labor-intensive at development time than text-based alternatives.

In .NET, the EventSource API illustrates this well. Writing a structured log event with EventSource requires an event definition in the form of a C# method in a separate class, along with the line of code to actually record the event:

[Event(123, Message = "Card replacement request received for: {0}")]
internal void CardReplacementRequested(string customerId)
{
    if (this.IsEnabled())
        this.WriteEvent(123, customerId);
}

// Elsewhere:
// eventSource.CardReplacementRequested(customerId);

Switching between files and creating an annotated method just to record a single log event is a tough cost to bear, in comparison with the single line of code to record a typical text log event.

Second, many structured logging implementations produce output that is harder to visually scan. This phenomenon is exaggerated in the earlier example – there are nicer ways to format the same structured events – but it’s still very hard to compete with natural language for readability.

The topic of this series is the most recent structured logging approach for .NET, specifically designed to avoid these issues.

Logging with message templates

Since the technique was introduced by Serilog a few years ago, many .NET applications have adopted a style of logging that captures a structured event, as well as a human-friendly text representation of it, in the same line of code.

// var customerId = "customer-109972";
Log.Information("Card replacement request received for {CustomerId}", customerId);

The description of the event, called the message template, uses named ‘holes’ instead of the anonymous {0} and {1} numeric placeholders found in .NET format strings.

Under the hood, instead of immediately substituting the value of customerId into the template to create a text event, the logging library captures the template and the argument values individually. By matching the named holes in the template with the positional arguments, the customerId value is associated with a property called CustomerId on the event.

You can imagine the resulting event in JSON format might look like:

{
  "time": "2016-06-01T02:23:10Z",
  "level": "INF",
  "template": "Card replacement request received for {CustomerId}",
  "properties": { "customerId": "customer-109972" }
}

The event can be rendered as text using the template and properties, or stored and searched as structured data, achieving the best of both worlds: events that are optimally human-friendly and optimally machine-readable.

At the time of writing, you can use message templates in .NET with Serilog, Microsoft.Extensions.Logging (ASP.NET Core) and Logary.

(I actually heard the term “modern” used to describe this style of structured logging via Jason Roberts’ Pluralsight course on Serilog and Seq, which is a couple of versions behind but still well worth checking out if you have a subscription.)

It sounds like I’ve told the whole story of modern structured logging already with this post! One line of code is enough to convey the basic idea, but there really is more to know if you want to use it effectively. I’m aiming to post two installments of this series each week, so if all goes well you won’t have to wait too long for the next post on events and levels….

Events and levels - structured logging concepts in .NET (2)

$
0
0

Introductions are tough, I’m glad that’s out of the way. This post is going to dig straight into events and levels. If concepts like events, timestamps and logging levels are old news for you, this might be one post in this series to skip. Saying a few words about these things up front will make it easier to dig into the nitty-gritty later though, with everyone on the same page.

We instrument a complex applications with logging so that when things go wrong (or when we need to verify that things are going right!) there’s a trail of events that have taken place in the system, for us to inspect.

Thinking about system events as, well, events is fairly natural: something happened in the system at a particular time and needs to be recorded. But, if you’re coming from a purely text-based logging mechanism (perhaps just writing logs to System.Console then it’s a good time now to pause and consider that events are truly a first-class concept in structured logging.

At a minimum, an event is a timestamp and some data, usually comprising a list of named properties.

EventsTimeline

The timestamp is recorded automatically when a log event is captured, so despite being one of the most important elements of an event, it doesn’t show up much in logging code.

To make real-world logging systems convenient to work with there are other special properties, common to all events, that are given additional meaning by the system.

SpecialProperties

Among these special properties you’ll usually encounter a message of some kind, but how exactly that works in Serilog and similar libraries is the topic of the next post, so we’ll park that here and just consider messages to be simple strings.

Beside the timestamp and message, the level (or ‘severity’) of an event is the next thing you’ll find. This is so familiar and ubiquitous that it’s worth discussing what levels are, and are not, in structured logging systems.

Application logging is a balancing act, constrained by the performance cost of producing events, the infrastructure cost of transmitting and storing them, and the attention cost of someone reading or being interrupted by them.

In traditional logging, where the level is one of the few first-class concepts apart from the message, it’s the basis for controlling most of these cost trade-offs. When events are recorded, they’re given a level like debug, warning, or error.

Log.Debug("Checking for new mail");Log.Information("Setting new pricing");Log.Warning("Incoming projectile detected");Log.Error("Could not connect to configuration server");

If you want to follow along and experiment in Visual Studio, now’s a great time to follow the Getting Started instructions on the Serilog wiki to create a basic console application with Serilog configured.

Levels are arranged in a hierarchy from least to most important, and the application is configured to generate events at a certain level or higher. Logging statements below the minimum level are no-ops.

This role - maintaining application performance by reducing log volume under normal operation - is an important function of the level even in apps that use structured logging.

In Serilog, the default level is Information, so the first event above won’t be emitted. The MinimumLevel option can be used to override this:

Log.Logger=newLoggerConfiguration().MinimumLevel.Debug().WriteTo.LiterateConsole().CreateLogger();

Besides using levels like debug to control verbosity, the error level may flag something that should be fixed, and the warning level flag something to investigate, but operationally, levels take a back seat once structured data comes into the picture. Levels might be considered for log retention, and also when generating alerts and notifications, but other application-specific properties often play as much or more of a role in these scenarios.

My rule of thumb, when in doubt, is that if it’s worth recording in production an event is Information. The Serilog wiki has a little more information on the levels provided there.

Custom levels, a common feature of older libraries like the log4* family, exist because without rich structured data to examine, the level field has to take on a lot of responsibility. Instead of having a less-important-error level for exceptions in a non-critical subsystem, and a more-important-error level for another code path, properties of a structured log event describe the context of an error and can be used to make decisions about what to do with it.

With the basic terminology covered, I’m looking forward to discussing message templates and properties in the next post, where modern structured logging starts to diverge from what most of us are familiar with.

Message templates and properties – structured logging concepts in .NET (3)

$
0
0

This post is the third in a series; start here.

Trojan eventing

A long time ago I encountered a library called Bamboo.Prevalence, by the amazing Rodrigo “Bamboo” de Oliveira. Following Java’s Prevayler, the idea underpinning object prevalence is that rather persisting your application’s object model as a point-in-time snapshot, a log of commands affecting the model (Add item 5 to order 1!) is written instead. Later, to recreate the model at a point in time, the application can just replay those commands from the beginning to arrive at the desired state.

I was fascinated by the idea of object prevalence. It turned the whole notion of application state from a static, opaque thing into a dynamic and detailed view of the application’s history.

Prevalence never really took off in .NET because of the practical challenges associated with data migration, reporting, integration points and so-on. Existing infrastructure expected a SQL database, and moving away from that model would have necessitated “rewriting the world”.

Still, a little later on I encountered this idea again in the guise of event sourcing, an architectural pattern based on a similar application-state-as-transaction-log model.

As with object prevalence before, I found event sourcing very interesting, but as with object prevalence again, the divide between the architecture and the status quo in typical enterprise software environments kept it out of most projects I was involved with.

Events are exciting because they capture a whole dimension that’s otherwise invisible, but equally if not more important than the static view that the current system state describes. In .NET, messaging architectures sometimes capture an event-oriented view of an application, and Reactive Extensions (Rx) has had some success with getting a first class treatment of events into .NET apps, but events are still usually a secondary, transient consideration and hence a lost opportunity for diagnostics and analysis.

Eventually, down the line, I stumbled across what you might call Trojan eventing, a sneaky way of grafting a fully-structured, persistent, queryable event stream onto an application, without a change of architecture and in most cases, without many of your teammates even noticing! Hey, your app right now might be producing a first-class event stream! To help you recognize it, this post describes what it looks like ;-)

Capturing vs. formatting log event data

The classic workflow associated with text logging goes something like this:

  1. Take the description of an event and associated parameters

  2. Format it into human-readable text

  3. Collect the text

  4. Parse the text to recover individual parameter values

  5. Filter based on parameters

  6. View matching events

If you break this down, the formatting step (2) actually achieves two things. The event is made into something human readable (the formatting part), but by serializing the whole thing as text, the associated parameters are also being captured in a form that can survive outside the application.

Step 4, where we need to expend effort to recover information from the event for machine processing, comes about because we threw all of the structural information away at step 2. This is painful because practically every interesting thing we can do with a log event revolves around the specific parameter values associated with the event. Writing regular expressions to extract those values gets tiring quickly.
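To make the pain concrete, here's a sketch of the kind of code needed to recover values from a rendered line (the message is the checkout example used later in this post; the pattern and group names are invented):

using System;
using System.Text.RegularExpressions;

var line = "Checked out order o12345 for $15.30";
var match = Regex.Match(line, @"^Checked out order (?<OrderId>\S+) for \$(?<Total>[\d.]+)$");
if (match.Success)
{
  Console.WriteLine(match.Groups["OrderId"].Value); // o12345
  Console.WriteLine(match.Groups["Total"].Value);   // 15.30
}

Every new message format means another pattern like this to write and keep in sync.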

To fix this, structured logging approaches data capture and formatting separately. In the structured logging workflow, capturing happens first while formatting an event into human-readable text is deferred until it’s needed later on, after any property-based filtering has been applied:

  1. Take the description of an event and associated parameters

  2. Capture the description and parameter values

  3. Collect a serialized representation like JSON

  4. Filter based on parameters

  5. Format into human-readable text*

  6. View matching events

*Formatting can be deferred, but it doesn’t necessarily have to be – the collected JSON can include the human-readable representation alongside the template and parameters, if it makes processing easier in subsequent steps.

What this all amounts to is that the log statements already in your application:

Log.Information("Checked out order {0} for ${1:C}",order.Id,order.Total);// -> Checked out order o12345 for $15.30

Can be converted into fully-structured, first-class events with almost insignificant effort:

Log.Information("Checked out order {OrderId} for ${Total:C}",order.Id,order.Total);// -> Checked out order o12345 for $15.30
// -> { … "OrderId": "o12345", "Total":"15.3"… }

The message template "Checked out order {OrderId} for ${Total:C}" names the parameters of the log event (and in some cases describes how they are captured) so that they can be represented as properties in a structured format like JSON, and easily manipulated without the need for free-text parsing.

Message template basics

Message templates are a superset of .NET format strings. That means every valid .NET format string is also a valid message template.

Log.Information("Checked out order {0} for ${1:C}",order.Id,order.Total);

The property names captured in the example above will be 0 and 1.

While numeric property names are a useful compatibility feature when moving code from a format-based logging implementation like log4net, descriptive names are much more pleasant to work with. Message templates extend format strings by allowing textual names to be used instead of numeric placeholders.

In a message template with named placeholders, the parameters specified alongside the template are always matched from left to right, that is, the two properties First and Second will have the values 1 and 2:

Log.Information("Capturing {First} and {Second}",1,2);

When writing a message template, use either numeric or named placeholders, but not both in the same template. While libraries like Serilog will try to do something reasonable when confronted with a mix of named and numeric placeholders, mixing them doesn’t make much sense.

This is because when capturing, named placeholders are matched strictly left-to-right with the arguments, while numeric placeholders in .NET format strings allow out-of-order capturing and repeated substitutions of the same index.
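A quick illustration of the difference (the messages are invented):

// .NET format strings - and therefore numeric placeholders - can reorder and
// repeat arguments:
Console.WriteLine("Order {1} placed by {0}; notifying {0}", "alice", "o12345");
// -> Order o12345 placed by alice; notifying alice

// Named placeholders are matched strictly left-to-right with the arguments:
Log.Information("Order {OrderId} placed by {Username}", "o12345", "alice");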

Because of their relationship to standard .NET format strings, width and format specifiers are also supported in message templates.

Log.Information("Capturing {First,8} and {Second:x8}",1,2);// -> Capturing        1 and 00000002

(To fully support deferred rendering, both the raw property value and its formatted representation need to be captured up front in this scenario.)

Likewise, braces in the message itself can be escaped by doubling them: {{ and }}.
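For example:

Log.Information("Templates use {{doubled braces}} for literal text, and {Count} for properties", 42);
// -> Templates use {doubled braces} for literal text, and 42 for properties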

Validating message template usage

SerilogAnalyzer is a Roslyn-based Visual Studio plug-in by Robin Sue that validates message template syntax and checks binding to parameters.

Usage issues result in a red squiggly, and an error or warning in the Visual Studio error list.

SerilogAnalyzer

You can download and install the Serilog Analyzer VSIX package from the GitHub project page.

Capturing structure

The other extension that message templates add to typical format strings is control for how parameters are captured.

By default, a parameter logged alongside a message template will be converted into a simple scalar value like string or int. This matches our intuition for how format strings are supposed to work: if you provide a complex object, say, a BlogPost instance, it’ll be logged via its ToString() representation:

var post = new BlogPost("Structured Logging Concepts")
{
  Tags = { "logging", "diagnostics", "analytics" }
};

Log.Information("Publishing {Post}", post);
// -> Publishing Structured Logging Concepts

This is very limiting: later, when searching the log, it would be useful to be able to find, say, all published posts that are tagged “diagnostics”.

By prefixing the property name with the @ sigil, the post object will be captured in a structured representation instead.

Log.Information("Publishing {@Post}",post);// -> Publishing BlogPost { Name: "Structured Logging Concepts", Tags: "logging", … }

If the destination format is JSON, Post will be serialized as an object with detailed Name and Tags properties. (Serializing logger – so now you know ;-) )
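The captured event body might then carry something like the following (the exact layout depends on the formatter in use, so treat this as illustrative):

{ … "Post": { "Name": "Structured Logging Concepts", "Tags": ["logging", "diagnostics", "analytics"] } … }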

Up next…

We’ll see how complex data is captured in more detail in the upcoming post on serialization. Up next we’ll consider event types, the final detail necessary before we can call the discussion of message templates complete.

Serilog 2.0 JSON improvements


Serilog 2.0 is out! There’s an Upgrade Guide and Release Notes already posted on the Serilog site, and a good number of first- and third-party packages have already been updated to work with the new version. Go go go! :-)

Building the new release was a team effort, and announcing it will be as well, so instead of trying to cover everything we’ve added since the 1.x series I’m going to write about a few new features I am especially interested in, and call out posts from the rest of the team as they appear.

Regarding upgrades by the way: if you get stuck, please post to Stack Overflow and tag your question with serilog – many Serilog users and contributors are there ready to help out, and by posting there you’ll help others find the information they need. Thanks!

JSON in Serilog 2.0

JSON turns out to be a good fit for storing and transporting structured log events. Practically every modern logging target of any significance requires, supports, or uses JSON in one way or another.

Serilog has always had good support for writing JSON logs, but when JsonFormatter was introduced in Serilog 1.0 I underestimated its importance, and underestimated the breadth of scenarios in which it would be used. This left us with several longstanding pain points:

  1. Efficiency - the default format written by JsonFormatter is a straightforward mapping from Serilog’s LogEvent type, which requires a lot more space than is strictly necessary (even for JSON!)

  2. Flexibility - as formatting requirements expanded, JsonFormatter did too, eventually becoming a big ball of virtually-dispatched mud that’s unable to evolve without breaking subclasses

  3. Ubiquity - practically everywhere that Serilog supports plain text, JSON support is needed too, but not being comfortable with the format or implementation, we’ve held back in a number of places

If you use JSON logging with Serilog, or maintain a Serilog sink that writes JSON, Serilog 2.0 has some improvements in these areas. In this post we’ll look at some new formatting infrastructure, a new compact JSON representation, and finally some API improvements that make it easier to use JSON with Serilog across the board.

Introducing JsonValueFormatter

Sinks that use JSON need to take a Serilog LogEvent and write it out to a stream of text. There are two main tasks involved, both of which were previously the responsibility of JsonFormatter:

  • Formatting the fixed properties of LogEvent such as Timestamp, Level and the message, and

  • Formatting the associated property data encoded in LogEventPropertyValue objects.

Writing out the fixed properties is easy, but varies a lot between JSON formats. The amount of code involved is so small that, with the help of a JSON string escaping function, it’s cleaner and faster for implementers to hand-code than it is for Serilog to provide an infinitely-flexible generic formatter.

Writing out property data is harder – but much more consistent. For that task, Serilog 2.0 provides JsonValueFormatter:

var formatter = new JsonValueFormatter();
LogEvent evt = // …

foreach (var property in evt.Properties)
{
  formatter.Format(property.Value, Console.Out);
  Console.WriteLine();
}

Given an event with properties like Name = "Alice" and Number = 42 this snippet will produce output like:

"Alice"
42

Serilog encodes all structured data, including scalars like the ones above, arrays, dictionaries and objects, into LogEventPropertyValues. JsonValueFormatter takes these values and writes the corresponding JSON. Because it’s focused on just this one task, it’s faster at it and easier to extend.

Putting JsonValueFormatter together with hand-coded serialization for the fixed properties yields much more understandable (though also a little more verbose) formatters. Here are three different implementations for reference.
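For a rough sense of the shape this takes, here's a minimal sketch of a custom ITextFormatter built this way. The field names and layout are this example's own choices, not a standard, and WriteQuotedJsonString is the string-escaping helper that ships alongside JsonValueFormatter (if your version doesn't expose it, any JSON string escaper will do):

using System.IO;
using Serilog.Events;
using Serilog.Formatting;
using Serilog.Formatting.Json;

class MinimalJsonFormatter : ITextFormatter
{
  static readonly JsonValueFormatter ValueFormatter = new JsonValueFormatter();

  public void Format(LogEvent logEvent, TextWriter output)
  {
    // Fixed properties, hand-coded
    output.Write("{\"Timestamp\":\"");
    output.Write(logEvent.Timestamp.UtcDateTime.ToString("O"));
    output.Write("\",\"MessageTemplate\":");
    JsonValueFormatter.WriteQuotedJsonString(logEvent.MessageTemplate.Text, output);

    // Event property data, delegated to JsonValueFormatter
    // (a real formatter would guard against property names that collide
    // with the fixed field names above)
    foreach (var property in logEvent.Properties)
    {
      output.Write(",");
      JsonValueFormatter.WriteQuotedJsonString(property.Key, output);
      output.Write(":");
      ValueFormatter.Format(property.Value, output);
    }

    output.WriteLine("}");
  }
}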

What’s the role of JsonFormatter in the future, then?

It’s useful having JsonFormatter around to provide an easy, sensible, default JSON format. I don’t think we should deprecate or remove it - only its extensibility story is truly problematic.

A more likely plan of attack is to deprecate its extension points, so that sinks using it to implement custom JSON formats “get the hint” and move over to the new model based on JsonValueFormatter. Once this happens, we’ll be able to clean up and optimise JsonFormatter so that everyone using it as-is can benefit from better performance, and code that’s now duplicated in JsonValueFormatter can instead be reused.

Compact JSON support

Since JsonValueFormatter makes it relatively easy to support different JSON layouts, we’ve done just that: the Serilog.Formatting.Compact package includes two new JSON formatters that blow JsonFormatter out of the water in space and time benchmarks.

Let’s check out the “hello world”. We’ll need a few packages:

Install-Package Serilog
Install-Package Serilog.Sinks.Console
Install-Package Serilog.Formatting.Compact

The example will write the same event in both the default and compact JSON formats:

static void Main(string[] args)
{
  Log.Logger = new LoggerConfiguration()
    .WriteTo.Console(new JsonFormatter())
    .WriteTo.Console(new CompactJsonFormatter())
    .CreateLogger();

  Log.Information("Hello, {Name}", Environment.UserName);

  Log.CloseAndFlush();
}

2.0 tip: though it’s not required by the console sink in this example, I’ve added the call Log.CloseAndFlush() as a reminder that you need to call this explicitly on app shutdown if you’re using any sinks that buffer events (most networked ones).

We get two lines of output. I’ve inserted some linebreaks to keep things tidy in the blog post:

{"Timestamp":"2016-07-06T12:27:19.6552446+10:00","Level":"Information","Message
Template":"Hello, {Name}","Properties":{"Name":"nblumhardt"}}{"@t":"2016-07-06T02:27:19.6552446Z","@mt":"Hello, {Name}","Name":"nblumhardt"}

140 vs 79 bytes (56%) for the same information.

CompactJsonFormatter is smart: built-in field names are short and prefixed with an @ so that no Properties property is required; timestamps are converted to UTC; and, in this example, the level is omitted since Information is considered to be the default.

CompactJsonFormatter assumes you want to work with the event as a template + properties. This isn’t always the case – many log collectors require that the message is pre-rendered into text. This is where RenderedCompactJsonFormatter shines.
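In the example program, that's a one-line change:

Log.Logger = new LoggerConfiguration()
  .WriteTo.Console(new RenderedCompactJsonFormatter())
  .CreateLogger();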

Substituting it into the example program produces an event that, at 102 bytes, is almost as compact as CompactJsonFormatter, and still more compact than JsonFormatter even without it working in its “rendered” mode:

{"@t":"2016-07-06T02:33:46.6978713Z","@m":"Hello, \"nblumhardt\"","@i":"5e2f8db
d","Name":"nblumhardt"}

Note how instead of @mt we’ve now got the rendered message in @m, and a new field, @i, which is a computed 32-bit event type based on the message template.

You can read more about the new format here, including some benchmarks that show the new formatters beating the existing one by a healthy margin. (Another new fan of BenchmarkDotNet here.)

JSON everywhere

Finally, and you might have noticed this in the sample code above, basic sink configuration methods like Console() now have overloads accepting an initial formatter argument. This makes it easy to use one of the new JSON formatters with the console, file, or rolling file sinks – in the past, you had to use a less-discoverable configuration syntax in order to do this.

Here’s the JSON-rolling-file version:

Log.Logger = new LoggerConfiguration()
  .WriteTo.RollingFile(new CompactJsonFormatter(), "./logs/app-{Date}.json")
  .CreateLogger();

Don’t forget that here, as elsewhere in Serilog 2.0, the sink lives in its own package. Install-Package Serilog.Sinks.RollingFile will get this example up-and-running.

Updates coming in Serilog 2.1 (grab the -Pre version from NuGet) mean that you will soon also be able to configure a JSON rolling file using XML as well:

<appSettings>
  <add key="serilog:using:RollingFile" value="Serilog.Sinks.RollingFile" />
  <add key="serilog:write-to:RollingFile.pathFormat" value="./logs/myapp.json" />
  <add key="serilog:write-to:RollingFile.formatter" value="Serilog.Formatting.Compact.CompactJsonFormatter, Serilog.Formatting.Compact" />
</appSettings>

2.0 tip: since the sinks have moved to their own packages, you’ll need serilog:using directives like the one shown above in order for XML configuration to pick them up.

Wrapping up

Serilog 2.0 is out; as well as .NET Core support there are a lot of niceties to enjoy on all platforms. Once again, check out the Upgrade Guide and Release Notes for the full picture.

I’ve dug into JSON formatting here because the situation has annoyed me for so long :-) but there’s plenty more to tell. Interspersed with the rest of the introductory series on structured logging I’ve been working on, I hope to be able to write more soon.

In the meantime, it’d be great to hear from you if you’re already across to Serilog 2.0, or if anything is holding you back. Happy logging!

Event types – structured logging concepts in .NET (4)


Structured logging is awesome, but the top reason for me is easily overlooked:

There are orders of magnitude more events in a log than there are logging statements that generated them.

Have you ever opened up a log file looking for the cause of an issue and just felt helpless? It’s a wall of text! Out of a million events, how can you track down the one or two events with interesting information? When you don’t know what you’re looking for, you can’t search for it.

Even an empty ASP.NET Core application generates a flood when debug logging is turned on. I ran the WebSample project found here and hit refresh a few times. Even the few hundred events that this generated are hard on the eyes.

Check this out, though:

We start with a dense log stream containing all kinds of messages, and then - by type - exclude them until only a few interesting ones remain! How does that work???

Message templates as event types

When you write several events with log4net, the message produced for each one is similar:

foreach (var alert in alerts)
{
  log.Debug("Disk quota {0} exceeded by user {1}", alert.Quota, alert.Username);
}
// ->
//     Disk quota 100 exceeded by user Alice
//     Disk quota 150 exceeded by user Bob

But, as far as a log search is concerned, they’re just two lines of different text.

You may wish to find all “disk quota exceeded” events, but the approach of looking for events like 'Disk quota%' falls down as soon as another event appears that is superficially similar:

Disk quota 100 set for user Carol

Text logging throws away the information we initially have about the source of the event, and this has to be reconstructed with increasingly fragile and elaborate match expressions as the application gets more complex.

By contrast, when you write events through Serilog, the library keeps track of the message template as well as the arguments that are substituted in:

log.Debug("Disk quota {Quota} exceeded by user {Username}",alert.Quota,alert.Username);// ->
//     Disk quota 100 exceeded by user "Alice"

The plain text rendering is like the log4net version, but behind the scenes, the "Disk quota {Quota} exceeded by user {Username}" message template itself is associated with all of the events created through this logging statement.

With the events stored in a log server or database, you can query where @MessageTemplate = 'Disk quota {Quota} exceeded by user {Username}' and get exactly the events where the disk quota was exceeded, no more and no less.

This is pretty much how the screencast above tunes out the noise. Excluding events by event type is extremely powerful - even in a very busy log stream, a huge proportion of the events will come from a handful of logging statements. Once you can exclude messages by type, you can filter down to the more interesting and unusual events quickly.

If typing out message templates in log searches sounds like a bore, there’s one small ergonomic improvement to mention.

Event types from numeric hashes

It’s not always convenient to store the entire message template with every log event. Though they act the part, message templates are also a bit of an awkward way of identifying a set of related events.

To make message templates feel like a more natural identifier, they can be hashed into a numeric value that’s easier to work with. This means instead of typing Disk quota {Quota} exceeded by user {Username} you can deal with a number such as 08fea8bb.

"Disk quota {Quota} exceeded by user {Username}"-hash->0x08fea8bb"Disk quota {Quota} set for user {Username}"-hash->0xa990e732

Where the hashing is performed depends on your logging pipeline:

  • For simple log files, you can plug an enricher into Serilog that will assign an EventType property automatically to all Serilog events.
  • Serilog’s RenderedCompactJsonFormatter includes an event type in its JSON output, which is a great option if you plan to store events as JSON.
  • Seq (shown in the screencast) will do this automatically for events from Serilog or Microsoft.Extensions.Logging. Other log servers have various ingestion hooks through which you can plug this in.

You can use the technique mentioned in the first point with just about any log storage mechanism, so it also provides a good fallback.
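If you'd like to try the enricher approach, here's a rough sketch. The hash shown - the first four bytes of an MD5 digest of the template text - is just one reasonable choice, and won't necessarily match the values computed by Seq or RenderedCompactJsonFormatter:

using System;
using System.Security.Cryptography;
using System.Text;
using Serilog.Core;
using Serilog.Events;

class EventTypeEnricher : ILogEventEnricher
{
  public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
  {
    // Hash the message template so that every event produced by the same
    // logging statement carries the same EventType value
    using (var md5 = MD5.Create())
    {
      var hash = md5.ComputeHash(Encoding.UTF8.GetBytes(logEvent.MessageTemplate.Text));
      var eventType = BitConverter.ToUInt32(hash, 0);
      logEvent.AddPropertyIfAbsent(
        propertyFactory.CreateProperty("EventType", eventType.ToString("x8")));
    }
  }
}

Plug it in at configuration time with .Enrich.With(new EventTypeEnricher()) and every event will carry an EventType property you can group and filter on.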

Explicit event types

The Microsoft.Extensions.Logging API supports message templates, and therefore numeric hashes as well, just like Serilog. For simple log statements, the only difference between the two APIs is the addition of a Log prefix to the method name:

log.LogDebug("Disk quota {Quota} exceeded by user {Username}",alert.Quota,alert.Username);

Because Microsoft.Extensions.Logging is an abstraction fitting many different logging libraries, it’s also possible to specify event types manually by passing one as the first (eventId) parameter:

log.LogDebug(1001, "Disk quota {Quota} exceeded by user {Username}", alert.Quota, alert.Username);

There are a couple of advantages of explicit event ids:

  • They remain stable even if the message changes
  • They may be supported by some back-ends that don’t or can’t deal with message templates

The disadvantages are that there’s more to type, and more effort involved in ensuring each id is unique. In some contexts that will make sense, but it’s unnecessary friction in others. My advice is to use message templates - or hashes programmatically derived from them - wherever possible.

There’s no time like now

The sooner you start storing and using event types, the better your diagnostic experience will be. Unless you’re using them day-to-day, it’s easy to log pre-formatted text without realizing it, and compromise the great noise-cancelling power that event types can give.

By capturing the original template that generated an event, or a numeric hash of it, a log stream containing millions of events can be interrogated as a log stream of a few hundred event types. It’s a profound change in perspective and something you have to try.

This post is the fourth post in a series; up next, context and correlation.
