Channel: Nicholas Blumhardt

Announcing Seq 2 RTW


It’s with pleasure I can announce that the RTW build of Seq 2.0 is now available.

What is Seq?

Seq is a log server designed specifically to take advantage of modern structured logging techniques. Libraries like Serilog and SLAB produce log events that are not only readable messages, but richly structured objects that can be searched and correlated with ease. Seq is the fastest way for .NET teams to capture and investigate these logs when debugging code in complex environments.

With Seq it’s trivially easy to find all events raised in a particular HTTP request, from a particular machine, or when processing a particular order, for example. Not only that, but queries unimaginable before the rise of structured logging – “show me all the orders processed with an item over $100” – can be expressed in a clean C#-style syntax and answered directly from logs.
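
For example, assuming order-completed events carry an Items collection with a Price on each element, that last question might be written in Seq’s filter bar along these lines (the property names here are purely illustrative):

Items[?].Price > 100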

What’s New?

The centrepiece of the 2.0 release is a completely redesigned query interface, featuring auto-completion and syntax help for filtering, click-to-apply signals, and easier sharing for users working in larger teams.

Seq 2.0

Seq 2.0 also paves the way to much easier temporal navigation with a new timeline view and to-the-millisecond date and time filtering.

In addition, almost every corner of the interface has been re-thought, and many small improvements made to ergonomics and usability.

Installing and Upgrading

If you’re not using Seq today, you can install it for free under the complimentary single-user license, or get a trial key that enables authentication and other team-oriented features on our website.

If you purchased Seq in the last year, you can upgrade under your existing license: just install the new version and your data will be migrated in place. Easy!

If you’ve used up your 12 months of included upgrades already, you’re in luck: to celebrate the release of the new version we’re offering existing customers discounted upgrade pricing this month. Email Support and we’ll organize it for you and make sure everything goes seamlessly.

Documentation and Support

With this release we’re launching a new combined documentation and forum site at docs.getseq.net. Check out the new Getting Started page and keep an eye out for new content as we migrate it from the older Confluence site.

Thanks and Acknowledgements

This release was made possible by our wonderful customers, and a brave bunch of early adopters who’ve given us invaluable feedback and bug reports during the beta period. Thank you all!

Download Seq 2.0 now.


Contender for .NET’s Prettiest Console?


Working on the UI for Seq 2 involved a lot of time at the command line with Node.js-based tools like npm, Gulp and Mocha, and the output from these is great. There’s heaps of love put into nicely formatted and colored output in the Node.js ecosystem, and switching back to .NET on the server left me with console envy…!

A few months back I decided to take matters into my own hands and see how Serilog’s trusty old colored console output could be improved upon. The result was the Literate Console sink, which has since become my go-to option for command-line apps.

Here are a few typical Serilog events:

Log.Debug("Starting up on {MachineName} at {WorkingSet} bytes", machineName, workingSet);
Log.Information("Hello, {Name}!", userName);
Log.Information("Recieved order {@Order}", new { Item = "Desert", Quantity = 3, IsReady = true });
Log.Error(ex, "Could not divide {Numerator} by {Denominator}", numerator, denominator);

And here’s how they’re rendered by the Literate Console sink:

Literate

Yup, the structured log data is pretty-printed right there inline. My inner geek is satisfied!

If you look closely, you’ll see that in fact all of the properties are colored according to type: strings are cyan, numbers magenta, Booleans blue and other miscellany get a fetching green.

Hence the name: Literate Programming intersperses readable text with chunks of executable code; Literate Logging (okay, I confess I just made that up) intersperses the text with chunks of structured data.

You can WriteTo.LiterateConsole() by installing the Serilog.Sinks.Literate NuGet package. Enjoy!
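
Getting it going is the usual Serilog sink configuration – a minimal sketch:

Log.Logger = new LoggerConfiguration()
    .WriteTo.LiterateConsole()
    .CreateLogger();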

(Know a .NET project with great console output? I’d love to check it out – let’s hear about it! :-))

Seq 2.1 Refresh


Since Seq 2.0 shipped a couple of weeks ago, enough small fixes and improvements have been collected to make a new point release.

Seq 2.1 is ready to download, and includes:

  • Send to app is back (#301)
  • Find with adjacent works in locales that use comma as a digit separator; finally! (#194)
  • Downgrading to the free version now completes successfully even if the admin user has been deleted (#302)
  • TotalMilliseconds() enables parsing of .NET TimeSpan properties that have been serialized to strings (#152)
  • Refreshing the /login page now works as expected (#303)
  • Seq will now stop recording events when a (configurable) minimum disk space limit is reached – by default, 128 MB (#304)

There are some minor performance and usability tweaks in there too, including some pixel shuffling for better in-browser scaling at 50, 75 and 90% on the events screen.

Hope you’re enjoying the new release! Comments and feedback welcome as always :-)

Contextual logger injection for Autofac


TL;DR: install package AutofacSerilogIntegration, call builder.RegisterLogger(), and use constructor injection of ILogger to get source type tagging.

When I use Serilog, I more often than not embrace the static Log class and use a pattern like:

class Example
{
    readonly ILogger _log = Log.ForContext<Example>();

    public void Show()
    {
        _log.Information("Hello!");
    }
}

Notice in this example the ForContext<Example>() call. This creates a contextual logger that will tag all of the events created through it with the name of the specified class, in this case Example. Serilog does this by attaching a property called SourceContext.

SourceContext can be very useful for filtering. Here, using the Seq sink, we can see all events raised by PreadmissionController:

SourceContext
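
The filter behind a view like that is just an equality comparison on the property – something like the line below, where the namespace is hypothetical:

SourceContext == "Admissions.Web.Controllers.PreadmissionController"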

Logger Injection

When using an IoC container like Autofac, it’s quite common to inject a logger into the class using it. In this case our example would follow a slightly different pattern:

class Example
{
    readonly ILogger _log;

    public Example(ILogger log)
    {
        _log = log;
    }

    public void Show()
    {
        _log.Information("Hello!");
    }
}

Here, instead of calling ForContext<Example>(), the expectation is that the appropriate logger will be passed into the constructor by the caller.

Though I’ve tended to move away from this pattern in recent years (if for nothing else, just to cut down the number of parameters on constructors!), I’ve encountered lots of codebases that use this tactic and find it works well.

Unfortunately, despite having founded both the Autofac and Serilog projects, when asked how to set this up I’ve had to point people to Google and newsgroup posts – I don’t think there’s been a comprehensive example online showing how it can work efficiently and reliably. So, finally, I’ve posted a working integration on GitHub and published it as the AutofacSerilogIntegration NuGet package. Here’s how it’s used.

Setting up the Autofac/Serilog Integration

As with all good stories, this one begins at the NuGet package manager command-line:

Install-Package AutofacSerilogIntegration

(This assumes you’ve installed both the Autofac and Serilog packages already.)

The first thing you should do in your application is configure a logger. I recommend assigning this to Log.Logger even if you don’t intend to use the static Log class, just in case any calls to the static logger slip in accidentally.

Log.Logger = new LoggerConfiguration()
    .WriteTo.ColoredConsole()
    .CreateLogger();

Next up, where your Autofac ContainerBuilder is configured, call RegisterLogger(). This is an extension method in the AutofacSerilogIntegration namespace.

var builder = new ContainerBuilder();
builder.RegisterLogger();

And, that’s everything. For components created by the container, parameters of type ILogger will now be fulfilled using a logger instance tagged with the correct type.

The RegisterLogger() method accepts a couple of parameters – an ILogger can be specified if you want to use a root logger separate from the static one, and property injection can be optionally enabled.
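
As a sketch of the first of those options, a dedicated root logger (rather than the static Log.Logger) can be built and handed to the integration – the exact parameter shape may vary between package versions:

var rootLogger = new LoggerConfiguration()
    .WriteTo.ColoredConsole()
    .CreateLogger();

var builder = new ContainerBuilder();
builder.RegisterLogger(rootLogger);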

Closing thoughts…

This is a shiny new slice of code and there may still be scenarios it doesn’t handle so well. The beauty of getting it to GitHub is that over time it can be improved through use – pull requests appreciated! I hope it saves you some searching next time you hit File > New Project.

Seq 2.2: Memory Efficiency, One-click Auto-refresh, Filter History


It took some restraint to get Seq 2.0 over the line: there are so many possibilities that there’s really no way everything I’d like to have done could make the cut. Now the “big bang” is over, it’s great to be able to make more regular point releases like Seq 2.2, which this post is about.

Improved Memory Efficiency

Seq is a disk-backed data store that uses memory caching extensively. When older events no longer fit in RAM, I/O is needed and queries slow down.

Seq 2.2 performs additional de-duplication of string data to represent more events in the same cache space. This reduces the need for I/O and places less overall burden on the CPU. The net effect is that queries can run noticeably faster on busy servers.

One-click “Auto-refresh on”

The much-used “Auto-refresh on” option has been promoted from a drop-down menu item to a top-level button. It’s represented by the little “infinity” icon in the image above.

AutorefreshButton

Recent Filter History

Seq 1.6 used deep-linking to tie the current filter expression to the web browser’s history. The basic idea was sound – it’s nice to enter a filter, try another, then press “Back” to go to the previous one.

In practice log navigation is so fluid that what you thought was the last filter often turns out to be a few clicks back, which ends up being a clunky back-back-back experience. Seq 2.0 therefore booted out back button support for filters, instead providing coarse-grained history between the dash, settings, and events screens (deep-linking of filters is still supported, but they don’t go on the browser’s back-stack).

Seq 2.2 brings back the notion of history in the guise of a “recent filters” drop-down you can see in the right of the filter box. Clicking on one of the history entries will set the filter to that text:

History

The release includes other small enhancements and bug fixes, listed in the complete release notes.

There will be more point releases in the 2.x cycle at about the same 2-4 week interval, but 2.2 really is one not to miss. You can download it now from the Seq website — let us know what you think!

Set the asterisk in project.json version numbers


I have a feeling I’ve bothered the friendly people on Jabbr twice now about how to set a value for the * (‘wildcard’) placeholder in DNX’s project.json files, so here it is for next time… :-)

DNX project.json files use a version syntax that makes it easy to set a tag (e.g. the branch name) in the JSON file itself, while adding a unique numeric suffix at build time.

Here’s one in the first property below:

{
    "version": "1.0.0-beta-*",
    "description": "Serilog provider for Microsoft.Framework.Logging",

(I’m setting up CI for a Serilog provider for Microsoft.Framework.Logging that the ASP.NET team put together and contributed to the project.)

These work both in the package’s own version field, and in dependency versions, so multiple projects being built at the same time can depend on each other this way.
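
For instance, a sibling project built at the same time might reference the package above with a wildcard version of its own – the package id here is just a placeholder:

{
    "dependencies": {
        "Serilog.Extensions.Logging.Example": "1.0.0-beta-*"
    }
}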

However, if you build a project versioned this way in Visual Studio, or at the command line with dnu build, you’ll get a version number like 1.0.0-beta. That’s not what you’re after.

The key is setting the DNX_BUILD_VERSION environment variable, e.g. to a counter supplied by your build server:

set DNX_BUILD_VERSION=%APPVEYOR_BUILD_NUMBER%

With this done you’ll get nice unique package versions like 1.0.0-beta-234.

Thanks David Fowler and Alex Koeplinger for the pointer.

Server Efficiency and “Seq App” Input Changes in Seq 2.3


Apps are Seq extensions that drive notifications like email and Slack, stream processing, and Seq’s inbuilt dashboard charts.

In earlier versions, apps used a persistent cursor into the event stream and a short buffering window to track delivery and sort incoming events by timestamp. This implementation was based on the assumption that many apps would want loosely-ordered input, and providing this in the server rather than each app individually would be most efficient.

The flipside of this decision was that ordered delivery and a persistent cursor for each app still required quite substantial resources. Seq servers with many dashboard charts or running apps could spend a large amount of CPU and I/O time updating them. This turned out to be largely a waste, given the infrequent need for ordering, and the limited benefits of a persistent cursor when the event delivery process is inherently unreliable.

Seq 2.3 gains some big performance wins by using in-memory delivery of incoming events to apps. The overhead of running apps in this mode is negligible, so the difference can be noticeable. A loaded test server running just five charts used close to 70% less CPU time on 2.3 than on 2.2.

Because most apps benefit from the new model, this is the default on 2.3 servers. Apps that might behave differently with this change can opt back into ordered delivery using a new setting Order events by timestamp.

Order-by-timestamp

Hopefully the new version will make your server smile! :-)

Assigning event types to Serilog events


One of the most powerful benefits of structured logging is the ability to treat log events as though “typed”, so that events generated by the same logging statement can be easily (and mechanically) identified in the log stream.

Given a logging statement parameterized by some data:

var total = 1;
for (var i = 0; i < 3; ++i)
{
   total += i;
   Log.Information("Computed iteration {Counter}, total is {Total}", i, total);
}

The text representation of each event (“Computed iteration 2, total is 4”) will be different.

A traditional text-based logging system necessitates the use of regular expressions to identify and parse messages created in the loop. This is a bigger problem than it sounds: once the event is interspersed through a large stream of unrelated messages this style of processing is both slow and error-prone, as well as inconvenient.

By contrast, a structured logger like Serilog or SLAB/ETW assigns ids or types to events, so that all events generated by the statement will carry a distinct type as well as the structured fields for Counter and Total. Queries using event types can find all of the events generated by a particular logging statement, even though their text representations may differ.

Enriching events with types

Serilog treats the message template itself as the event type. By attaching "Computed iteration {Counter}, total is {Total}" to each event, all events generated from the same template can be identified.

Long strings like this can be inconvenient to record and type, so it’s often useful to take a hash of this value instead, and record that as the event type alongside other data comprising the event. Seq does this automatically by assigning a type to Serilog events on the server-side. Using Elasticsearch to store events, you might achieve the same thing with a transform.

If you just want to have the convenience of searching by event type in regular flat log files, or if you’re using a log collector without this option, it’s easy to add support for it using a Serilog enricher:

Log.Logger = new LoggerConfiguration()
   .Enrich.With<EventTypeEnricher>()
   .WriteTo.LiterateConsole(
      outputTemplate: "{Timestamp:HH:mm:ss} [{EventType:x8} {Level}] {Message}{NewLine}{Exception}")
   .CreateLogger();

The Enrich.With<EventTypeEnricher>() call adds the EventTypeEnricher that we’ll see below to the logging pipeline.

The modified outputTemplate includes the EventType value, in this case a 32-bit value formatted in hexadecimal. (This example writes to the Literate Console sink, which is a great way to visualize the structure of Serilog events even when they’re formatted into text.)

The enricher

In this example, event types are generated using a 32-bit Murmur3 hash. The relative merits of different hash algorithms and sizes for this purpose are a post in themselves – we’ll just use a readily-available one. Anything from string.GetHashCode() to SHA-1 might work here, depending on your needs.

The algorithm is from this package, which you’ll need to install first from NuGet.

class EventTypeEnricher : ILogEventEnricher
{
   public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
   {
      var murmur = MurmurHash.Create32();
      var bytes = Encoding.UTF8.GetBytes(logEvent.MessageTemplate.Text);
      var hash = murmur.ComputeHash(bytes);
      var numericHash = BitConverter.ToUInt32(hash, 0);
      var eventId = propertyFactory.CreateProperty("EventType", numericHash);
      logEvent.AddPropertyIfAbsent(eventId);
   }
}

The enricher retrieves the original message template text, computes its hash, and attaches it to the event as the EventType property.

The output

Taking our example again, including a few different events:

Log.Information("Starting up");

var total = 1;
for (var i = 0; i < 3; ++i)
{
    total += i;
    Log.Information("Computed iteration {Counter}, total is {Total}", i, total);
}

Log.Information("All done");

The output is:

EventTypesLiterateConsole

Each loop of the iteration, despite carrying different values for Counter and Total, is tagged with the same event type f20ba6e0. The other two events carry their own distinct event types that identify them.

Summing up

Structured logging is a necessary response to the difficulty of sifting through log events from ever-larger, more distributed, more sophisticated applications. Alongside named property values, event types are a big part of the structured logging value proposition. You can use either raw message templates, or hashes of them, as types when working with Serilog.

If you’re using a solution that supports event types in your log data, it’d be awesome to hear how it has worked for you!


Seq 2.4


Hot off the press and ready to download from the Seq site.

Seq 2.4 is largely a maintenance release with a host of bug fixes, but amongst those are some substantial improvements.

Filtering performance improvements

Seq implements its own query engine (soon to get some interesting new capabilities ;-)) optimized for the kind of messy, ad-hoc filtering that we do over log data. Based on the kinds of queries generated through real-world use of the 2.0 “signals” feature, two great little optimizations have landed in 2.4.

First, an overdue implementation of short-circuiting && (AND) and || (OR). This just means that the expression @Level == "Error" && Environment == "UAT" won’t bother evaluating the second comparison if the first one is false. Seq has always done some short-circuiting in queries, but only in certain cases. 2.4 extends this to all logical operations.

Second, and closely related, is term reordering based on expected execution cost. Some predicates are extremely cheap and others costly, for example an event type comparison ($FEE7C01D) can be evaluated several thousand times faster than a full-document text search (Contains(@Document, "penguin")). This means that, given short-circuiting operations, $FEE7C01D && Contains(@Document, "penguin") is much more efficient than the reverse Contains(@Document, "penguin") && $FEE7C01D in the common case of mostly-negative results. Seq 2.4 uses heuristics to weigh up the relative execution cost of each branch of an expression to make sure the fastest comparisons are performed first.

Both of these changes add up to substantial gains when using signals with a large number of applied filters.

Restricted signals

Since Seq uses the signal mechanism for retention processing, it’s possible that an accidental change to a signal used in retention processing could lead to data loss. For this reason Seq 2.4 introduces lockable signals, requiring administrative privileges to modify.

Seq-2.4

Online compaction

Seq uses the ESENT storage engine to manage files on disk. It’s an amazing piece of technology and very mature, however until recently was unable to support compaction of data files during operation. Although retention policies would remove events from the file, Seq would periodically need to take a slice of the event stream offline to free the empty space in the file, and this process copies the old file into a new one. Mostly this background operation would be quick and transparent, but on heavily loaded servers the disk space and I/O required would sometimes significantly impact performance.

The new version 2.4, when running on a capable operating system (Windows 8.1+ or Server 2012 R2+), now takes advantage of ESENT’s sparse file support to perform compaction in real time, spreading out the load and avoiding additional disk usage and I/O spikes.

You can download the new version here on the Seq website.

Happy logging! :-)

Aggregate Queries in Seq Part 1: Goals


To add a bit of variety to the format of this blog, I’ve decided to try diarising a month of programming – November 2015 to be exact, if you’re reading this in the future!

This month I’ve got some steep goals to face: I want to ship a preview of Seq’s next major feature – aggregate queries – by the end of the month. I’m not starting from scratch, but pulling together the current progress into a complete feature is still a lot of work and there are many design decisions yet to make. I don’t intend to post an update every day (I’d have no time for actually writing the code ;-)) but hopefully every few days I can get an installment up here.

So, the first of these diary entries: why am I even working on aggregate queries, and what are they, anyway?

Constraint is a wonderful aid to creation – without the month-end deadline breathing down my neck I’d no doubt have more to say about this here. In the interest of making progress, though, it’s quicker and easier to explain by example: the aggregates we’re talking about are count(), distinct(), sum(), min(), max(), mean(), percentile() and some of their lesser-known friends.

Log data is great for answering ad hoc questions about how an app behaves and is used. A big enhancement to Seq’s analytical capabilities today (which otherwise fall back on exporting tabular data to Excel) would be to ask it questions like:

  • Which exception types have occurred today, and how many of each type?
  • Are average transaction processing times improving or degrading?
  • How many items on average do customers check out?

Aggregate queries enable this, and open up all kinds of ways to learn more from the data that’s already collected.

Some of these capabilities overlap with what dedicated metrics can also provide. I am a huge believer in the benefits of measuring and dashboarding anything that moves. Metrics and logs aren’t the same thing though, and the scenarios and usage patterns for each can be startlingly different, from collection right through to storage and processing. Seq can be (and already is) used for very light metrics duties, but in the interest of doing one thing well the immediate goal for aggregation in Seq is to answer ad hoc questions from log data rather than perform heavy-duty timeseries crunching.

Implementing aggregates in Seq means implementing them from the ground up. There’s no SQL database behind the scenes to do the heavy lifting – everything from parsing to planning and executing the queries needs to be done by hand in C#. I’m expecting to learn a lot along the way. It should make for an interesting month – wish me luck! :-)

Aggregate Queries in Seq Part 2: Defining a Syntax


So, before we go any further we’re going to need to pin down a bit more tightly what form aggregate queries will take. There are options, options, options – but hopefully a lot will fall out of how queries are expressed in Seq today.

The Seq filter box accepts predicates – expressions that evaluate to a Boolean in the context of an event.

Environment == "Production"

To express something like a sum, writing:

sum(ItemsOrdered)

…in the filter box seems reasonable, except when predicates get involved again. Combining the two expressions above into something that says “the number of items ordered in the production environment” is not obvious.

To introduce aggregates the filter syntax is going to have to stretch a bit, so that Seq can tell the difference between a simple predicate and a full-fledged query. Here goes:

select sum(ItemsOrdered) where Environment == "Production"

The plan is to reappropriate select and where from SQL to mark out the clauses. SQL-like queries have the strong advantage of being familiar, and loosely align with the rest of the “C#-like” syntax by their analogy to LINQ. Starting queries with the keyword select gives the UI a chance to intelligently determine the type of query being written – staying with just a single input box is an explicit goal.

So what else can a SQL-like syntax offer? Grouping the number of items ordered by the item itself:

select sum(ItemsOrdered)
where Environment == "Production"
group by ItemId

Groupings are bread-and-butter for aggregate queries, so it’s handy that they carry over fairly naturally.

What about from? I don’t think I’m going to go after from at this point. The context of a query in Seq will initially be the events viewed in the UI, filtered down to whatever signals are active. There’s lots of room to extend this down the track a bit, but I think filtering and grouping is enough to bite off for a start.

There’s one last core concept the query syntax needs. Log events are time-dimensioned, and dealing with time requires some up-front attention.

Poking around, time groupings seem to have been grafted onto SQL in a few different ways, in traditional databases, event processing systems and timeseries databases. Approaching this the obvious way by making use of the built-in @Timestamp property attached to Seq events could look like:

select sum(ItemsOrdered)
where Environment == "Production"
group by hour(@Timestamp), ItemId

The awkwardness of this approach isn’t apparent until more exotic requirements show up – grouping by 20 hour blocks, or offsetting queries into another (non-UTC) timezone. I’m also not sure I want to type @Timestamp dozens of times a day.

Instead, I’m exploring the idea of a “time expression” syntax like the one used by InfluxDB, where the size of the interval is specified as a literal like time(10s):

select sum(ItemsOrdered)
where Environment == "Production"
group by time(1h), ItemId

Melding this to the existing expression parser is going to be fun!

Aggregate Queries in Seq Part 3: An Opportunistic Parser


It turns out the parser wasn’t a huge departure from Seq’s existing filter parser. Seq already uses Sprache to parse filter expressions, and Sprache parsers compose very nicely.

After making the current FilterExpressionParser “root” expression public, and defining some new AST nodes like Projection and so-on, things just get bolted together:

static readonly Parser<ExpressionValue> ExpressionValue =
    FilterExpressionParser.Expr.Token().Select(e => new ExpressionValue(e));

static readonly Parser<Projection> Projection =
    from v in AggregateValue.Or(ExpressionValue)
    from l in Label.Optional()
    select new Projection(v, l.GetOrDefault());

Here you can see the way a projection like the count(*) as Total column is constructed from a parser for values, and a parser for optional ‘as’ labels. I had to define a separate parser for some aggregations, like count(*) that aren’t otherwise valid Seq filter syntax, but any existing expression that FilterExpressionParser supports can be used as the value of a projected column.

Heading further up towards the root of the grammar, we get something like:

static readonly Parser<Query> Query =
    from @select in Select
    from @where in Where.XOptional()
    from groupBy in GroupBy.XOptional()
    select new Query(@select, @where.GetOrDefault(), groupBy.GetOrDefault());

The resulting Query parser can take some text input and give back a tree of objects representing the parts of the query. Success!
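
Feeding it input follows the usual Sprache pattern – roughly the snippet below, though in Seq proper this is wrapped up behind a QueryParser class we’ll see in a later post:

var tree = Query.End().Parse("select count(*) where Environment == \"Production\" group by ApplicationName");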

There was one subtle problem here that you can spot by way of the oddly-named XOptional combinator. Sprache on GitHub provides Optional, which works as advertised, but upon failing a match will backtrack and return success regardless of whether a partial parse was possible or not.

This leads to error messages without a lot of information, for example:

select distinct(ExceptionType) group ApplicationName

is missing the ‘by’ required next to ‘group’. Using Optional the parser reports:

Syntax error (col 31): unexpected 'g'.

Hmmm. Not so good – there’s nothing at all wrong with that ‘g’! The problem is that upon failing to parse the ‘by’, Sprache’s Optional returned a zero-length successful parse, so parsing picks back up at that position and fails because there are no more tokens to match.

The ‘X’ in XOptional is for eXclusive, meaning that the token is optional, but only if it parses no input whatsoever. As soon as ‘group’ is parsed, the optional branch is considered “taken”, and failures will propagate up. (Sprache ships ‘X’ versions of several parsers already, such as an exclusive Many called XMany.)

Here it is:

public static Parser<IOption<T>> XOptional<T>(this Parser<T> parser)
{
    if (parser == null) throw new ArgumentNullException(nameof(parser));
    return i =>
    {
        var result = parser(i);
        if (result.WasSuccessful)
            return Result.Success(new Some<T>(result.Value), result.Remainder);

        if (result.Remainder.Equals(i))
            return Result.Success(new None<T>(), i);

        return Result.Failure<IOption<T>>(result.Remainder, result.Message, result.Expectations);
    };
}

The divergence from the built-in Optional is that a zero-length successful parse is only returned when the parser has consumed no input at all – the if (result.Remainder.Equals(i)) check.

Using XOptional:

Syntax error (col 38): unexpected 'A', expected keyword 'by'.

Better!

If you haven’t used parser combinators before this whole thing might be a bit surprising – where’s the EBNF? The esoteric command line tools with animal names? It turns out that combinators make parsing into a regular (somewhat imperative) programming task without a lot of mystery surrounding it.

There are some limitations in Sprache’s implementation I’d like to address someday – for example, the error reported on ‘A’ above rather than ‘ApplicationName’ is the result of parsing the raw character stream instead of a tokenised one – but these are minor inconveniences that can be worked around if need be.

If you haven’t looked into combinator-based parsing, there are some great tutorials and examples linked from Sprache’s README. It’s a technique worth adding to your tool belt regardless of the kind of programming you usually do. Little languages are everywhere, waiting to be cracked open!

The most enjoyable and challenging part of any language processing task for me is not so much the parsing though, but taking a tree of syntactic nodes like we have here, and turning it into something executable. That’s coming up next :-)

Aggregate Queries in Seq Part 4: Planning


Seq is a log server designed to collect structured log events from .NET apps. This month I’m working on adding support for aggregate queries and blogging my progress as a diary here. This is the fourth installment – you can catch up on Goals, Syntax, and Parsing in the first three posts.

So, this post and the next are about “planning” and “execution”. We left off a week ago having turned a query like:

select count(*)
where ApplicationName == "Admissions"
group by time(1d), ExceptionType

Into an expression tree:

QueryAST

Pretty as it looks, it’s not obvious how to take a tree structure like this, run it over the event stream, and output a rowset. We’ll break the problem into two parts – creating an execution plan, then running it. The first task is the topic of this post.

Planning

In a relational database, “planning” is the process of taking a query and converting it into internal data structures that describe a series of executable operations to perform against the tables and indexes, eventually producing a result. The hard parts, if you were to implement one, seem mostly to revolve around choosing an optimal plan given heuristics that estimate the cost of each step.

Things are much simpler in Seq’s data model, where there’s just a single stream of events indexed by (timestamp, arrival order), and given the absence of joins in our query language so far. The goal of planning our aggregate queries is pretty much the same, but the target data structures (the “plans”) only need to describe a small set of pre-baked execution strategies. Here they are.

Simple projections

Let’s take just about the simplest query the engine will support:

select MachineName, ThreadId

This query isn’t an aggregation at all: it doesn’t have any aggregate operators in the list of columns, so the whole rowset can be computed by running along the event stream and plucking out the two values from each event. We’ll refer to this as the “simple projection” plan.

A simple projection plan is little more than a filter (in the case that there’s a where clause present) and a list of (expression, label) pairs representing the columns. In Seq this looks much like:

class SimpleProjectionPlan : QueryPlan
{
    public FilterExecutionPlan Filter { get; }
    public ProjectedColumn[] Columns { get; }

    public SimpleProjectionPlan(
        ProjectedColumn[] columns,
        FilterExecutionPlan filter = null)
    {
        if (columns == null) throw new ArgumentNullException(nameof(columns));
        Columns = columns;
        Filter = filter;
    }
}

We won’t concern ourselves much with FilterExecutionPlan right now; it’s shared with Seq’s current filter-based queries and holds things like the range in the event stream to search, a predicate expression, and some information allowing events to be efficiently skipped if the filter specifies any required or excluded event types.

Within the plan, expressions can be stored in their compiled forms. Compilation can’t be done any earlier because of the ambiguity posed by a construct like max(Items): syntactically this could be either an aggregate operator or a scalar function call (like length(Items) would be). Once the planner has decided what the call represents, it can be converted into the right representation. Expression compilation is another piece of the existing Seq filtering infrastructure that can be conveniently reused.

Aggregations

Stepping up the level of complexity one notch:

select distinct(MachineName) group by Environment

Now we’re firmly into aggregation territory. There are two parts to an aggregate query – the aggregates to compute, like distinct(MachineName), and the groupings over which the aggregates are computed, like Environment. If there’s no grouping specified, then a single group containing all events is implied.

class AggregationPlan : QueryPlan
{
    public FilterExecutionPlan Filter { get; }
    public AggregatedColumn[] Columns { get; }
    public GroupingInstruction[] Groupings { get; set; }

    public AggregationPlan(
        AggregatedColumn[] columns,
        GroupingInstruction[] groupings,
        FilterExecutionPlan filter = null)
    {
        if (columns == null) throw new ArgumentNullException(nameof(columns));
        if (groupings == null) throw new ArgumentNullException(nameof(groupings));
        Filter = filter;
        Columns = columns;
        Groupings = groupings;
    }
}

This kind of plan can be implemented (naively perhaps, but that’s fine for a first-cut implementation) by using the groupings to create “buckets” for each group, and in each bucket keeping the intermediate state for the required aggregates until a final result can be produced.

Aggregated columns, in addition to the expression and a label, carry what’s effectively the constructor parameters for creating the bucket to compute the aggregate. This isn’t immediately obvious based on the example of distinct, but given another example the purpose of this becomes clearer:

percentile(Elapsed, 95)

This expression is an aggregation producing the 95th percentile for the Elapsed property. An AggregatedColumn representing this computation has to carry the name of the aggregate ("percentile") and the argument 95.
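
Sketched out, an aggregated column ends up looking something like the class below – the real thing carries more detail (the compiled value expression is omitted here), but the shape is the point:

class AggregatedColumn
{
    // For percentile(Elapsed, 95): AggregateName = "percentile",
    // AggregateArguments = { 95 }, Label = "percentile(Elapsed, 95)"
    public string AggregateName { get; }
    public object[] AggregateArguments { get; }
    public string Label { get; }

    public AggregatedColumn(string aggregateName, object[] aggregateArguments, string label)
    {
        AggregateName = aggregateName;
        AggregateArguments = aggregateArguments;
        Label = label;
    }
}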

Time slicing

Finally, the example we began with:

select count(*)
where ApplicationName == "Admissions"
group by time(1d), ExceptionType

Planning this thing out reveals a subtlety around time slices in queries. You’ll note that the time(1d) group is in the first (dominant) position among the grouped columns. It turns out the kind of plan we need is completely different depending on the position of the time grouping.

In the time-dominant example here, the query first breaks the stream up into time slices, then computes an aggregate on each group. Let’s refer to this as the “time slicing plan”.

class TimeSlicingPlan : QueryPlan
{
    public TimeSpan Interval { get; }
    public AggregationPlan Aggregation { get; }

    public TimeSlicingPlan(
        TimeSpan interval,
        AggregationPlan aggregation)
    {
        if (aggregation == null) throw new ArgumentNullException(nameof(aggregation));
        Interval = interval;
        Aggregation = aggregation;
    }
}

The plan is straightforward – there’s an interval over which the time groupings will be created, and an aggregation plan to run on the result.

The output from this query will be a single time series at one-day resolution, where each element in the series is a rowset containing (exception type, count) pairs for that day.

The alternative formulation, where time is specified last, would produce a completely different result.

select count(*)
where ApplicationName == "Admissions"
group by ExceptionType, time(1d)

The output of this query would be a result set where each element contains an exception type and a timeseries with counts of that exception type each day. We’ll refer to this as the “timeseries plan”.

Both data sets contain the same information, but the first form is more efficient when exploring sparse data, while the second is more efficient for retrieving a limited set of timeseries for graphing or analytics.

To keep things simple (this month!) I’m not going to tackle the timeseries formulation of this query, instead working on the time slicing one because I think this is closer to the way aggregations on time will be used in the initial log exploration scenarios that the feature is targeting.

Putting it all together

So, to recap – what’s the planning component? For our purposes, planning will take the syntax tree of a query and figure out which of the three plans above – simple projection, aggregation, or time slicing – should be used to execute it.

The planner itself is a few hundred lines of fairly uninteresting code; I’ll leave you with one of the tests for it which, like many of the tests for Seq, is heavily data-driven.

[Test]
[TestCase("select MachineName", typeof(SimpleProjectionPlan))]
[TestCase("select max(Elapsed)", typeof(AggregationPlan))]
[TestCase("select MachineName where Elapsed > 10", typeof(SimpleProjectionPlan))]
[TestCase("select StartsWith(MachineName, \"m\")", typeof(SimpleProjectionPlan))]
[TestCase("select max(Elapsed) group by MachineName", typeof(AggregationPlan))]
[TestCase("select percentile(Elapsed, 90) group by MachineName", typeof(AggregationPlan))]
[TestCase("select max(Elapsed) group by MachineName, Environment", typeof(AggregationPlan))]
[TestCase("select distinct(ProcessName) group by MachineName", typeof(AggregationPlan))]
[TestCase("select max(Elapsed) group by time(1s)", typeof(TimeSlicingPlan))]
[TestCase("select max(Elapsed) group by time(1s), MachineName", typeof(TimeSlicingPlan))]
[TestCase("select count(*)", typeof(AggregationPlan))]
public void QueryPlansAreDetermined(string query, Type planType)
{
    var tree = QueryParser.ParseExact(query);

    QueryPlan plan;
    string[] errors;
    Assert.IsTrue(QueryPlanner.Plan(tree, out plan, out errors));
    Assert.IsInstanceOf(planType, plan);
}

Part five will look at what has to be done to turn the plan into a rowset – the last thing to do before the API can be hooked up!

Aggregate Queries in Seq Part 5: Execution


Part 5 was very nearly the stalling point in this blog series. I’ve got enough of the implementation done that I can see the finish line, and I’m eager to get that build out, but to really finish the story I need to fill in this installment. If this post is a little brief, please read it as a “status report” this time around :-)

I’ve also had a bit of time now to revisit decisions made in the earlier stages of building this feature. I had some honest and valuable feedback from Michael Chandler on Twitter regarding the “SQL-like” nature of the syntax.

Upon reflection I think it will be easier to explain how to use aggregate queries if the language simply is SQL, or a dialect thereof, anyway. So, I’ve been back to rework some of the parser and now in addition to the C#-style expression syntax, typical SQL operators such as =, and, or, like, and not as well as single-quoted strings are available:

select count(*)
where Environment = 'production' and not has(@Exception)
group by time(7d), Application

There are still some questions to answer around how much this flows back the other way into typical filter expressions. On the one hand, it’d be nice if the filter syntax and the where clause syntax were identical so that translating between queries and filters is trivial. On the other hand, keeping the languages a bit tighter seems wise. For now, the syntaxes are the same; I’m going to spend some time using the SQL syntax in filters and see how it goes in practice.

Anyway, back to the topic at hand. Now we’re getting somewhere! The aggregate query parser handles the syntax, the planner can produce a query plan, and we need to turn that into a result set.

This post considers three questions:

  • What inputs are fed into the executor?
  • What does the result set look like?
  • How is the result computed?

The first turns out to be predictably easy, given the efforts expended so far to generate a plan, and the existing event storage infrastructure. Keeping things as simple as possible:

static class QueryExecutor
{
    public static QueryResult Execute(
        QueryPlan plan,
        IEventStore store,
        DateTime rangeStartUtc,
        DateTime rangeEndUtc)

Here plan is the output of the last step, store is a high-level interface to Seq’s time-ordered disk/RAM event storage, and the two range parameters the time slice to search. (The implementation as it stands includes a little more detail, but nothing significant.)

A QueryResult lists the column names produced, and either a list of rows, or a list of time slices that each carry a list of rows:

Results

I decided for now to keep the concept of “time slice” or sample separate (time could simply have been another column in the rowset) because it makes for a friendlier API. I’m not sure if this decision will stick, since tabular result sets are ubiquitously popular, but when “series” are added as a first-class concept it is likely they’ll have their own more optimal representation too.
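
As a rough sketch in code (the property names match the ones used by the API module in the next post; the row representation and the error/statistics members omitted here are assumptions):

class QueryResult
{
    public string[] Columns { get; set; }

    // One or the other is populated, depending on whether the
    // query grouped by time()
    public object[][] Rowset { get; set; }
    public TimeSlice[] Slices { get; set; }
}

class TimeSlice
{
    public DateTime SliceStartUtc { get; set; }
    public object[][] Rowset { get; set; }
}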

In between these two things – an input query plan and an output result – magic happens. No, just kidding actually. It’s funny how, once you start implementing something, the magic is stripped away and things that initially seem impenetrably complex are made up of simple components.

The core of the query executor inspects events one by one and feeds the matching ones into a data structure carrying the state of the computation:

AggregationState

First, the group that the event belongs to is determined by calculating each grouping expression and creating a group key. Against this, state is stored for each aggregate column being computed. The subclasses of Aggregation are themselves quite simple, like count():

class CountAggregation : Aggregation
{
    long _count;

    public override void Update(object value)
    {
        if (value == null)
            return;

        ++_count;
    }

    public override object Calculate()
    {
        return (decimal)_count;
    }
}

The value to be aggregated is passed to the Update() method (in the case of count(*) the * evaluates to a non-null constant) and the aggregation adds this to the internal state.

Once all of the events have been processed for a time range, Calculate() is used to determine the final value of the column. It’s not hard to map count(), sum(), min(), max() and so-on to this model.
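
For example, max() fits the same two-method shape – a sketch, assuming the values supplied are comparable:

class MaxAggregation : Aggregation
{
    object _max;

    public override void Update(object value)
    {
        if (value == null)
            return;

        // Assumes comparable values; a real implementation needs to be
        // more careful about mixed and non-comparable types
        if (_max == null || Comparer<object>.Default.Compare(value, _max) > 0)
            _max = value;
    }

    public override object Calculate()
    {
        return _max;
    }
}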

Things are a little trickier for aggregates that themselves produce a rowset, like distinct(), but the basic approach is the same. (Considering all aggregate operators as producing a rowset would also work and produce a more general internal API, but the number of object[] allocations gets a little out of hand.)

Once results have been computed for a time slice, it’s possible to iterate over the groups that were created and output rows in the shape of the QueryResult structure shown earlier.

There’s obviously a lot of room for optimisation, but the goals of a feature spike are to “make it work” ahead of making it work fast, so this is where things will sit while I move on towards the UI.

One more thing is nagging at me here. How do we prevent an over-eager query from swamping the server? Eventually I’d like to be able to reject “silly” queries at the planning stage, but for now it’s got to be the job of the query executor to provide timeouts and cancellation. These are sitting in a Trello card waiting for attention…

In the next post (Part 6!) we’ll look more closely at Seq’s API and finally see some queries in action. Until then, happy logging!

Aggregate Queries in Seq Part 6: Serving Data


Let’s get it out of the way up front – I didn’t manage to fit this series into November. Hrmmmmm… sorry! In the end, I prioritised getting an early preview release out ahead of finishing the blog series documenting the process. The upside is – you can grab it now! The basics of aggregate queries, as they’ll appear in Seq 3, are in preview form on the Seq downloads page.

In what will be the final post in this series for now, I want to show you how the results of running an aggregate query are surfaced in Seq’s HTTP API, and also give a shout-out to some handy tools and techniques it uses along the way.

A syntactic digression…

Before we dive into how Seq’s API is structured, there’s one small tweak to the “SQL-like” query language that I should mention. To avoid endless confusion about what that “-like” really means, the new syntax added in Seq 3 will, as much as possible, be a dialect of SQL. Introducing fewer new things means everyone has less to hold in their head while using Seq, which fits Seq’s goal of getting out of the way while you navigate logs.

The most noticeable change here is that all queries (except trivial scalar ones such as select 41 + 1 as Answer) now have a from clause of the form from stream. The “stream” is Seq’s way of describing “whichever stream of events you’re currently looking at”. Someday other resources will be exposed through the query interface, and the from clause will enable that.

Other changes you’ll spot are support for SQL operators such as and, or, not and even like. Single-quoted strings are supported, as are SQL comparisons = and <> (though a bug in the published build prevents the use of the latter).
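
Putting those pieces together, the example query from the previous post now reads:

select count(*)
from stream
where Environment = 'production' and not has(@Exception)
group by time(7d), Application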

By aligning better with SQL I hope Seq’s querying facilities can remain easy to learn while evolving to support more sophisticated uses.

Resources and Links

Back to the API. Seq is developed API-first, a strategy I picked up from Paul Stovell while working on Octopus Deploy, and something that’s made a huge impact on the way I’ve approached web application development ever since.

Seq also employs some of the ideas associated with hypermedia APIs, notably links and URI templates. (In my experiences so far these are the techniques, along with semantic use of HTTP verbs, that have delivered the most value in application development. Some of the more sophisticated hypermedia concepts like custom media types for standard interchange formats are very interesting but I’m not seeing much use day-to-day. Having said that, wurl is one tool I’ve seen that made me wish we all did use HAL or JSON-API.)

Linking means that the entire Seq API can be navigated from a single root document. At https://your-seq-server/api (here if you’re on localhost) there’s a JSON document with links into all the various resources that the server provides.

API Root

The green box (shout out to Greenshot) shows the resource endpoint where queries are going to be exposed as data sets.

My test Seq instance is configured to listen under the /prd URL prefix, but unless you’ve set this up explicitly you won’t see this part of the path on your own instance.

Data Resources

I’ll admit that there isn’t much of a hypermedia angle in this particular use case – rowsets aren’t obviously entities the way signals, users and events are in Seq’s world-view – but using the same machinery for this as the remainder of the API keeps everything working harmoniously.

The awesome thing from a consumer standpoint though, is the self-documenting nature of the whole thing. Note the "Query" link in the image above. This is a URI template describing how calls to the query endpoint should be formatted.

GET http://your-seq-server/api/data?q=select%20count(*)%20from%20stream&rangeStartUtc=2015-12-08T22:20:22.000Z HTTP/1.1

The signalId segment of the URI is optional, so it doesn’t appear in this request.

In the JavaScript client the call looks like:

api.data.query({ q: 'select count(*) from stream', rangeStartUtc: start }).then(rowset => {
  // rowset is the query result
});

Here, if the client did specify a signalId, the URI template ensures it would be formatted into a URL segment rather than being specified as a query string parameter like the other parameters are, but the client code doesn’t have to be aware of the difference. This makes it nice and easy to refactor and improve the URL structure of the API (even during development) without endlessly poking around in JavaScript string concatenation code.

On the client side, URI templates are handled with Franz Antesberger’s JavaScript implementation. For the most part this means a hash of parameters like the argument passed to query above can be substituted directly into the URI template, and validated for correct naming and so-on along the way.

Serving it up with Nancy

On the server side, NancyFX reigns. Like Octopus Deploy, Seq took a bet on Nancy in its pre-1.0 days and hasn’t looked back. The team behind Nancy is talented, passionate, but above all, highly considerate of its users to the extent that since adopting Nancy in the zero-point-somethings there’s barely been any breakage or churn between versions, let alone bugs. I can’t recommend Nancy highly enough, and consider it the gold standard for building web apps in .NET these days. It looks like I’m not alone in this opinion.

I find Nancy wonderful for building APIs because while it exposes a slightly quirky API of its own (a noble thing!), it isn’t at all opinionated about how you should structure yours. This also means that when dealing with Nancy there are sensible defaults everywhere, but very little deeply built-in policy – so minimal time is spent grappling with the framework itself.

Here’s the route that handles queries:

public class DataModule : SignalContentModule
{
    readonly Lazy<IEventStore> _events;

    public DataModule(Lazy<IEventStore> events)
        : base("data")
    {
        _events = events;

        Post[""] = p => Query(SignalsModule.MapToNewDocument(ReadBody<SignalEntity>()));
        Get["/{signalId}"] = p => Query(LoadAndCheck(p.signalId));
    }

This is a Nancy module that shares some functionality with the events module (though implementation inheritance does feel like a bit of an ugly hack, here as elsewhere). The two different instantiations of the Query URI template we viewed before need two routes in Nancy; the POST version accepts an unsaved signal in the body of the request, which is necessary because signals may be edited in the UI and queries run against them without saving the changes.

There’s one snippet responsible for declaring how the module works:

protected override ResourceGroup DescribeResourceGroup()
{
    var resource = base.DescribeResourceGroup();
    resource.Links.Add("Query", Qualify("{/signalId}{?q,intersectIds,rangeStartUtc,rangeEndUtc}"));
    return resource;
}

(I’ve often thought it would be nice to unify the description with the routing – there’s some duplication between this code and the route configuration above.)

I’ll include for you here the whole implementation of Query() in all its SRP-violating glory (I think the un-refactored version is nicer to read sequentially in a blog post, but as this goes from feature spike to a fully-fledged implementation I see some Ctrl+R Ctrl+M in the very near future):

Response Query(Signal signal = null)
{
    var query = (string)Request.Query.q;
    if (string.IsNullOrWhiteSpace(query))
        return BadRequest("A query parameter 'q' must be supplied.");

    DateTime? rangeStartUtc = TryReadDateTime(Request.Query.rangeStartUtc);
    DateTime rangeEndUtc = TryReadDateTime(Request.Query.rangeEndUtc) ?? DateTime.UtcNow;
    if (rangeStartUtc == null)
        return BadRequest("A from-date parameter 'rangeStartUtc' must be supplied.");

    if (rangeStartUtc.Value >= rangeEndUtc)
        return BadRequest("The queried time span must be of nonzero duration.");

    var filter = GetIntersectedSignalsFilter(signal);

    var result = _events.Value.Query(
        query,
        rangeStartUtc.Value,
        rangeEndUtc,
        filter: filter);

    if (result.HasErrors)
    {
        var response = Json(new QueryResultPart
        {
            Error = "The query could not be executed.",
            Reasons = result.Errors
        });
        response.StatusCode = HttpStatusCode.BadRequest;
        return response;
    }

    var data = new QueryResultPart
    {
        Columns = result.Columns,
        Rows = result.Rowset,
        Slices = result.Slices?.Select(s => new TimeSlicePart
        {
            Time = s.SliceStartUtc.ToIsoString(),
            Rows = s.Rowset                
        }).ToArray(),
        Statistics = new QueryExecutionStatisticsPart
        {
            ElapsedMilliseconds = result.Statistics.Elapsed.TotalMilliseconds,
            MatchingEventCount = result.Statistics.MatchingEventCount,
            ScannedEventCount = result.Statistics.ScannedEventCount,
            UncachedSegmentsScanned = result.Statistics.UncachedSegmentsScanned
        }
    };

    return Json(data);
}

I have come to wonder if anyone out there uses optional = null parameters as a semi-self-documenting “nullable” annotation for parameters like this:

Response Query(Signal signal = null)
{

I picked it up as a habit after Daniel Cazzulino (if my memory serves me correctly) suggested it as a way of marking optional dependencies in Autofac. Using a default value for nullable arguments expresses the intent that a ? annotation would have, had nullability been first-class in the early days of C#.

The block of code all the way through to the initialization of filter slurps up parameters from the query string. Nancy has a binding system that might knock a line or two of code out here.

The real action is in:

var result = _events.Value.Query(
    query,
    rangeStartUtc.Value,
    rangeEndUtc,
    filter: filter);

IEventStore.Query() is a wrapper method where the parsing, planning and execution steps take place.

Finally, the result is mapped back onto some POCO types to send to the caller. No magic here – but then again, I think the theme of these few blog posts has been to cut through the whole implementation in an anti-magical way. The types like QueryResultPart will eventually make their way to the Seq API client.

And that, my friends, is a wrap! It’s been fun sharing the process of spiking this feature. I’d like to continue this series into the Seq UI, but there’s a lot to do in the next few months as Seq 3 comes together into a shippable product. I’m excited about how much aggregate queries will enable as part of that. In the meantime, though, I need to report on the progress that’s been happening in Serilog in preparation for cross-platform CoreCLR support and ASP.NET 5 integration – look out for an update on that here soon.

I’ll leave you with a snapshot of an aggregate query in action.

Screenshot

Download the 3.x preview installer here and check out the documentation here.


How to notify Slack using logs from your .NET app


If your team uses Slack, it’s a great place to centralize notifications that might otherwise end up cluttering your email inbox. Commits, build results, deployments, incoming bug reports – Slack keeps your team informed without making everyone feel overloaded with information, which is why I think I like it so much.

The next step down the road to notification heaven, after setting up integrations for the third party apps and services you use, is to integrate your own apps into Slack.

Doing this directly – posting to the Slack API when interesting things happen – is a bit too much coupling, but if your app writes log events these can be used to trigger notifications in Slack with very little effort.

EventInSlack

So that Slack isn’t flooded with irrelevant events, we’ll forward them to Seq first, where the interesting ones can be picked out of the stream.

1. Write and send the log events

In the demo app below, Serilog is configured to send events both to Seq and the local console.

First up install some prerequisite packages:

Install-Package Serilog
Install-Package Serilog.Sinks.Seq
Install-Package Serilog.Sinks.Literate

We’re using Serilog’s Literate Console sink because it reveals more about the structure of the log events (using colour) than the standard console can.

Here’s the code:

class Program
{
    static void Main()
    {
        Log.Logger = new LoggerConfiguration()
            .WriteTo.Seq("http://localhost:5341")
            .WriteTo.LiterateConsole()
            .CreateLogger();

        Log.Information("Starting up");

        var rng = new Random();
        while (true)
        {
            var amount = rng.Next() % 100 + 1;

            Log.Information("Received an order totalling ${Amount:0.00}", amount);

            Thread.Sleep(1000);
        }
    }
}

This program generates a random integer between 1 and 100 each second, and writes this to the log as a property called Amount. Let’s imagine we’re processing sales on a website, or reading measurements from a sensor – the same approach covers a lot of scenarios.

Console

2. Install Seq

If you don’t have Seq running on your development machine, install it from the Seq downloads page – just click through the installer dialog and choose “Browse Seq” at the end to view these log events.

Seq

3. Choose events to notify on

Here’s a twist: so that we’re not overwhelmed with notifications, we’ll only raise one if the value of the “sale” is $90 or more. To find these events in Seq, we first write the filter Amount >= 90. Once we’re confident the right events are selected, the filter can be saved as a signal.

CreatingSignal

The name of the signal is important since we’ll use it to configure the Slack integration in a moment.

4. Add the Slack integration for Seq

The Slack integration for Seq is developed and maintained on GitHub by David Pfeffer. Seq plug-in apps are published on NuGet – this one has the package id Seq.App.Slack.

To install it into your Seq instance go to Settings, then Apps, and choose Install from NuGet. Enter the package name Seq.App.Slack and install.

Installing

A few seconds later the app should appear in your app list and be ready to configure. To the right of the app name, choose Start new instance….

Installed

5. Configure the WebHook

Give the instance a name like “Big Sales Incoming!” and un-check Only trigger the app manually. The name of the signal we created earlier should now be there in a drop down to select.

AppSetup

The last thing to configure is the WebHook URL setting. The easiest way to get one of these is to open the channel you’re posting to in Slack and choose + Add an app or custom integration. This will take you to a Slack page which, at the time of writing, has just gone through a major overhaul. The current path through the site to add a webhook is:

  1. Choose Build your Own in the top right-hand corner of the page
  2. Under Something just for my team choose Make a Custom Integration
  3. Here’s where you can choose Incoming WebHooks and follow a couple of prompts to get the URL

It’s a bit of an obscure way to do something that’s fairly common – I’m hopeful this will be improved once the redesign settles in :-)

Back to Seq, paste in the URL, Save Changes and you’re good to go! Events will now start appearing in Slack whenever the Amount is $90 or more.

EventInSlack

Happy logging!

2015 in Review


Another year is almost over! I’ve looked back through the posts on this blog from the past year and made some notes in case you’re looking for some holiday reading.

This was a huge year for Serilog. January kicked off with a few tens of thousands on the NuGet package download counter, but closing out the year it’s over 250k and accelerating. With Serilog-style message template support set to land in ASP.NET 5, I think it’s safe to say 2016 is the year we’ll see structured logging hit the mainstream in .NET.

Seq has also seen huge (triple-digit) growth this year, especially since v2 shipped in June. Keeping up has been a challenge! Along with a new major version in the first quarter next year, there’s a lot coming for us in 2016 – stay tuned for some updates after the break.

2015

  • Give your instrumentation some love in 2015! — I started this year aware that the vast majority of .NET teams are still writing plain-text logs, collecting them with Remote Desktop and reading them in Notepad. It feels like this is improving but there’s still a long way to go before we’re all using the current generation of tools effectively.
  • Using Serilog with F# Discriminated Unions — Serilog gained some better F# support this year. (Also on the F# front, Adam Chester’s implementation of Message Templates in F# has opened up some possibilities with that language. Logary 4 also promises some Serilog-style structured goodness for F# users sometime in the coming year.)
  • Tagging log events for effective correlation — Some tips for tracing related paths of execution through your application logs.
  • Diagnostic logging in DNX/ASP.NET 5 — The ASP.NET 5/CoreCLR platform has changed significantly since this first tentative post describing Serilog support went out in May, but the fundamentals are still pretty well summed up here. ASP.NET 5 and CoreCLR are the big focus areas for Serilog’s upcoming 2.0 release, which has been ticking away on GitHub for a few months now. The platform reset going on in .NET right now is going to take some getting used to, but in a few years we’ll be able to thank the current ASP.NET and CoreFX teams, as well as the mass of community contributors, for the continued relevance and growth of .NET. 2016’s going to be a year for us all to rally and show some support for this work.
  • Seq/2 Update, Seq/2 Beta and Announcing Seq 2 RTW — It’s hard to believe Seq 2 has only been out since June. These few posts track the release of Seq 2, which was a complete UI rewrite and major overhaul of Seq v1. (2.1 followed, as did 2.2 and 2.3. Seq is now at version 2.4).
  • Filtering with Signals in Seq 2 — Explains the new filtering system in Seq 2.
  • Contender for .NET’s Prettiest Console? — If you’re not using Serilog’s Literate Console sink, you need to check out this post.
  • Contextual logger injection for Autofac — If you prefer to inject ILogger using your IoC container, this post is for you.
  • Assigning event types to Serilog events — Seq’s “event type” system can be implemented in simple flat-file logs too, for improved searching/filtering.
  • Aggregate Queries in Seq Part 1: Goals — The first of a series of posts documenting a spike through Seq v3’s SQL Query interface. (Parts 2, 3, 4, 5 and 6.)
  • How to notify Slack using logs from your .NET app — Seq + Slack = <3.

Thanks for visiting and commenting this year. Have a happy and safe holiday season, and see you in 2016!

Serilog 2.0 Progress Update


TL;DR: There are now 2.0.0-rc2-final-x Serilog packages on NuGet. These are primarily for the adventurers building apps for .NET Core and ASP.NET Core (previously ASP.NET 5). If you’re not pushing forwards into the next version of .NET just yet, then Serilog 1.5.x is the version you should use.

Blazing a trail, or curious to hear what we’ve been up to? Great! Read on…

Why Serilog v2?

serilog-nuget

.NET is changing – that’s possibly an understatement. The days of the big, bulky, slow-moving, preinstalled framework are drawing to a close, and in its place we’ll soon be building on a pay-for-play framework that is light enough to ship alongside your app. Targeting .NET Core means your app only has to carry with it the packages that it needs.

This spells trouble for the “batteries-included” packaging approach we took in Serilog 1.0. Including a bunch of sinks, enrichers and extensions in the core Serilog package means that the transitive closure of dependencies is large. Most apps only use one or two logging sinks, so shipping every package required by every core sink is a waste. In a year’s time, no one will want to use a logging framework that pulls in multiple megabytes of unnecessary junk.

So, fundamentally, v2 of Serilog is about packaging. Instead of the Serilog package carrying the rolling file and console sinks, for example, those will now live in Serilog.Sinks.RollingFile and Serilog.Sinks.Console respectively. Without going too crazy, other bits and pieces like <appSettings> support will also be moving out.
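To make that concrete, here’s a minimal sketch of what a console-logging app would reference once the split lands, assuming the package and method names follow the plan above.

// Install-Package Serilog
// Install-Package Serilog.Sinks.Console
using Serilog;

class Program
{
    static void Main()
    {
        // Only the sinks the app actually uses are pulled in
        Log.Logger = new LoggerConfiguration()
            .WriteTo.Console()
            .CreateLogger();

        Log.Information("Hello from a slimmer Serilog!");
    }
}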

Since a good package factoring will mean a few breaking changes, we’ll capitalize on the opportunity to make a number of smaller improvements that couldn’t otherwise be included in 1.5.

Who is working on this?

There are dozens of active contributors to the various Serilog sub-projects. I’d love to list them here This Week in Rust-style, but I’m not organized enough to get that right. Everything that’s great about Serilog at this point is the product of a dedicated bunch of contributors, and all of that will be brought across into Serilog v2. We’ve also landed many PRs in the v2 branch that will ship along with the v2 release – thanks everyone who has sent one or more of these!

On v2 specifically though, I have to mention the work done by @khellang on the platform mechanics. .NET Core has been a moving target, and over the course of almost a year we’ve had to adjust to the twists and turns that the new platform has taken while keeping an eye on the eventual end goal. Kristian has helped the project navigate all of this and I don’t think we’d even have a working build right now if not for his guidance and hard work.

The flip-side – maintaining a usable API and an intelligible packaging story – has been in large part the work of @MatthewErbs, who’s landed some of the broadest and most detailed PRs we’ve yet attempted in the last few months. Likewise, v2 would be nowhere without Matthew’s efforts – and bulletproof patience! :-)

What’s done, what’s remaining?

At the time of writing, the v2 Serilog package as well as the previously-built-in sinks work on .NET Core identically to the full .NET Framework. On the packaging side, there’s still some work to go, but things are coming together quickly.

Beyond that set of basic sinks, along with a few extras like the Seq sink and my favourite Literate Console, practically nothing works. If you are working on a .NET Core app and the sinks you prefer aren’t available, helping to update them is a great way to support the project.

We’ve also shipped Serilog.Extensions.Logging, a provider for the logging abstractions used throughout ASP.NET Core. You can read more about it and see some example output in this article on the Seq blog.
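If you want to try it, hooking the provider up is a one-liner in Startup. A rough sketch follows; the hosting APIs were still moving at the time of writing, so check the article above for the current details.

using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.Logging;
using Serilog;

public class Startup
{
    public void Configure(IApplicationBuilder app, ILoggerFactory loggerFactory)
    {
        // Forward events written via Microsoft.Extensions.Logging into the
        // Serilog pipeline configured on Log.Logger
        loggerFactory.AddSerilog();
    }
}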

Remaining is a large amount of cleanup and polish. One task in the midst of this is removal of .NET 4.0 support from the codebase. 4.0 is a respectable platform, but it is a static one. New features and improvements going into Serilog are mostly in support of new platform features and don’t carry as much value for .NET 4 apps. Couple this with the observation that .NET 4 apps aren’t likely to adopt new technology (including Serilog) in significant numbers, and it is hard to justify carrying forward the #if spaghetti to the new version.

This doesn’t mean Serilog is abandoning .NET 4 – only that Serilog 1.5 will remain the recommended version for this platform. Platform-specific dependency support in NuGet also means that projects wishing to actively support .NET 4 with Serilog can still do so by selecting different required Serilog versions per target framework version in project.json.

When will Serilog 2.0 ship?

The core Serilog package will ship in line with .NET Core/ASP.NET Core. There’s no solid date on this yet, but the ASP.NET Roadmap is the location to watch.

I would guess this post raises as many questions as it answers – please ask away! If you’d like to help us cross the last mile (or two), the best way to get involved is to grab an existing issue by leaving a comment, or to raise a new issue for work you’d like to take on, such as porting one of the many remaining sinks.

Serilog Tip – Don’t Serialize Arbitrary Objects


Serilog is built around the idea of structured event data, so complex structures are given first-class support.

var identity = new { Name = "Alice", Email = "alice@example.com" };
Log.Information("Started a new session for {@Identity}", identity);

If you’ve spent some time with Serilog you will have encountered the @ ‘destructuring’ operator. This tells Serilog that instead of calling ToString() on identity, the properties of the object should be serialized and stored in structured form.

{"Identity": {"Name": "Alice", "Email": "alice@example.com"}}

You might also have wondered – since Serilog is built around structured data, why isn’t serialization the default – or, why is @ required at all?

If you consider the process of serialization in general, this starts to make sense. Have you ever tried to convert an entity from EntityFramework to JSON with a general-purpose serializer? A System.Type? Give it a try! If you’re lucky enough to get a result, you’ll probably find it’s a very big blob of text indeed. Some serializers will bail out when circular references are detected; others will chew up RAM indefinitely. Most objects aren’t designed for serialization, so they’re interconnected with many others that themselves link to yet more objects, and so on.

Serializers – Serilog included – are not made for serializing arbitrarily-connected object graphs. Serilog has some safeguards to make sure your application survives a “mass serialization” event, but the effects on the health of your logging pipeline won’t be pretty.

In a logging library there’s a delicate balance to strike between safety and runtime cost, so the best that most Serilog sinks can do is drop or reject events that are too large before trying to ship them elsewhere (the v2 version of the Seq sink now defaults to a 256 KB cap on event bodies).
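The safer pattern is to project just the properties the event needs before applying @. A small sketch, where customer stands in for a large entity loaded elsewhere:

// Risky: hands the serializer the whole entity, navigation properties and all
// Log.Information("Loaded {@Customer}", customer);

// Safer: capture exactly the fields the log event needs
Log.Information("Loaded customer {@Customer}",
    new { customer.Id, customer.Name, customer.Email });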

What’s the TL;DR? When you serialize data as part of a Serilog event, make sure you know exactly what will be included. The @ operator is a powerful but sharp tool that needs to be used with precision.

Remote Level Control in Serilog using Seq


Logging is indispensable when it’s needed, but too much logging can be a drain on performance and resources when everything is going smoothly and you don’t need really fine-grained log data.

Rather than adding more machines and buying fatter pipes, the teams behind the largest apps I’ve worked on have all implemented some form of dynamic level control. In this system, logging usually defaults to something sensible (my preference is Information) and can be turned up or down from a central control panel when bugs need to be shaken out.

A service bus is usually the tool of choice for implementing it: publish a message like new LoggingLevelChange { App = "Ordering", Level = "Debug" } and the participating apps will notice, say “Hey, I do ordering!” and update their logging level accordingly.
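On the receiving side there isn’t much to it. Here’s a sketch: the message type and the bus subscription are placeholders for whatever your system uses, and the switch is assumed to be plugged into the logging pipeline with MinimumLevel.ControlledBy(), in the same way as the Seq example further down.

// Placeholder message type: whatever shape your bus carries
class LoggingLevelChange
{
    public string App { get; set; }
    public string Level { get; set; }
}

// Requires Serilog.Core (LoggingLevelSwitch) and Serilog.Events (LogEventLevel)
static readonly LoggingLevelSwitch AppLevelSwitch =
    new LoggingLevelSwitch(LogEventLevel.Information);

// Wired up as the handler for LoggingLevelChange messages on the bus
static void OnLoggingLevelChange(LoggingLevelChange message)
{
    if (message.App != "Ordering")
        return;

    LogEventLevel newLevel;
    if (Enum.TryParse(message.Level, true, out newLevel))
        AppLevelSwitch.MinimumLevel = newLevel;
}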

There are always a few challenges to overcome before it all works smoothly. Ideally, every app would get this capability, but it’s not always high enough on the backlog to get done early on.

Along with the next Serilog version comes a small change in the Seq sink that I’m really excited about. By sharing a LoggingLevelSwitch between the logging pipeline and the Seq sink, it’s now possible to control the logging levels of connected apps easily and efficiently from the Seq server.

var levelSwitch = new LoggingLevelSwitch();

Log.Logger = new LoggerConfiguration()
    .MinimumLevel.ControlledBy(levelSwitch)
    .WriteTo.LiterateConsole()
    .WriteTo.Seq("http://localhost:5341",
                 apiKey: "yeEZyL3SMcxEKUijBjN",
                 controlLevelSwitch: levelSwitch)
    .CreateLogger();

In this configuration, levelSwitch is passed to the MinimumLevel.ControlledBy() method and the WriteTo.Seq() statement.

Now, whenever a batch of events is sent to Seq, the logging level associated with the corresponding API key will be returned by Seq (or at two-minute intervals, when things are quiet). The Seq sink will apply this to controlLevelSwitch, so the whole logging pipeline will be adjusted up or down.

DynamicLevelControl

Why is this so much better than what was possible before? Seq has had per-API key level control for a while, but this was only used as a filter: log events had to be generated at the highest required level and then discarded for the scheme to work effectively, using more CPU and memory in the client application than was strictly required. Now that the Seq sink can control the level of the logging pipeline, events that aren’t required are not generated in the first place.
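To put it another way, with the configuration above defaulting to Information, a line like the one below (cacheKey is just an illustrative variable) costs almost nothing until the level for that API key is turned up in Seq, at which point the events start flowing without a redeploy.

// Skipped almost for free until Seq lowers the level to Debug or Verbose
Log.Debug("Cache miss for {CacheKey}; fetching from origin", cacheKey);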

Pre-release versions of the Seq sink including this feature work today on .NET 4.5+, but a pre-release Serilog 2.0 version is required as well. Depending on which other Serilog packages you use, this may or may not be possible to try out in your application. Some details of Serilog 2.0 status can be found in this post.

If you’re able to check it out I’d love to hear from you.
