Captures & Expressions

This section describes some high-level concepts used throughout Stream. By the end of it, you will have run some captures that show events moving through the Pipeline, and you will have experimented with different Filter Expressions.

First, let's talk about events. In Cribl Stream, we see the world in terms of events. Events are a rag bag of key-value pairs, which can come in a variety of sizes and shapes.

Schema-Agnostic

Cribl Stream is a schema-agnostic processing system, in that we do not prescribe a particular schema, and can work with events in any shape. The shape of an event is largely dictated by the protocol in which we receive events. Below, you can see examples of a popular SIEM event, Elastic event, syslog event, and Metric event. Each of these has a different set of key-value pairs, depending on the protocol.

[Figures: Data shaped for popular SIEMs; Data shaped for Elastic; Data shaped for syslog; The shape of a metric]

Events can be a log line, a metric measurement, or any arbitrary set of key-value pairs. Events usually have a timestamp. In log systems, events often carry a nested schema in a field like _raw or message. That nested content can be in any of a number of serialization formats, such as JSON, Key=Value, logfmt, CEF, and many more. Stream is agnostic to both the schema of the event and the contents of any field (which may have its own schema).
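As a rough illustration of "any arbitrary set of key-value pairs", the sketch below models three differently shaped events as plain JavaScript objects. Every value is invented, and the metric field names (metric_name, value) are assumptions made for this example rather than a format Stream prescribes.

```javascript
// Hypothetical events; every value here is invented for illustration.
const siemStyleEvent = {
  _raw: 'action=purchase device=iPhone status=200',
  _time: 1700000000,
  index: 'main',
  host: 'web01',
  source: '/var/log/app.log',
  sourcetype: 'app_log',
};

// A log event whose payload is JSON nested inside a "message" field.
const jsonLogEvent = {
  _time: 1700000001,
  message: JSON.stringify({ level: 'info', user: 'alice' }),
};

// A metric measurement: a name, a value, and a dimension (assumed names).
const metricEvent = {
  _time: 1700000002,
  metric_name: 'cpu.usage',
  value: 0.42,
  host: 'web01',
};

// Schema-agnostic handling: list whatever keys each event happens to carry.
function fieldNames(event) {
  return Object.keys(event).sort();
}

// The nested payload has its own schema and is parsed separately.
const nested = JSON.parse(jsonLogEvent.message);

console.log(fieldNames(siemStyleEvent), fieldNames(metricEvent), nested.level);
```

Nothing in fieldNames depends on which fields are present, which is the sense in which the handling is schema-agnostic.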

Captures

You can see this yourself interactively by using a capture in Stream.

Run a Capture
  1. Select the Routing submenu and click Data Routes.
  2. Make sure the right pane's Sample Data tab has focus.
  3. Click Capture Data.
  4. Replace the Filter Expression field's default true entry by pasting in the following expression:
    __inputId.startsWith('datagen')
  5. Click Capture and then Start.

Captures reach into the running Pipeline to bring back events to work with in the UI. This capture brings back events for which the above Filter Expression returns true. It runs for 10 seconds or until it has brought back 10 events, whichever happens first.
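The "10 seconds or 10 events, whichever happens first" rule can be sketched as a small loop with two stop conditions. This is only a model of the behavior described above, not Stream's implementation; the capture function, its option names, and the injectable clock are all hypothetical.

```javascript
// Hypothetical helper: collect matching events until either limit is hit.
// "now" is injectable so the time limit can be exercised without waiting.
function capture(events, filter, { maxEvents = 10, maxMs = 10000, now = Date.now } = {}) {
  const deadline = now() + maxMs;
  const captured = [];
  for (const event of events) {
    // Stop as soon as the time budget or the event budget is used up.
    if (now() >= deadline || captured.length >= maxEvents) break;
    if (filter(event)) captured.push(event);
  }
  return captured;
}

// 25 invented events, all arriving from a datagen input.
const stream = Array.from({ length: 25 }, (_, i) => ({
  __inputId: 'datagen:example',
  _raw: `event ${i}`,
}));

const sample = capture(stream, (e) => e.__inputId.startsWith('datagen'));
console.log(sample.length);
```

With the default limits and a fast in-memory source, the event limit is reached long before the time limit.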

__inputId is an internal field containing the ID of the input that delivered the data. .startsWith('datagen') uses a JavaScript string function, startsWith, to find input IDs that start with datagen. So the capture has returned 10 events that came in through an input whose ID starts with datagen, and we can see exactly what those events look like. These have a popular SIEM schema, with fields like _raw, _time, index, host, source, and sourcetype.
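To see what the expression does outside the UI, here is a minimal sketch that applies the same startsWith test to a few hand-written events. The input IDs and _raw values are invented; only the __inputId field name and the datagen prefix come from the steps above.

```javascript
// Hypothetical events; the text after each input type is invented.
const events = [
  { __inputId: 'datagen:weblogs', _raw: 'GET /index.html' },
  { __inputId: 'syslog:in_syslog', _raw: '<13>Jan 1 host app: hello' },
  { __inputId: 'datagen:metrics', _raw: 'cpu=0.42' },
];

// The capture's filter, written as a predicate over a single event.
const fromDatagen = (event) => event.__inputId.startsWith('datagen');

// Only events whose input ID begins with "datagen" pass the filter.
const captured = events.filter(fromDatagen);

console.log(captured.map((e) => e.__inputId));
```

startsWith is case-sensitive and matches only at the beginning of the string, so an ID that merely contains datagen somewhere in the middle would not pass.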

Filter Expressions

Filter expressions are used throughout the product. They are single-line JavaScript expressions with full access to the contents of the event. Fields starting with __ are internal fields, and contain information Cribl is maintaining about the event.
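One way to picture "a single-line JavaScript expression with full access to the event" is a function whose variables are the event's fields. The sketch below is an assumption-laden model for experimenting locally, not how Stream evaluates expressions internally; compileFilter is a hypothetical helper.

```javascript
// Sketch only: NOT how Cribl Stream evaluates expressions internally.
// Turns a single-line expression into a predicate whose free variables
// are the event's field names. Assumes every field name is a valid
// JavaScript identifier and that the expression text is trusted.
function compileFilter(expression) {
  return (event) => {
    const names = Object.keys(event);
    const values = names.map((name) => event[name]);
    const fn = new Function(...names, `return (${expression});`);
    return Boolean(fn(...values));
  };
}

// Invented event for illustration.
const sampleEvent = {
  __inputId: 'datagen:weblogs',
  _raw: 'device=iPhone',
  host: 'web01',
};

const isDatagen = compileFilter("__inputId.startsWith('datagen')");
const isAndroid = compileFilter('_raw.includes("Android")');
const always = compileFilter('true');

console.log(isDatagen(sampleEvent), isAndroid(sampleEvent), always(sampleEvent));
```

The default filter true passes every event, which is why replacing it with a narrower expression is the first step in the capture above.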

Show Internal Fields
  1. In the Capture window, click the ... (Advanced Settings) menu.
  2. Toggle Show Internal Fields on to see all fields.
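The toggle above can be thought of as a filter on field names. A minimal sketch, assuming only the rule stated earlier (internal fields start with two underscores); visibleFields and the sample event are hypothetical.

```javascript
// Internal fields are the ones whose names start with two underscores.
const isInternal = (name) => name.startsWith('__');

// Return a copy of the event, optionally without its internal fields.
function visibleFields(event, showInternal = false) {
  return Object.fromEntries(
    Object.entries(event).filter(([name]) => showInternal || !isInternal(name)),
  );
}

// Invented event for illustration.
const capturedEvent = {
  __inputId: 'datagen:weblogs',
  _raw: 'device=iPhone',
  _time: 1700000000,
};

console.log(Object.keys(visibleFields(capturedEvent)));
console.log(Object.keys(visibleFields(capturedEvent, true)));
```

Note that _raw and _time begin with a single underscore, so they are ordinary fields and stay visible either way.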

Everything you see in the event is available as a field in an expression. If you wanted to find events from this dataset that contain iPhone, you could run a regular expression against the _raw field.

Run Regex as Filter Expression
  1. Clear the Filter Expression field, and try typing in the expression below rather than pasting:
    _raw.match(/iPhone/)
  2. As you're typing, typeahead shows you interactively what options are available.
  3. Click Capture and click Start.
    Depending on the data that happens to arrive, you might have to run this capture more than once to get results. You might also have to click a _raw field's Show more link to see the iPhone strings.
  4. Click Cancel when you're done.

Now we see only events that contain the text iPhone.
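The regex filter works because match returns an array when the pattern is found and null when it is not, and a filter only needs a truthy or falsy result. The sketch below shows this on two invented events, alongside the equivalent test call.

```javascript
// Invented events for illustration.
const rawEvents = [
  { _raw: 'device=iPhone os=iOS' },
  { _raw: 'device=Pixel os=Android' },
];

// match() returns an array on success and null on failure; a filter only
// cares whether the result is truthy.
const viaMatch = rawEvents.filter((e) => e._raw.match(/iPhone/));

// test() returns a plain boolean and expresses the same condition.
const viaTest = rawEvents.filter((e) => /iPhone/.test(e._raw));

// The pattern is case-sensitive; the i flag makes it match "iPhone" too.
const caseInsensitive = /iphone/i.test('device=iPhone');

console.log(viaMatch.length, viaTest.length, caseInsensitive);
```

Either form should behave the same as a filter; test simply makes the boolean intent explicit.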

Next, we're going to look into two core concepts of Cribl: Data Routes and Pipelines.