
Aggregating the Data

To create our metrics, we're going to create a new Pipeline, add a couple of Functions to it, and then – in our collector – replace the passthru Pipeline reference with the new Pipeline.

important
  1. Click the Cribl upper tab.
  2. In Stream's top menu, with Manage active, select Processing and click Pipelines.
  3. Click Add Pipeline and select Create Pipeline from the resulting drop-down.
  4. In the ID field, enter firewall_metrics, and click Save.

This should create the Pipeline and open the editing page for it.

[Screenshot: Empty Pipeline]

Our Kibana Dashboard expects aggregated data, grouped by source IP address and destination IP address. We need to do a few things to get our data into proper shape to fill out that Dashboard.

important
  1. In the right pane, click the Simple link to the right of the sample file we saved during preview (collected-events.log).
  2. Once the events are displayed in the right Preview pane, click the OUT button so that you'll see the Pipeline's transformations of the data.
  3. Click Add Function and select Standard > Numerify (or just type Numerify into the search box), and then click Save.

At this point, you should see a number of fields in the sample on the right change color. The Numerify Function looks through each event and converts any values that contain only numeric data to numbers. This will allow us to run aggregations on those fields.

[Screenshot: Numerify Output]
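To make the conversion concrete, here is a minimal Python sketch of the idea behind Numerify. It is an illustration only, not Cribl's implementation: the `numerify` helper, the regular expression, and the sample event (including the `action` field and all values) are assumptions made up for this example.

```python
import re

# Matches strings made up only of an optional sign, digits, and an optional
# decimal part. An IP address such as "10.0.0.1" does not match.
NUMERIC = re.compile(r"-?\d+(\.\d+)?")

def numerify(event):
    """Return a copy of the event with numeric-looking strings turned into numbers."""
    converted = {}
    for key, value in event.items():
        if isinstance(value, str) and NUMERIC.fullmatch(value):
            converted[key] = float(value) if "." in value else int(value)
        else:
            converted[key] = value
    return converted

# Hypothetical firewall event; before conversion, every value is a string.
sample = {"src_ip": "10.0.0.1", "dest_ip": "192.168.1.5", "bytes": "1024", "action": "allowed"}
converted = numerify(sample)
print(converted)
```

In this sketch, `bytes` becomes a number that can be summed later, while the IP addresses stay as strings because they contain more than one dot.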

Next, we'll add another Function to do the aggregations. Our Dashboard has two values (count of sessions and total data transferred), grouped by two fields (src_ip and dest_ip).

important
  1. Click Add Function and select Aggregations from the Standard menu.
  2. In the Aggregates field, enter count().
  3. In Group by Fields, enter both src_ip and dest_ip. (Press your Enter or Return key between these two field names.)
  4. Click Save.

Now, most of the sample data on the right will be struck out. But if you scroll down to about record 101, you'll see a new structure being created.

[Screenshot: Count Results]

The original records are effectively being dropped after aggregations are run, and the only records that will now make it out of the Pipeline will be the new aggregated ones.
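Conceptually, this step behaves like a group-by count that emits one new record per (src_ip, dest_ip) pair and discards the originals. The Python sketch below shows that idea under stated assumptions: the sample events are invented, the output field name `count` is an assumption, and the sketch ignores any time windowing the real Function applies.

```python
from collections import Counter

# Hypothetical events, already numerified.
events = [
    {"src_ip": "10.0.0.1", "dest_ip": "192.168.1.5", "bytes": 500},
    {"src_ip": "10.0.0.1", "dest_ip": "192.168.1.5", "bytes": 300},
    {"src_ip": "10.0.0.2", "dest_ip": "192.168.1.9", "bytes": 1200},
]

def count_by_ip_pair(events):
    """Emit one aggregated record per (src_ip, dest_ip) pair; originals are not returned."""
    counts = Counter((e["src_ip"], e["dest_ip"]) for e in events)
    return [
        {"src_ip": src, "dest_ip": dest, "count": n}
        for (src, dest), n in counts.items()
    ]

aggregated = count_by_ip_pair(events)
for record in aggregated:
    print(record)
```

Only the aggregated records come out of the function, which mirrors how the original events no longer leave the Pipeline.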

Because this data has high cardinality (many distinct combinations of src_ip and dest_ip), aggregating it does not reduce the number of records. But if you click the Pipeline diagnostics icon (next to Select Fields, toward the top of the right pane), you'll see that replacing the original data with aggregations reduces the amount of data you'll be sending to the destination by somewhere around 85%.

[Screenshot: Basic Stats Results]

However, we still need another metric aggregated: the sum of the bytes.

important
  1. In the Aggregations Function's Aggregates section, click Add Aggregate.
  2. In the new row that appears, enter sum(bytes) and click Save.

Looking at the sample results again, you should now see a bytes_sum field added to all of the aggregated records. We're almost ready to start feeding our data to our Dashboard. But first, we need to set the sourcetype to what Kibana is expecting: firewall_metrics.

important
  1. In the Aggregations Function, this time click Add Field in the Evaluate Fields section.
  2. In the Name field, enter sourcetype.
  3. In the Value Expression field, enter 'firewall_metrics'. (Include the single quotes, so that the value is treated as a literal string.)
  4. Click Save.

[Screenshot: Aggregate Function Config]

You should now also see the new sourcetype field show up in the aggregated events at right.
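Putting the pieces together, the finished Aggregations Function can be pictured as the following Python sketch. It is a simplified illustration, not Cribl's implementation: the `aggregate_firewall_events` helper and the sample events are hypothetical, the `count` field name is an assumption, and time windowing is ignored.

```python
from collections import defaultdict

def aggregate_firewall_events(events, sourcetype="firewall_metrics"):
    """Group by (src_ip, dest_ip), compute count and bytes_sum, then stamp a sourcetype."""
    totals = defaultdict(lambda: {"count": 0, "bytes_sum": 0})
    for event in events:
        group = totals[(event["src_ip"], event["dest_ip"])]
        group["count"] += 1                 # count()
        group["bytes_sum"] += event["bytes"]  # sum(bytes)
    return [
        # The constant sourcetype plays the role of the Evaluate Fields entry.
        {"src_ip": src, "dest_ip": dest, **stats, "sourcetype": sourcetype}
        for (src, dest), stats in totals.items()
    ]

# Hypothetical events, already numerified.
events = [
    {"src_ip": "10.0.0.1", "dest_ip": "192.168.1.5", "bytes": 500},
    {"src_ip": "10.0.0.1", "dest_ip": "192.168.1.5", "bytes": 300},
    {"src_ip": "10.0.0.2", "dest_ip": "192.168.1.9", "bytes": 1200},
]

metrics = aggregate_firewall_events(events)
for record in metrics:
    print(record)
```

Each output record carries the two group-by fields, the two aggregates, and the fixed sourcetype, which is the shape the walkthrough says the Dashboard expects.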