The Pipeline

Now we're ready to start using a few of Cribl Stream's built-in Functions to enrich this data.

A Quick Tour of Pipelines

First, let's show you around the Pipelines interface. Each Pipeline is a collection of Functions that you can associate with a specified Route, Source, or Destination.

important
  1. From the last section, you should already have Manage > Processing > Pipelines selected, and Preview Simple selected in the right pane. In this sandbox environment (only), the left Pipelines pane might look a little cluttered.
  2. Drag pane and column selectors to the right, as needed, to reveal the Pipeline column's contents in the left pane. Your interface should now look something like this (click to enlarge):

Pipelines

The Pipelines page lists all available Pipelines. You can search them or filter them based on their associations. Processing Pipelines are associated with Data Routes; pre-processing Pipelines are associated with Sources; and post-processing Pipelines are associated with Destinations. (Those associations are configured in the Data Routes, Sources, and Destinations interfaces, respectively.) You can also use the Show drop-down to filter by status: All, In Use, or Not In Use.

Preview

The right pane's Preview interface enables you to manipulate sample data. You can use this interface to see the results of the changes you make to a Pipeline before you commit those changes to live streams of data.

On the right pane's Sample Data tab, we've already used the Capture Data button. In other parts of this tutorial, we will experiment with the Paste [clipboard data] and Attach [sample file] options. For now, we will work with the sample we captured in the last step.

important
  1. If the right pane's Sample Data tab doesn't have focus, click it.
  2. Click the Simple link next to the apache_common.log file.
  3. Click any Pipeline in the left pane.
  4. To see how the selected Pipeline affects sample data, you can toggle between the IN and OUT buttons at the top of the Preview pane's Preview Simple tab. But we're about to make the OUT view more interesting by creating a new Pipeline and populating it with some Functions to refine our data.

Create a New Pipeline

Now that you're familiar with the interface, let's create our first Pipeline. For this Pipeline, we will add three Functions:

  • The Regex Extract Function, to extract the status code (and some other fields) from each event.
  • The Lookup Function, to associate that status code with the correct description and type.
  • The Sample Function, to sample a subset of events of type Successful, to reduce overall data volume.
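To make the three Functions' combined effect concrete before we build the Pipeline, here is a rough Python sketch of the same extract → lookup → sample flow. Everything here is illustrative: the field names, the status-code table, and the 1-in-N sampling of Successful events are stand-ins for what the Cribl Functions do, not Cribl's actual implementation. (Note that Python named groups use `(?P<name>)` where Cribl's JavaScript regex engine uses `(?<name>)`.)

```python
import re

# Simplified extraction regex (illustrative; the tutorial's full regex
# captures more fields, including several internal __ fields).
EXTRACT = re.compile(
    r'^(?P<clientIP>[\d.]+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\w+) (?P<uri>[^?]+)" (?P<status>\d{3}) (?P<bytes>\d+)'
)

# Lookup stand-in: map each status code to a description and a type.
STATUS_LOOKUP = {
    "200": ("OK", "Successful"),
    "404": ("Not Found", "Client Error"),
    "500": ("Internal Server Error", "Server Error"),
}

def process(events, sample_rate=10):
    """Extract fields, enrich via lookup, then keep 1-in-N Successful events."""
    kept = []
    successful_seen = 0
    for raw in events:
        m = EXTRACT.match(raw)
        if not m:
            continue  # unparseable event; a real Pipeline would pass it through
        event = m.groupdict()
        desc, kind = STATUS_LOOKUP.get(event["status"], ("Unknown", "Unknown"))
        event["status_description"], event["status_type"] = desc, kind
        if kind == "Successful":
            successful_seen += 1
            if successful_seen % sample_rate != 1:
                continue  # drop sampled-out Successful events
        kept.append(event)
    return kept
```

With twelve 200-status events and one 404, a `sample_rate` of 10 keeps two of the Successful events plus the error, so overall volume drops while every non-Successful event survives intact.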
important
  1. If the Manage > Processing > Pipelines page doesn't have focus, select the Processing submenu and click Pipelines.
  2. Click Add Pipeline in the left pane's top-right corner, and then click Create Pipeline.
  3. In the ID field, enter access_common_lookup.
  4. Click Save. You've created an empty Pipeline, which we'll now populate with Functions.
  5. Click Add Function at the upper right, click the Standard submenu, and then click to select Regex Extract. (You can also select a Function by typing the first few letters of its name into the drop-down's Search box.) Once you have selected Regex Extract, it will appear in your Pipeline.
  6. In the Regex Extract Function, paste the following string into the Regex field:
    ^(?<__clientIP>[\d.]+) (?<__ident>\S+) (?<__httpBasicUser>\S+) (?<__timestamp>\[[^\]]+\]) "(?<method>\w+) (?<uri>[^?]+)\" (?<status>\d{3}) (?<bytes>\d+)
  7. If Source Field does not already contain _raw, type that in. Your interface should now look like this (click to enlarge):
  8. Click Save.
  9. Confirm these changes in the Preview by toggling between IN and OUT. When viewing OUT, you should see the new fields that have been created by the Regex Extract Function highlighted in green.
    You can also toggle between the Event and Table buttons to confirm (for example) that the status field has been extracted properly in every event.
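If you want to sanity-check the extraction outside the sandbox, the Pipeline's regex can be run locally. The sketch below converts the JavaScript-style `(?<name>)` groups to Python's `(?P<name>)` syntax; the sample line is a standard Apache common-log example, not one from your captured data. (In Cribl Stream, fields whose names begin with `__` are internal and are not passed downstream.)

```python
import re

# The tutorial's Regex Extract pattern, with named groups converted
# from JavaScript (?<name>) syntax to Python's (?P<name>) syntax.
pattern = re.compile(
    r'^(?P<__clientIP>[\d.]+) (?P<__ident>\S+) (?P<__httpBasicUser>\S+) '
    r'(?P<__timestamp>\[[^\]]+\]) "(?P<method>\w+) (?P<uri>[^?]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

# A standard Apache common-log example line (illustrative).
line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')

m = pattern.match(line)
print(m.group("status"))      # → 200
print(m.group("__clientIP"))  # → 127.0.0.1
```

Note that because `[^?]+` is greedy and the log line has no query string, the `uri` group here captures the path and the HTTP version together; the sandbox's Preview pane is the authoritative place to confirm how each field lands.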

Now that we've reviewed the Pipelines interface and started our first Pipeline, let's enrich this data with a lookup.