
Regex Lookup

Now that we have a sample to work with, we can configure the lookup.

Add the Lookup Table

  1. With Manage active in Stream's top nav, select the Processing submenu and click Knowledge.

    Note: Depending on the size of your window, the top nav consolidates items that don't fit into a pulldown menu, represented by an ellipsis (...). If so, click the ellipsis, then select Processing and click Knowledge.

  2. If Lookups is not already selected in the left sidebar, click to select it.

  3. At the upper right, click Add Lookup File, then select Create with Text Editor.

  4. Copy this text to the clipboard:

    regex,sourcetype
    "^[^,]+,[^,]+,[^,]+,THREAT",firewall_threat
    "^[^,]+,[^,]+,[^,]+,TRAFFIC",firewall_traffic
  5. Paste the clipboard contents into the large text field.

  6. Type or paste firewall_sourcetypes.csv into the Filename field. Your interface should now look like this: [screenshot: Pipelines]

    Notice that the first column of comma-separated values contains regular expressions. As we configure this lookup, those regular expressions will be matched against the data in the _raw field.

  7. Click Save.
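
To see how the regular expressions in the first column behave, here is a minimal Python sketch, run outside Stream, that parses the same CSV text and tests each pattern against a few lines. The sample lines are hypothetical: they assume only that the fourth comma-separated field carries the log type, and the real firewall.log events may look different.

```python
import csv
import io
import re

# The lookup table text from step 4, verbatim.
LOOKUP_CSV = '''regex,sourcetype
"^[^,]+,[^,]+,[^,]+,THREAT",firewall_threat
"^[^,]+,[^,]+,[^,]+,TRAFFIC",firewall_traffic
'''

def load_lookup(text):
    """Parse the CSV into a list of (compiled_regex, sourcetype) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(re.compile(row["regex"]), row["sourcetype"]) for row in reader]

# Hypothetical events: three comma-free fields, then the log type.
SAMPLE_EVENTS = [
    "Jan 01 00:00:00,fw01,1,THREAT,url,example",
    "Jan 01 00:00:01,fw01,1,TRAFFIC,end,example",
    "Jan 01 00:00:02,fw01,1,SYSTEM,general,example",
]

lookup = load_lookup(LOOKUP_CSV)
for raw in SAMPLE_EVENTS:
    # Collect the sourcetype of every row whose regex matches this line.
    matches = [sourcetype for rx, sourcetype in lookup if rx.search(raw)]
    print(raw.split(",")[3], "->", matches)
```

The quotes around each pattern matter: the patterns themselves contain commas, so without quoting, the CSV parser would split them into separate columns.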

Create a New Pipeline

Now that the Lookup table is created, we can create a new Pipeline to apply it to our data.

  1. Select the Processing submenu and click Pipelines.
  2. Click Add Pipeline and then Create Pipeline.
  3. In the ID field, enter firewall_typing.
  4. Click Save. You now have another new, empty Pipeline.

Configure the Lookup

Next, we're ready to add and configure the Lookup Function.

  1. In Stream's right pane, make sure the Sample Data tab has focus.
  2. As you did with previous samples, click Simple next to the firewall.log sample. Your interface should look like this: [screenshot: Pipelines]
  3. To add the Lookup Function: In the left pane, click Add Function, then Standard, then Lookup.
  4. Configure the Function to match this screenshot: [screenshot: Pipelines]
  5. Click Save.
  6. In the right Preview pane, toggle between IN and OUT. When selecting OUT, you should see the lookup values added to each event.

Note that:

  • We've changed Match Mode from Exact to Regex.
  • We've kept Match Type at the default setting of First Match. This is especially important with regex matching because, unlike with exact matching, the same event could match multiple regular expressions in a given lookup. With First Match enabled, Stream associates each event with the first lookup row whose regular expression matches it.
  • By default, all output fields will be associated with a matching event. As a best practice, however, we recommend that you specify output fields, in case the underlying Lookup table is modified.
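
The First Match behavior, and the value of naming output fields explicitly, can be sketched in plain Python. This is an illustration under stated assumptions, not Stream's implementation: the function name `regex_lookup`, the extra `owner` column, and the broad catch-all row are all hypothetical, added to show why row order and an explicit output list matter.

```python
import re

# Hypothetical lookup rows, in file order. The last row is a deliberately
# broad pattern that overlaps with the two specific ones.
ROWS = [
    {"regex": r"^[^,]+,[^,]+,[^,]+,THREAT", "sourcetype": "firewall_threat", "owner": "secops"},
    {"regex": r"^[^,]+,[^,]+,[^,]+,TRAFFIC", "sourcetype": "firewall_traffic", "owner": "netops"},
    {"regex": r"^[^,]+,", "sourcetype": "firewall_other", "owner": "netops"},
]

def regex_lookup(event, rows, output_fields=None):
    """Return a copy of event enriched from the first row whose regex matches _raw.

    If output_fields is None, every non-regex column is copied;
    otherwise only the named columns are copied.
    """
    enriched = dict(event)
    for row in rows:
        if re.search(row["regex"], event["_raw"]):
            for key, value in row.items():
                if key == "regex":
                    continue
                if output_fields is None or key in output_fields:
                    enriched[key] = value
            break  # first match wins; later rows are not consulted
    return enriched

event = {"_raw": "Jan 01 00:00:00,fw01,1,THREAT,url,example"}

# All columns copied: the event gains both sourcetype and owner.
print(regex_lookup(event, ROWS))
# Explicit output list: only sourcetype is copied.
print(regex_lookup(event, ROWS, output_fields=["sourcetype"]))
```

The sample event matches both the THREAT row and the catch-all row, but only the THREAT row is applied because it comes first. With the explicit output list, a column added to the table later (here, `owner`) does not silently appear on events.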

Using Eval to Modify Destination Indexes

As an added bonus, for popular SIEM destinations, Cribl Stream makes it easy to modify other critical fields like host and index. In this example, we will use the Eval Function to modify each event's index, based on its new sourcetype.

In some SIEM environments, modifying a destination index would require rolling restarts on affected nodes. But if Cribl Stream is part of the data architecture, index modification can happen instantly, with no restart.

To add the Eval Function:

  1. In the same firewall_typing Pipeline, click Add Function, then Standard, then Eval.
  2. Scroll down to click into your new Eval Function.
  3. Under Evaluate Fields, click + Add Field.
  4. Configure the Evaluate Fields key-value pair to match this screenshot: [screenshot: Pipelines]
  5. Click Save.
  6. In the right Preview pane, toggle between IN and OUT. When selecting OUT, you should see each event's newly added index value, corresponding to its sourcetype.
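
Conceptually, the Eval step derives index from the sourcetype that the Lookup added. A minimal Python sketch of that idea follows; it is not the expression from the screenshot. The index names (`threat`, `traffic`, `main`) and the function name `assign_index` are assumptions for illustration, and in Stream itself the Eval Function's value is written as a JavaScript expression rather than Python.

```python
# Hypothetical sourcetype-to-index mapping; substitute your own index names.
INDEX_BY_SOURCETYPE = {
    "firewall_threat": "threat",
    "firewall_traffic": "traffic",
}
DEFAULT_INDEX = "main"  # assumed fallback for events the lookup did not type

def assign_index(event):
    """Return a copy of the event with index set from its sourcetype."""
    updated = dict(event)
    updated["index"] = INDEX_BY_SOURCETYPE.get(event.get("sourcetype"), DEFAULT_INDEX)
    return updated

events = [
    {"_raw": "...THREAT...", "sourcetype": "firewall_threat"},
    {"_raw": "...TRAFFIC...", "sourcetype": "firewall_traffic"},
    {"_raw": "unmatched event"},  # no sourcetype was added by the lookup
]
for e in events:
    print(e.get("sourcetype"), "->", assign_index(e)["index"])
```

Keeping a fallback index means events that no lookup row matched still land somewhere predictable instead of carrying an undefined value downstream.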

You have now configured variable, intermingled data to sort itself into the proper sourcetype and index for downstream use.