Pipelines: Sprinkle Lookups with Regex Magic
We’ve seen examples of using the magical powers of regex to customize Functions, extract fields, and filter events in real time. In this section, we’ll show you how to sprinkle your Lookups with regex magic. Let's walk through a Pipeline that demonstrates four different ways to leverage regular expressions in Cribl Stream.
Step 1: Extract the data with regex
When organizations use host-naming standards, it's easy to understand things like regions, Availability Zones (AZs), IP addresses, and more. For example, consider an Amazon host called:
ec2-35-162-133-145.us-west1-a.compute.amazonaws.com
This is an EC2 host with a (dashed) IP address 35-162-133-145, in the us-west1 region, in Availability Zone a. You can also see the domain:
compute.amazonaws.com.
While we can understand the enriched host names, we don't know which indexes to route the data to, nor which sourcetypes to assign to the events, without looking up this information from another source. Doing so is often a huge challenge for organizations. To solve this challenge, let's combine the Regex Extract, Lookup, and Eval Functions with some sample events to demonstrate the power of Cribl Stream.
- In the Stream UI's top nav, make sure Manage is active.
- From the submenu, select Processing > Pipelines.
- On the Pipelines page, find and click the setting_index_by_region_availability_zone Pipeline. (To display this first column's header and contents, you might need to drag the pane and column dividers toward the right.)
- In the Pipeline's right pane, make sure Sample Data is selected.
- Click the Simple link at the lower right beside the lookupsample.log file.
- At the end of any event's _raw field, click the Show more link to view all the fields in the event.
- Click Add Function near the top of the left pane, and either find Regex Extract in the Standard submenu, or type Regex Extract into the search box to locate it. Then click it to add this Function to the Pipeline.
- Leave Filter at its default true value.
- Enter a simple Description for the Function.
- In the Regex field, paste the following:
GMT:\s+(?<host>[^.]+)\.(?<region>\w+-\w+\d+)-(?<az>[^.]+)\.(?<domain>[^:]+)
- Leave Source field at its default _raw value.
- Click Save.
- Click the OUT button near the top of the right Preview pane to see the transformation of the data. The extracted fields az, domain, host, and region now appear below the _raw event. You can use these extracted fields for searching in your preferred search solution.
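To see what the Regex Extract pattern captures, here is a minimal sketch that runs the same regular expression in plain JavaScript, outside Cribl Stream. The sample line and the extractFields helper are assumptions made for illustration; the actual events in lookupsample.log may be shaped differently.

```javascript
// Hypothetical sample line: only the "GMT: <hostname>:" shape is implied by the regex.
const raw =
  'Mon, 07 Jan 2019 10:15:00 GMT: ec2-35-162-133-145.us-west1-a.compute.amazonaws.com: boot finished';

// The same pattern pasted into the Regex Extract Function above.
const pattern =
  /GMT:\s+(?<host>[^.]+)\.(?<region>\w+-\w+\d+)-(?<az>[^.]+)\.(?<domain>[^:]+)/;

// Illustrative helper (not a Cribl API): returns the named groups, or {} if nothing matches.
function extractFields(line) {
  const match = pattern.exec(line);
  return match ? { ...match.groups } : {};
}

const extracted = extractFields(raw);
console.log(extracted);
```

With this sample line, the named groups come out as host ec2-35-162-133-145, region us-west1, az a, and domain compute.amazonaws.com.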
Step 2: Assign an index and sourcetype using Lookups
We still need to determine the index and sourcetype. Cribl Stream's Lookup Function enriches events with external fields. We'll use it with the newly extracted region field to assign an index and sourcetype to these events.
In the Lookup table (region_index_sourcetype.csv), five simple regular expressions map the extracted region field to the appropriate index and sourcetype. For example, the region us-west1 starts with us, so it matches the first regular expression: us.+
We use this Lookup table's first row to assign an index of usa_index_tier, and a sourcetype of cloud-init, to each matching event. The region patterns in the table's four remaining rows work the same way.

- Still in the same setting_index_by_region_availability_zone Pipeline, click + Function near the top of the left pane, and either find Lookup in the Standard submenu, or type Lookup into the search box to locate it. Then click Lookup to add this Function to the Pipeline.
- Leave Filter at its default true value.
- Enter a simple Description for the Function.
- In the Lookup file path drop-down, select region_index_sourcetype.csv.
- For the Source field, leave the default _raw.
- For Match mode, select Regex.
- For Match type, select Most Specific.
- For Lookup field name in event, type region (lowercase, matching the field extracted in Step 1).
Since we did not specify any Output fields, the Function will default to outputting all fields in the Lookup table. In our case we get the fields: index and sourcetype.
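The idea behind a regex-keyed lookup can be sketched in plain JavaScript. This is a simplified illustration, not Cribl Stream's implementation: it returns the first row whose pattern matches and does not model the Most Specific match type. Only the first row comes from the text above; the second row, the pattern column name, and the lookupByRegex helper are hypothetical.

```javascript
// Rows of a regex-keyed lookup table. Column name "pattern" is an assumption.
const lookupRows = [
  { pattern: 'us.+', index: 'usa_index_tier', sourcetype: 'cloud-init' }, // from the text
  { pattern: 'eu.+', index: 'example_eu_index', sourcetype: 'example-type' }, // hypothetical row
];

// Illustrative helper: copy every non-key column of the first matching row onto the event.
function lookupByRegex(event, rows) {
  const hit = rows.find((row) => new RegExp(row.pattern).test(event.region));
  if (!hit) return event; // no match: leave the event unchanged
  const { pattern, ...outputs } = hit; // drop the key column, keep the output columns
  return { ...event, ...outputs };
}

const enriched = lookupByRegex(
  { host: 'ec2-35-162-133-145', region: 'us-west1', az: 'a' },
  lookupRows
);
console.log(enriched.index, enriched.sourcetype);
```

Because no output columns are filtered, every remaining column of the matching row (here index and sourcetype) lands on the event, which mirrors the default behavior described above.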
Step 3: Get the host IP address from Hostname
Since the IP address is present in the host field, we can create the host_ip field using an Eval Function with this replace method:
host.replace(/\w+-(\d+)-(\d+)-(\d+)-(\d+)/,'$1.$2.$3.$4')
This regular expression uses four capture groups to pull the IP octets out of the hostname. The replacement string $1.$2.$3.$4 references those groups in order and joins them with dots to build host_ip. This method is very fast, and it removes the need to perform a DNS lookup from the host field to get the host's IP address. Need we say it? Magic!
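As a quick check, the same expression can be run in plain JavaScript against the host value from the example hostname. The variable names here are for illustration only.

```javascript
// Host value as extracted in Step 1 from the example hostname.
const host = 'ec2-35-162-133-145';

// Same expression as the Eval Function's Value Expression.
const hostIp = host.replace(/\w+-(\d+)-(\d+)-(\d+)-(\d+)/, '$1.$2.$3.$4');
console.log(hostIp); // 35.162.133.145

// If the pattern does not match, replace() returns the input string unchanged.
const unmatched = 'localhost'.replace(/\w+-(\d+)-(\d+)-(\d+)-(\d+)/, '$1.$2.$3.$4');
console.log(unmatched); // localhost
```

One consequence worth keeping in mind: for hosts that do not follow the dashed-IP naming standard, the resulting field would simply repeat the host value rather than contain an IP address.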
- Still in the same setting_index_by_region_availability_zone Pipeline, click + Function near the top of the left pane, and either find Eval in the Standard submenu, or type Eval into the search box to locate it. Then click Eval to add this Function to the Pipeline.
- Leave Filter at its default true value.
- Enter a simple Description for the Function.
- Click Add Field under Evaluate Fields, and then enter host_ip under Name.
- For the Value Expression, paste the following:
host.replace(/\w+-(\d+)-(\d+)-(\d+)-(\d+)/,'$1.$2.$3.$4')
- Click Save.
With the Lookup and Eval Functions added to our Pipeline, the Preview pane's OUT tab shows that the index, sourcetype, and host_ip fields are now added to each event.
Step 4: Customize the Sourcetype
Finally, let's put some sense into the sourcetype field, using another Eval Function. By combining the values of the sourcetype, region, and az fields, the sourcetype becomes cloud-init_us-west1_a. Now you can understand much more about the sourcetype at a glance.
Examine this Eval Function's value expression, taking careful note of the backticks (`) and braces ({}) that surround the field names, and the underscore (_) that separates them.
- Still in the same setting_index_by_region_availability_zone Pipeline, click Add Function near the top of the left pane, and either find Eval in the Standard submenu, or type Eval into the search box to locate it. Then click Eval to add this Function to the Pipeline.
- Leave Filter at its default true value.
- Enter a simple Description for the Function.
- Click Add Field under Evaluate Fields, and then enter sourcetype under Name.
- For the Value Expression, paste the following:
`${sourcetype}_${region}_${az}`
- Click Save.
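The value expression is an ordinary JavaScript template literal. A minimal sketch of how it evaluates, assuming an event that already carries the fields from the earlier steps (the event object and destructuring are illustrative; in an Eval Function the field names are referenced directly):

```javascript
// Illustrative event with the fields produced by Steps 1 and 2.
const event = { sourcetype: 'cloud-init', region: 'us-west1', az: 'a' };

// Pull the fields into local variables so the expression reads as it does in the Eval Function.
const { sourcetype, region, az } = event;

// Backticks delimit the template; ${...} interpolates each field; underscores are literal text.
const newSourcetype = `${sourcetype}_${region}_${az}`;
console.log(newSourcetype); // cloud-init_us-west1_a
```

Without the backticks, the expression would not be a string template at all, which is why the quoting deserves the careful look suggested above.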
Take a look at the updated sourcetypes in the Preview pane's OUT tab. Congratulations, you have accomplished quite a bit of complicated magic in this section, which sadly brings us to the end of our magical odyssey.