
Knowledge: Better than a Lamborghini

TL;DR

The Knowledge section is where Stream stores the parsing and lookup libraries that are called elsewhere, such as:

  • Common regular expressions to find:
    • IP addresses
    • Credit card numbers
    • Phone numbers
  • Grok Patterns to find similar information to the regexes above
  • Event breaker rules to help parse incoming data prior to Routes
  • User-generated lookup files for use in data enrichment
  • Global variables for use across all Functions
  • JSON & Parquet Schemas for validation
  • Database connections
  • AppScope configurations

Stream comes with a lot of Knowledge objects by default, but users can also add their own!

Knowledge is Stream’s library of Patterns, Expressions, and Lookups. Some of these come with Stream by default, such as the extensive library of Grok patterns and Regular Expressions. Others can be added by a user as needed. For example, you might add a lookup to enrich your firewall data with a new field that holds the IT-designated name associated with the traffic’s subnet (hint: we cover this in a later course).

Let’s make a couple quick clicks and be on our way.

Important
  1. Select the Processing submenu and click Knowledge
  2. Click security-cidr-lookup.csv to open the lookup file

This file is called from a Pipeline called breachlookup. You can go check it out if you want, but in case you’re short on time, here is a brief explanation:

  1. Palo Alto logs hit the Pipeline breachlookup
  2. A Regular Expression extracts a few key pieces of information (similar to the Pipeline we just looked at)
  3. One of the extracted fields is a source IP address (i.e. what IP address or computer sent this)
  4. The Pipeline calls this file security-cidr-lookup.csv and tries to match the subnet for the source IP address
  5. Once matched, the function in the Pipeline grabs the corresponding value in the next column (location) and puts it into a new field in the event

What this looks like is:

  • Prior to the Pipeline, there is a field labeled src_ip
  • After the Pipeline, the original src_ip field is there, but there is a second field labeled src_subnet_usage with a corresponding value such as Home Worker Nets

Now when an admin is searching the firewall logs in their SIEM, they don’t need to remember that people working from home use 192.168.23.0/24. Instead they can search for Home Worker Nets and get the same results. Fun!
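The lookup flow described above can be sketched outside of Stream in a few lines of Python. This is only an illustration of the idea, not how Stream implements it: the CSV contents, the cidr column name, the second row, and the helper names load_lookup and enrich are all assumptions made for the example. Only the location column, the src_ip and src_subnet_usage fields, and the Home Worker Nets subnet come from the text.

```python
import csv
import io
import ipaddress

# Hypothetical stand-in for security-cidr-lookup.csv. The "cidr" column name
# and the second row are assumptions for illustration only.
LOOKUP_CSV = """cidr,location
192.168.23.0/24,Home Worker Nets
10.0.0.0/8,Example Internal Net
"""

def load_lookup(text):
    """Parse the CSV text into a list of (network, label) pairs."""
    rows = csv.DictReader(io.StringIO(text))
    return [(ipaddress.ip_network(row["cidr"]), row["location"]) for row in rows]

def enrich(event, table):
    """Return a copy of the event with src_subnet_usage added when a subnet matches."""
    ip = ipaddress.ip_address(event["src_ip"])
    matches = [(net, label) for net, label in table if ip in net]
    if not matches:
        return dict(event)  # no match: leave the event unchanged
    # If several subnets match, this sketch picks the most specific one.
    _, label = max(matches, key=lambda m: m[0].prefixlen)
    return {**event, "src_subnet_usage": label}

table = load_lookup(LOOKUP_CSV)
enriched = enrich({"src_ip": "192.168.23.45"}, table)
print(enriched)
```

The original src_ip field is kept and the new field is added next to it, which mirrors the before and after described above. Picking the most specific subnet when several match is a choice made for this sketch.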

If you’re interested, feel free to click around in the Grok patterns and Regular Expressions as well. These are definitions that can be called in a Pipeline by typing their name. That way, you don’t have to remember the Regex that grabs an IP address out of random data. You can simply type ip address and the Regex will be filled in for you automatically.
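The idea of calling a pattern by name can be illustrated with a small Python sketch. The PATTERNS dictionary, the ipv4 key, the regex itself, and the find_all helper are hypothetical and written for this example; they are not Stream's built-in definitions.

```python
import re

# Hypothetical named-pattern library. The key and the regex are illustrative
# assumptions, not the definitions that ship with Stream.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = {
    "ipv4": re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b"),
}

def find_all(name, text):
    """Look up a pattern by name and return every match found in the text."""
    return PATTERNS[name].findall(text)

hits = find_all("ipv4", "denied src=192.168.23.45 dst=10.1.2.3 port=443")
print(hits)
```

The caller only needs to know the name ipv4, not the expression behind it, which is the convenience the Knowledge library provides.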

Let’s discuss Packs.