Knowledge: Better than a Lamborghini
The Knowledge section is where Stream stores parsing and lookup libraries that get called elsewhere, such as:
- Common regular expressions to find:
  - IP addresses
  - Credit card numbers
  - Phone numbers
- Grok Patterns to find similar information to the regexes above
- Event breaker rules to help parse incoming data prior to Routes
- User-generated lookup files for use in data enrichment
- Global variables for use across all Functions
- JSON & Parquet Schemas for validation
- Database connections
- AppScope configurations
Stream comes with a lot of Knowledge objects by default, but users can also add their own!
Knowledge is Stream’s library of Patterns, Expressions, and Lookups. Some of these come with Stream by default, such as the extensive library of Grok patterns and Regular Expressions. Others can be added by a user as needed. For example, you might want to enrich your firewall data by adding a new field that denotes the IT-designated name associated with the traffic’s subnet (hint: we cover this in a later course).
Let’s make a couple of quick clicks and be on our way.
- Select the Processing submenu and click Knowledge
- Click security-cidr-lookup.csv to open the lookup file
This file is called from a Pipeline named breachlookup. You can go check it out if you want, but in case you’re short on time, here is a brief explanation:
- Palo Alto logs hit the Pipeline breachlookup
- A Regular Expression extracts a few key pieces of information (similar to the Pipeline we just looked at)
- One of the extracted fields is a source IP address (i.e. what IP address or computer sent this)
- The Pipeline calls this file, security-cidr-lookup.csv, and tries to match the subnet for the source IP address
- Once matched, the function in the Pipeline grabs the corresponding value in the next column (location) and puts it into a new field in the event
What this looks like is:
- Prior to the Pipeline, there is a field labeled src_ip
- After the Pipeline, the original src_ip field is still there, but there is a second field labeled src_subnet_usage with a corresponding value such as Home Worker Nets
Now when an admin is searching the firewall logs in their SIEM, they don’t need to remember that people working from home use 192.168.23.0/24. Instead, they can search for Home Worker Nets and get the same results. Fun!
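For readers who want to see the idea outside of Stream, here is a minimal Python sketch of the same kind of subnet lookup. It is not how Stream implements its lookup function. The key column name cidr, the second CSV row, and the helper names load_lookup and enrich are assumptions made up for illustration. Only the location column, the 192.168.23.0/24 subnet, the Home Worker Nets value, and the src_ip and src_subnet_usage field names come from the walkthrough above.

```python
import csv
import io
import ipaddress

# Hypothetical stand-in for security-cidr-lookup.csv. The real file's key
# column name and rows may differ; "cidr" and the second row are assumptions.
LOOKUP_CSV = """cidr,location
192.168.23.0/24,Home Worker Nets
10.0.0.0/8,Example Corporate LAN
"""


def load_lookup(text):
    """Parse CSV text into a list of (network, location) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(ipaddress.ip_network(row["cidr"]), row["location"]) for row in reader]


def enrich(event, table):
    """Return a copy of the event, adding src_subnet_usage if src_ip is in a listed subnet."""
    enriched = dict(event)
    addr = ipaddress.ip_address(event["src_ip"])
    for network, location in table:
        if addr in network:
            enriched["src_subnet_usage"] = location
            break  # first matching subnet wins in this sketch
    return enriched


table = load_lookup(LOOKUP_CSV)
event = enrich({"src_ip": "192.168.23.45"}, table)
print(event)
```

The original src_ip field is left untouched and the new field is simply added alongside it, which mirrors the before and after picture described above. An address that matches no subnet passes through unchanged.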
If you’re interested, feel free to click around in the Grok patterns and Regular Expressions as well. These are definitions that can be called in a Pipeline by typing their name. That way, you don’t have to remember the Regex to grab an IP address out of random data. You can simply type ip address and the Regex will be automatically filled in for you.
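To illustrate the idea of calling a pattern by name, here is a small Python sketch of a named regex library. The PATTERNS dictionary, the find_all helper, and the IPv4 pattern itself are assumptions for illustration only. They are not the definitions that ship with Stream.

```python
import re

# Hypothetical pattern library keyed by a friendly name. The pattern below is
# a common IPv4 regex, not necessarily the one in Stream's Knowledge library.
PATTERNS = {
    "ip address": r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b",
}


def find_all(name, text):
    """Look up a pattern by name and return every match found in the text."""
    return re.findall(PATTERNS[name], text)


matches = find_all("ip address", "deny src=192.168.23.45 dst=10.1.2.3 port=443")
print(matches)
```

The caller only needs to know the name ip address. The regex lives in one place, so it can be corrected once and every caller benefits.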
Let’s discuss Packs.