Hide Yo Data, Hide Yo PII, 'Cause They Hacking Everyone Up in Here
Cribl Guard introduces an intelligent, scalable solution for sensitive data detection. It enhances security by protecting critical information from unauthorized access and significantly reduces the risk of data breaches. At the same time, it increases operational efficiency by automating detection and streamlining document and data handling workflows. You can read more about Cribl Guard in the official press release. I learn by doing. Let's do.
Setup
Prior to seeing Guard in action, let's set up an environment where it can really shine.
- On the right-hand side, click Manage in the Stream section
- Click into the default Worker Group
- Up top, click Data > Sources
- Locate and click into the Datagen Source
- At the top right, click Add Source
- Fill out the fields as follows:
  - Input ID: business_event
  - Data Generator File Name: businessevent.log
Commit & Deploy
If you are new to Cribl, welcome! Due to the nature of our environment (utilizing a Cribl.Cloud-hosted Leader), our changes are not put into effect until we Commit & Deploy them to the Workers in the Worker Groups. Like this:
- In the top right, click Commit & Deploy
- In the resulting modal, write some meaningful message: The journey was the friends we made along the way
- Click Commit & Deploy at the bottom right of the modal
From now on, we'll just refer to this as Commit & Deploy. Thanks for coming to my Ted Talk™.
Pushing the Datagen out might take a minute or two. When it's ready, we'll collect a sample to use in the rest of the sandbox.
- To the right of your business_event Source, click Live
It can be pretty easy to just follow directions and think nothing of it. But this is actually kind of a cool feature and I want you to appreciate it. So let's take a quick look at the sample you just obtained. It can be tough to see because all the data is in the _raw field at the moment, but if you expand it by clicking Show More you'll see some Key=Value pairs. Of note, literally, are the social and accountNumber fields. These are potentially PII and we most likely don't want them floating around in our logs / Destinations. Cribl Guard will help us with that.
- Once the sample finishes collecting (which should be rather quick since we only wait for 10 events and our Workers are sending 10 events per second), click Save as Sample File at the bottom right
- Change the filename to be.log
- Click Save
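If you want to convince yourself of what is hiding in that _raw field, here is a minimal Python sketch that splits a Key=Value string apart and flags the fields we care about. The sample event below is made up for illustration (the real businessevent.log events will have different fields and values), and `parse_kv` and `sensitive_fields` are hypothetical helper names, not anything Cribl ships.

```python
import re

# Made-up event for illustration only; not copied from businessevent.log.
sample_raw = (
    "timestamp=2024-01-01T00:00:00Z userName=jdoe "
    "social=123-45-6789 accountNumber=000111222 amount=42.50"
)

# Field names the walkthrough calls out as potential PII.
SENSITIVE_FIELDS = {"social", "accountNumber"}


def parse_kv(raw):
    """Split a key=value string into a dict.

    Values are either double-quoted or run up to the next whitespace.
    """
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', raw)
    return {key: value.strip('"') for key, value in pairs}


def sensitive_fields(raw):
    """Return the sorted names of fields in raw that are on the sensitive list."""
    return sorted(key for key in parse_kv(raw) if key in SENSITIVE_FIELDS)


if __name__ == "__main__":
    print(sensitive_fields(sample_raw))
```

A fixed list of field names like this only catches what you already know to look for, which is exactly the chore Guard is meant to take off your plate.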
On Guard
We can sit and read all day, but as we learned in high school English: "show, don't tell". Let's go look at the fancy intro pages for Guard and then actually do some shit.
- On the left nav, click Guard
- [Optional] (we said no more reading) Click through (and read) the three tabs illustrating Cribl Guard's capabilities
  - Guard your Destinations
  - Detect sensitive data with AI
  - Monitor your Pipelines
- Once finished, click Get Started
Now I wouldn't call this too "real world", but we can see Guard in action by enabling it on our devnull Destination, since our Datagen will be sending through to devnull. So let's do that.
- Click the radio button left of devnull to enable Guard
- Commit & Deploy
We need to go deeper. We need to see it in action.
- To the right of devnull, click guard_devnull_pipe to go into the automatically generated Cribl Guard pipeline
- On the right, click Simple to the right of be.log
- In the top left of the right-hand side, click OUT to see the results of the pipeline
- On the first event at the right, click Show More at the end of _raw
What the heck are we looking at? Good question. Let's start from the beginning. In the beginning God created the heavens and the earth. That Datagen we created runs off a sample business event (go figure) which has some sensitive information in it. If you look through the events you'll see fields titled social or accountNumber. Cribl Guard automatically scans for Personally Identifiable Information (PII) and masks that data. Looking at those fields now, we can see that they have been replaced with REDACTED. Neat!
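For a feel of what "masking" means at the string level, here is a rough Python sketch that swaps the value of named Key=Value fields for a fixed token. To be loud about it: this is a hand-rolled regex illustration, not how Cribl Guard detects or masks anything, and `mask_fields` and the sample string are hypothetical.

```python
import re


def mask_fields(raw, fields, replacement="REDACTED"):
    """Replace the value of each named key=value field with a fixed token."""
    if not fields:
        return raw
    # Build one alternation of the escaped field names, e.g. social|accountNumber.
    names = "|".join(re.escape(name) for name in fields)
    # \b keeps "social" from matching inside a longer key such as "antisocial".
    pattern = re.compile(rf'\b({names})=("[^"]*"|\S+)')
    return pattern.sub(lambda m: f"{m.group(1)}={replacement}", raw)


if __name__ == "__main__":
    # Made-up event for illustration only.
    event = "userName=jdoe social=123-45-6789 accountNumber=000111222"
    print(mask_fields(event, ["social", "accountNumber"]))
```

Guard needs none of this hand-rolling from you: it found and masked both fields on its own.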
This is just the default behavior for Cribl Guard. We can go deeper! You'll notice that there is another field in this _raw called cardNumber, which probably refers to credit card information. Definitely PII. Let's mask that.
- On the left, click the 1 in the circle to expand the Guard function
- Click Add Ruleset
- For Ruleset ID, select Finance_Global
- Click Save
Well that was quick! We can already see that cardNumber is REDACTED. That's hot. But we can go deeper!
- Up top, click Processing > Knowledge
- On the left list, click Guard Rules
- At the top right, click Add Rule > Add Rule with Copilot
- In the resulting chat window, click Collect Sample Data
- Click Select an existing sample file
- Select be.log from the dropdown and click Confirm
- Click Describe the data you would like to mask
- At the bottom, type userName
- Check the automatic work and be amazed
I could walk you through putting that rule into your Guard Pipeline for your devnull Destination, but I think you can handle that 😘. Let's wrap up.