Hide Yo Data, Hide Yo PII, 'Cause They Hacking Everyone Up in Here
Cribl Guard introduces an intelligent, scalable solution for sensitive data detection. It enhances security by protecting critical information from unauthorized access and significantly reduces the risk of data breaches. At the same time, it increases operational efficiency by automating detection and streamlining document and data-handling workflows. You can read more about Cribl Guard in the official press release. I learn by doing. Let's do.
Setup
Prior to seeing Guard in action, let's set up an environment where it can really shine.
- On the right-hand side, click Manage in the Stream section
- Click into the default Worker Group
- Up top, click Data > Sources
- Locate and click into the Datagen Source
- At the top right, click Add Source
- Fill out the fields as follows:
- Input ID: business_event
- Data Generator File Name: businessevent.log
Commit & Deploy
If you are new to Cribl, welcome! Due to the nature of our environment (utilizing a Cribl.Cloud-hosted Leader), our changes are not put into effect until we Commit & Deploy them to the Workers in the Worker Groups. Like this:
- In the top right, click Commit & Deploy
- In the resulting modal, write some meaningful message: The journey was the friends we made along the way
- Click Commit & Deploy at the bottom right of the modal
From now on, we'll just refer to this as Commit & Deploy. Thanks for coming to my Ted Talk™.
Pushing the Datagen out might take a minute or two. When it's ready, we'll collect a sample to use in the rest of the sandbox.
- To the right of your business_event Source, click Live
It can be pretty easy to just follow directions and think nothing of it. But this is actually kind of a cool feature and I want you to appreciate it. So let's take a quick look at the sample you just obtained. It can be tough to see because all the data is in the _raw field at the moment, but if you expand it by clicking Show More you'll see some Key=Value Pairs. Of note, literally, are the social and accountNumber fields. These are potentially PII and we most likely don't want them floating around in our logs / Destinations. Cribl Guard will help us with that.
- Once the sample finishes collecting (which should be rather quick since we only wait for 10 events and our Workers are sending 10 events per second), click Save as Sample File at the bottom right
- Change the filename to be.log
- Click Save
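To make the "spot the PII" step concrete, here is a minimal sketch of pulling key=value pairs out of a _raw string and flagging the fields called out above. The sample line, the space-separated layout, and the `SENSITIVE_KEYS` set are all assumptions for illustration; your real Datagen events will look different, and this is not how Cribl Guard itself detects anything.

```python
import re

# Hypothetical event: real Datagen output will differ in fields and layout.
SAMPLE_RAW = (
    "2024-01-01T00:00:00Z orderType=upgrade userName=jdoe "
    "social=123-45-6789 accountNumber=9876543210 plan=basic"
)

# Field names this walkthrough treats as sensitive (an assumption, not Guard's rule list).
SENSITIVE_KEYS = {"social", "accountNumber"}


def parse_kv(raw: str) -> dict[str, str]:
    """Pull key=value pairs out of a raw event string."""
    return dict(re.findall(r"(\w+)=(\S+)", raw))


def sensitive_fields(raw: str) -> list[str]:
    """Return the sensitive keys present in the event, sorted for stable output."""
    return sorted(SENSITIVE_KEYS.intersection(parse_kv(raw)))


print(sensitive_fields(SAMPLE_RAW))  # ['accountNumber', 'social']
```

Matching on field names only works when you already know what the fields are called, which is exactly the gap a detection tool is meant to close.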
On Guard
We can sit and read all day, but as we learned in high school English: "show, don't tell". Let's go look at the fancy intro pages for Guard and then actually do some shit.
- On the left nav, click Guard
- [Optional] (we said no more reading) Click through (and read) the three tabs illustrating Cribl Guard's capabilities
- Guard your Destinations
- Detect sensitive data with AI
- Monitor your Pipelines
- Once finished, click Get Started
Now I wouldn't call this too "real world", but we can see Guard in action by enabling it on our devnull Destination, since our Datagen will be sending through to devnull. So let's do that.
- Click the radio button left of devnull to enable Guard
- Commit & Deploy
We need to go deeper. We need to see it in action.
- To the right of devnull, click guard_devnull_pipe to go into the automatically generated Cribl Guard pipeline
- On the right, click Simple to the right of be.log
- In the top left of the right-hand side, click OUT to see the results of the pipeline
- On the first event at the right, click Show More at the end of _raw
What the heck are we looking at? Good question. Let's start from the beginning. In the beginning God created the heavens and the earth. That Datagen we created runs off a sample business event (go figure), which has some sensitive information in it. If you look through the events you'll see fields titled social or accountNumber. Cribl Guard automatically scans for Personally Identifiable Information (PII) and masks that data. Looking at those fields now, we can see that they have been replaced with REDACTED. Neat!
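If you want a mental model for what the OUT tab is showing, the effect is roughly a find-and-replace on the sensitive values. The sketch below mimics that result with a plain regex over key=value text. It is an illustration only: the `mask_fields` helper, the sample line, and the name-based matching are assumptions, and Guard's actual detection is not shown anywhere in this walkthrough.

```python
import re


def mask_fields(raw: str, fields: set[str], token: str = "REDACTED") -> str:
    """Replace the value of each named key=value pair with a fixed token."""
    if not fields:
        return raw
    # Build one alternation of the escaped field names, e.g. (accountNumber|social)=\S+
    names = "|".join(re.escape(name) for name in sorted(fields))
    pattern = re.compile(r"\b(" + names + r")=\S+")
    return pattern.sub(lambda m: f"{m.group(1)}={token}", raw)


# Hypothetical event, not real Datagen output.
raw = "userName=jdoe social=123-45-6789 accountNumber=9876543210 cardNumber=4111111111111111"
print(mask_fields(raw, {"social", "accountNumber"}))
# userName=jdoe social=REDACTED accountNumber=REDACTED cardNumber=4111111111111111
```

Note that cardNumber sails through untouched here, just like it does in the default Guard output.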
This is just the default behavior for Cribl Guard. We can go deeper! You'll notice that there is another field in this _raw called cardNumber which probably refers to credit card information. Definitely PII. Let's mask that.
- On the left, click the 1 in the circle to expand the Guard function
- Click Add Ruleset
- For Ruleset ID, select Finance_Global
- Click Save.
Well that was quick! We can already see that cardNumber is REDACTED. That's hot. But we can go deeper!
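How the Finance_Global ruleset recognizes a card number isn't something this sandbox shows. One common technique outside of Cribl is to look for 13 to 19 digit runs and keep only those that pass the Luhn checksum, so here is a small sketch of that idea. Treat every name in it (`luhn_valid`, `mask_card_numbers`, the `CANDIDATE` pattern) as hypothetical, and don't read it as a description of Guard's internals.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_numbers(raw: str, token: str = "REDACTED") -> str:
    """Mask digit runs that look like card numbers and pass the Luhn check."""
    return CANDIDATE.sub(
        lambda m: token if luhn_valid(m.group(0)) else m.group(0), raw
    )


# 4111111111111111 is a widely used test number, not a real card.
print(mask_card_numbers("cardNumber=4111111111111111 accountNumber=9876543210"))
# cardNumber=REDACTED accountNumber=9876543210
```

The nice part of matching on the value's shape is that it still works when the field is named something unhelpful.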
- Up top, click Processing > Knowledge
- On the left list click Guard Rules
- At the top right, click Add Rule > Add Rule with Copilot
- In the resulting chat window click Collect Sample Data
- Click Select an existing sample file
- Select be.log from the dropdown and click Confirm
- Click Describe the data you would like to mask
- At the bottom, type userName
- Check the automatic work and be amazed
I could walk you through putting that rule into your Guard Pipeline for your devnull Destination, but I think you can handle that 😘. Let's wrap up.