Cross the Streams
Cribl Stream's built-in integration with Cribl Lake makes it easier to get data where it needs to go. Instead of chasing permissions, figuring out partitioning expressions, and getting bogged down in endless meetings, you can get shit done, quickly. Let's start by getting some data flowing through our Stream into our Lake (heh). We'll use Stream's built-in Datagen Source to simulate some Apache web log data.
- From the product switcher at the upper left, click Stream
- Click into the default Worker Group
- With the Manage tab active, click into Routing > QuickConnect
- Click Add Source at the top left
- Select Datagen and click Add New
- Set up your new Source by filling in the following information:
  - Input ID: sbx_apache
  - Data Generator File – add two:
    - apache_common.log with Events Per Second Per Worker Node set to 1
    - apache_error.log with Events Per Second Per Worker Node set to 1
- Click Save
In case you didn't know, Cribl Stream can similarly generate data based on logs, metrics, and traces that you upload as files. This makes it easy to test out Stream Pipelines without trying to reproduce your full production environment.
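If you'd rather script this kind of setup than click through it, Cribl Stream also exposes its configuration through the Leader's REST API. Below is a minimal sketch of creating the same Datagen Source programmatically; the Leader URL and token are placeholders, and the endpoint path and field names (samples, eventsPerSec) are assumptions based on Cribl's API conventions, so verify them against your deployment's API docs.

```python
import requests

# Assumptions: a reachable Cribl Stream Leader, a bearer token with rights on the
# "default" Worker Group, and the endpoint/field names shown here.
LEADER_URL = "https://leader.example.com:9000"  # hypothetical Leader address
TOKEN = "<your-api-bearer-token>"
GROUP = "default"

headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Sketch of the same Datagen Source we just built in QuickConnect.
datagen_source = {
    "id": "sbx_apache",
    "type": "datagen",
    "samples": [
        {"sample": "apache_common.log", "eventsPerSec": 1},
        {"sample": "apache_error.log", "eventsPerSec": 1},
    ],
}

resp = requests.post(
    f"{LEADER_URL}/api/v1/m/{GROUP}/system/inputs",  # assumed Worker-Group-scoped inputs endpoint
    json=datagen_source,
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```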
Anyway, let's configure our Lake as a Destination and then add in some fun routing for a later thing (send).
- Still on Routing > QuickConnect, click Add Destination at the top right
- Select Cribl Lake and click Add New
- Configure your new Lake Destination by filling in the following information:
  - Output ID: sbx_apache
  - Lake dataset: default_logs
- Click Save
- Click the + on the sbx_apache Datagen Source on the left, drag a connection over to the sbx_apache Cribl Lake Destination on the right, and release
- In the resulting pop-up, leave Passthru selected and click Save
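The Destination side can be scripted the same way. The sketch below mirrors the Lake Destination we just configured; the output type string and dataset field name are assumptions (they may differ in your Cribl Stream version), so treat it as illustrative rather than a drop-in.

```python
import requests

LEADER_URL = "https://leader.example.com:9000"  # hypothetical Leader address
TOKEN = "<your-api-bearer-token>"
GROUP = "default"
headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Sketch of the Cribl Lake Destination; "type" and "destPath" are assumed names
# for the Lake output type and target dataset -- verify against your API docs.
lake_destination = {
    "id": "sbx_apache",
    "type": "cribl_lake",
    "destPath": "default_logs",
}

resp = requests.post(
    f"{LEADER_URL}/api/v1/m/{GROUP}/system/outputs",  # assumed Worker-Group-scoped outputs endpoint
    json=lake_destination,
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
```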
No AWS, IT, or Security team needed to be consulted. No ticket to open, with its week-long waits and myriad meetings. We just set up incoming data to send to our Cribl Lake. Well, it's technically not sending yet, we still need to Commit & Deploy, but first let's add a (literal) feedback loop. Don't worry, we'll explain.
- Still on Routing > QuickConnect, click Add Source on the left
- Select Cribl HTTP and click Select Existing
- Click the in_cribl_http Source
- In the resulting pop-up, click Yes
- Click Add Destination at the top right
- Select Cribl Lake and click Add New
- Configure your new Lake Destination by filling in the following information:
  - Output ID: sbx_incident_response
  - Lake dataset: sbx_incident_response
- Click Save
- Click the + on the in_cribl_http Source on the left, drag a connection over to the sbx_incident_response Cribl Lake Destination on the right, and release
  - Expand your Destinations: Can't see your sbx_incident_response Lake Destination? You may need to click the Cribl Lake tile to expand it and show both Lake Destinations we've configured.
- In the resulting pop-up, leave Passthru selected and click Save
OK, what did we just do? Well, later on we're going to explore using Cribl Search's send operator, which allows users to send query results to Cribl Stream through the in_cribl_http Source. In this instance, we'll use this Source as a feedback loop where we send the data to a "new" dataset in order to expedite our searches. Fun fact: there's a better way to do that, which we will also do 😉. The reason we are configuring this connection, however, is so that you can see how simple it would be to replace the sbx_incident_response Destination with, say, your SIEM of choice.
Time to push this configuration to our Workers so that we can see the fruits of our ~~loom~~ labour.
Commit & Deploy
- At the top right, click the blue Commit & Deploy button
- Enter a commit message that reflects the hard work we've done, for example:
  sbx_lake configuration
  - added datagen for apache logs and errors
  - added Cribl Lake destinations for default_logs and sbx_incident_response
  - connected in_cribl_http to Cribl Lake sbx_incident_response
- Click Commit & Deploy
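If you want to automate this step too, commits and deploys can be driven through the Leader's REST API as well. The endpoints, payloads, and response shape in this sketch are assumptions about Cribl's version-control API, so check them against your Leader's API docs before relying on it.

```python
import requests

LEADER_URL = "https://leader.example.com:9000"  # hypothetical Leader address
TOKEN = "<your-api-bearer-token>"
GROUP = "default"
headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Commit the pending configuration changes (endpoint and payload are assumptions).
commit = requests.post(
    f"{LEADER_URL}/api/v1/version/commit",
    json={"group": GROUP, "message": "sbx_lake configuration"},
    headers=headers,
    timeout=30,
)
commit.raise_for_status()
# Assumed response shape: the new commit id is returned in the JSON body.
version = commit.json().get("items", [{}])[0].get("commit")

# Deploy that commit to the Worker Group (endpoint and payload are assumptions).
deploy = requests.patch(
    f"{LEADER_URL}/api/v1/master/groups/{GROUP}/deploy",
    json={"version": version},
    headers=headers,
    timeout=30,
)
deploy.raise_for_status()
```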
Replay and Me
Cribl Lake also simplifies Cribl Stream's built-in Replay capability. If you've configured Replay before, you'll know it's already pretty straightforward (permissions be damned!); however, Cribl Lake makes it even easier, with a dedicated Stream Collector.
- Click Data > Sources in the top menu
- Under Collectors, click Cribl Lake
  - Collectors??? If you don't see a Collectors section in the Sources, check that you have disabled ad blockers. Turns out they "collect" information, so this word is often just outright blocked. LOL.
- Click Add Collector at the top right
- Observe that you only need two pieces of information to configure a Cribl Lake Replay: Collector ID and Lake dataset
- When you're done, close out of this modal by clicking Cancel
Time to go check out Cribl Search & Lake. McCormick Cribl: "It's gonna be great!"