
Cross the Streams

Cribl Stream's built-in integration with Cribl Lake makes it easy to get data where it needs to go. Instead of chasing permissions, puzzling out partitioning expressions, and getting bogged down in endless meetings, you can get shit done, quickly. Let's start by getting some data flowing through our Stream into our Lake (heh). We'll use Stream's built-in Datagen Source to simulate some Apache web log data.

The Spice Data must flow
  1. From the product switcher at the upper left, click Stream
  2. Click into the default Worker Group
  3. With the Manage tab active, click into Routing > QuickConnect
  4. Click Add Source at the top left
  5. Select Datagen and click Add New
  6. Set up your new Source by filling in the following information:
    • Input ID: sbx_apache
    • Data Generator File – add two:
      • apache_common.log with Events Per Second Per Worker Node set to 1
      • apache_error.log with Events Per Second Per Worker Node set to 1
  7. Click Save
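
If you'd rather script this than click through the UI, Cribl Stream also exposes a REST API. Here's a minimal sketch, assuming a Worker Group named default and a bearer token in hand; the Leader URL is hypothetical, and the samples field names are assumptions based on how the UI saves Datagen Sources, so verify against your version's API docs:

```python
import requests

BASE = "https://leader.example.com:9000/api/v1"  # hypothetical Leader URL; 9000 is Stream's default API port
HEADERS = {"Authorization": "Bearer <token>"}    # e.g. obtained via POST /auth/login

# Recreate the sbx_apache Datagen Source on the 'default' Worker Group.
source = {
    "id": "sbx_apache",
    "type": "datagen",
    "samples": [
        {"sample": "apache_common.log", "eventsPerSec": 1},
        {"sample": "apache_error.log",  "eventsPerSec": 1},
    ],
}
resp = requests.post(f"{BASE}/m/default/system/inputs", json=source, headers=HEADERS)
resp.raise_for_status()
```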

In case you didn't know, Cribl Stream can similarly generate data based on logs, metrics, and traces that you upload as files. This makes it easy to test out Stream Pipelines without trying to reproduce your full production environment.
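
For reference, the apache_common.log generator emits events in the classic Apache Common Log Format, along the lines of the canonical example from Apache's own docs:

```
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
```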

Anyway, let's configure our Lake as a Destination, and then add some routing that will pay off later (Search's send operator).

Plumb it like it's hot
  1. Still on Routing > QuickConnect, click Add Destination at the top right
  2. Select Cribl Lake and click Add New
  3. Configure your new Lake Destination by filling in the following information:
    • Output ID: sbx_apache
    • Lake dataset: default_logs
  4. Click Save
  5. Click the + on the sbx_apache Datagen Source on the left, drag a connection over to the sbx_apache Cribl Lake Destination on the right, and release
  6. In the resulting pop-up, leave Passthru selected and click Save
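
The Destination half has a similar API shape, if you prefer configuration-as-code. A sketch, assuming the same Leader and group as before; the cribl_lake type name and dataset field are assumptions based on how the UI stores Lake Destinations:

```python
import requests

BASE = "https://leader.example.com:9000/api/v1"  # hypothetical Leader URL
HEADERS = {"Authorization": "Bearer <token>"}

# Recreate the sbx_apache Lake Destination pointing at default_logs.
destination = {
    "id": "sbx_apache",
    "type": "cribl_lake",       # assumption: internal type name for Lake Destinations
    "dataset": "default_logs",
}
resp = requests.post(f"{BASE}/m/default/system/outputs", json=destination, headers=HEADERS)
resp.raise_for_status()
```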

No AWS, IT, or Security team to consult. No ticket to open, with its weeklong waits and myriad meetings. We just set up incoming data to send to our Cribl Lake. Well, it's technically not sending yet; we still need to Commit & Deploy. But first, let's add a (literal) feedback loop. Don't worry, we'll explain.

Feedback Loop
  1. Still on Routing > QuickConnect, click Add Source on the left
  2. Select Cribl HTTP and click Select Existing
  3. Click the in_cribl_http Source
  4. In the resulting pop-up click Yes
  5. Click Add Destination at the top right
  6. Select Cribl Lake and click Add New
  7. Configure your new Lake Destination by filling in the following information:
    • Output ID: sbx_incident_response
    • Lake dataset: sbx_incident_response
  8. Click Save
  9. Click the + on the in_cribl_http Source on the left, drag a connection over to the sbx_incident_response Cribl Lake Destination on the right, and release
    Expand your Destinations

    Can't see your sbx_incident_response Lake Destination? You may need to click the Cribl Lake tile to expand and show both Lake Destinations we've configured.

  10. In the resulting pop-up, leave Passthru selected and click Save

OK, what did we just do? Well, later on we're going to explore Cribl Search's send operator, which lets you send query results to Cribl Stream through the in_cribl_http Source. Here, we'll use that Source as a feedback loop, sending the data to a "new" dataset in order to expedite our searches. Fun fact: there's a better way to do that, which we will also do 😉. The real reason we're configuring this connection, though, is so you can see how simple it would be to replace the sbx_incident_response Destination with, say, your SIEM of choice.
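To make that concrete, here's a hedged sketch of what that later step might look like: a Cribl Search query whose send operator pushes results back through in_cribl_http, where QuickConnect routes them to sbx_incident_response. The /search/jobs path and payload shape are assumptions, and the query uses Search's Kusto-like syntax; treat this as an illustration, not gospel:

```python
import requests

BASE = "https://leader.example.com:9000/api/v1"  # hypothetical URL; Search runs in Cribl.Cloud
HEADERS = {"Authorization": "Bearer <token>"}

# Pull events from the lake and send them back into Stream, where
# QuickConnect routes in_cribl_http -> sbx_incident_response.
query = 'dataset="default_logs" | limit 1000 | send'
resp = requests.post(f"{BASE}/search/jobs", json={"query": query}, headers=HEADERS)
resp.raise_for_status()
```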

Time to push this configuration to our Workers so that we can see the fruits of our loom labour.

Commit & Deploy
  1. At the top right, click the blue Commit & Deploy button
  2. Enter a commit message that reflects the hard work we've done (example below)
    sbx_lake configuration
    - added datagen for apache logs and errors
    - added Cribl Lake destinations for default_logs and sbx_incident_response
    - connected in_cribl_http to Cribl Lake sbx_incident_response
  3. Click Commit & Deploy
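
For the curious, the button has an API equivalent: commit on the Leader, then deploy the new version to the Worker Group. A sketch using Cribl's version-control endpoints; the field names and response shape are assumptions, so check your version's API docs:

```python
import requests

BASE = "https://leader.example.com:9000/api/v1"  # hypothetical Leader URL
HEADERS = {"Authorization": "Bearer <token>"}

# Commit the pending changes with the same message we used in the UI.
commit = requests.post(
    f"{BASE}/version/commit",
    json={"group": "default", "message": "sbx_lake configuration"},
    headers=HEADERS,
)
commit.raise_for_status()
version = commit.json()["items"][0]["commit"]   # assumption about the response shape

# Deploy that commit to the default Worker Group.
deploy = requests.patch(
    f"{BASE}/master/groups/default/deploy",
    json={"version": version},
    headers=HEADERS,
)
deploy.raise_for_status()
```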

Replay and Me

Cribl Lake also simplifies Cribl Stream's built-in Replay capability. If you've configured Replay before, you'll know it's already pretty straightforward (permissions be damned!); however, Cribl Lake makes it even easier, with a dedicated Stream Collector.

Be kind, replay
  1. Click Data > Sources in the top menu
  2. Under Collectors, click Cribl Lake
    Collectors???

    If you don't see a Collectors section in the Sources, check that you have disabled ad blockers. Turns out they "collect" information, so this word is often just outright blocked. LOL.

  3. Click Add Collector at the top right
  4. Observe that you only need two pieces of information to configure a Cribl Lake Replay: Collector ID and Lake dataset
  5. When you're done, close out of this modal by clicking Cancel
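
Those two fields really are all there is to it. Scripted, a Lake Collector might look something like the sketch below; the /lib/jobs path and the collector payload shape are assumptions drawn from how Stream stores saved Collector jobs, and sbx_replay is a hypothetical ID:

```python
import requests

BASE = "https://leader.example.com:9000/api/v1"  # hypothetical Leader URL
HEADERS = {"Authorization": "Bearer <token>"}

# A saved Collector job: just an ID plus the Lake dataset to replay from.
job = {
    "id": "sbx_replay",                 # hypothetical Collector ID
    "type": "collection",
    "collector": {"type": "cribl_lake", "conf": {"dataset": "default_logs"}},
}
resp = requests.post(f"{BASE}/m/default/lib/jobs", json=job, headers=HEADERS)
resp.raise_for_status()
```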

Time to go check out Cribl Search & Lake. It's gonna be great!