Welcome to Data Tiering & Search Optimization

TL;DR

Learn how to build a cost-effective data architecture that enables forensic investigations without exploding your budget. You'll create a multi-tier storage strategy and master search techniques that help you find needles in haystacks—then send only the relevant data downstream.

What You'll Learn

By the end of this sandbox, you'll be able to:

Search across massive datasets to find specific events quickly
Tier data automatically based on business value and criticality
Export only relevant events to expensive downstream systems
Optimize search queries for speed and cost efficiency
Tag events with metadata that drives intelligent routing decisions

The Challenge

Your security team just informed you that they need to investigate a potential breach that happened 4 months ago. They need to:

Search through 500TB of historical data in Cribl Lake
Find evidence of suspicious activity across multiple log sources
Export relevant events to your SIEM for deep analysis
Do all this without blowing your quarterly budget

Traditional approach: Export all 500TB to your SIEM. Cost: $$$K. Time: 3-5 days.

The Cribl way: Search Lake directly, find the needle, send only what matters. Cost: Less than $100. Time: Minutes.

The Solution: Smart Tiering

Not all data is created equal. Some events are critical and need immediate attention in your top-tier SIEM. Others are important for context but can live in cheaper storage. The rest? Archive for compliance.

Think of it like coffee beans at Cribl Coffee Co.:

Tier 1 (Hot): Premium single-origin beans → expensive SIEM
Tier 2 (Warm): Quality house blend → cost-effective Lakehouse
Tier 3 (Cold): Bulk commodity → cheap long-term Lake storage

We'll teach you how to automatically route events to the right tier based on their business impact.

What You'll Work With

This isn't a theoretical exercise. You'll work with two real, pre-built systems:

1. Interactive Forensic Dashboard (Tips & Tricks)

We've built dashboards in a Search pack that let you:

Search across all datasets for specific terms or IPs
Filter results interactively
Send only matching events to your data lake
Perfect for incident response and investigations

You'll learn how to use these dashboards for real investigations and customize them for your needs.
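The "search first, send only what matches" flow behind these dashboards can be sketched in a few lines of Python. This is only an illustration, not how Cribl Search is implemented: the in-memory `DATASETS` dictionary, the field names, and the `search_all` and `export` helpers are all hypothetical stand-ins for real datasets and a real export step.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-ins for datasets; real data would live in Cribl Lake.
DATASETS: Dict[str, List[dict]] = {
    "web_access": [
        {"src_ip": "10.0.0.5", "status": 200, "path": "/menu"},
        {"src_ip": "203.0.113.7", "status": 401, "path": "/login"},
    ],
    "firewall": [
        {"src_ip": "203.0.113.7", "action": "deny", "port": 22},
        {"src_ip": "10.0.0.9", "action": "allow", "port": 443},
    ],
}


def search_all(datasets: Dict[str, List[dict]], term: str) -> Iterator[Tuple[str, dict]]:
    """Yield (dataset_name, event) for each event with a field value containing term."""
    for name, events in datasets.items():
        for event in events:
            # Plain substring match on every field value, rendered as text.
            if any(term in str(value) for value in event.values()):
                yield name, event


def export(matches: Iterator[Tuple[str, dict]]) -> List[dict]:
    """Hypothetical export step: keep only the matches, tagged with their source dataset."""
    return [{**event, "dataset": name} for name, event in matches]


exported = export(search_all(DATASETS, "203.0.113.7"))
total = sum(len(events) for events in DATASETS.values())
print(f"exported {len(exported)} of {total} events")
```

Only the two events that mention the suspicious IP leave the "lake"; the other two stay where they are. Note that a plain substring match would also hit longer addresses that contain the term, so a real query would usually match on a specific field.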

2. Automated Data Tiering (The Architecture)

We've built routes in a Stream pack that automatically:

Analyze incoming web logs to determine business impact
Route critical events (5xx errors, auth failures) to Tier 1 SIEM
Send security/UX events to Tier 2 Lakehouse
Archive everything else to Tier 3 for compliance
Reduce downstream costs by 70-90%

You'll learn how the tiering logic works, experiment with the routes, and adapt them to your data.
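As a rough mental model of the tiering rules listed above, here is a minimal Python sketch in which the first matching rule wins. It is not the pack's actual route configuration. The field names (`status`, `path`, `response_time_ms`), the choice of 401/403 as "auth failures", and the Tier 2 conditions are assumptions made up for illustration.

```python
from enum import Enum


class Tier(Enum):
    HOT = "tier1_siem"        # expensive SIEM
    WARM = "tier2_lakehouse"  # cost-effective Lakehouse
    COLD = "tier3_lake"       # cheap long-term Lake storage


def classify(event: dict) -> Tier:
    """Return the tier for a web log event; rules are checked in order, first match wins."""
    status = int(event.get("status", 0))

    # Critical: 5xx server errors and auth failures (assumed here to be 401/403).
    if 500 <= status <= 599 or status in (401, 403):
        return Tier.HOT

    # Security/UX signals (hypothetical rules): admin paths or slow responses.
    is_admin_path = str(event.get("path", "")).startswith("/admin")
    is_slow = float(event.get("response_time_ms", 0)) > 2000
    if is_admin_path or is_slow:
        return Tier.WARM

    # Everything else is archived for compliance.
    return Tier.COLD


def tag(event: dict) -> dict:
    """Return a copy of the event with a 'tier' metadata field that routing can key on."""
    return {**event, "tier": classify(event).value}


print(tag({"status": 503, "path": "/checkout"}))
print(tag({"status": 200, "path": "/admin/users"}))
print(tag({"status": 200, "path": "/menu", "response_time_ms": 120}))
```

Writing the decision into a `tier` field, instead of routing directly, keeps the classification visible on the event itself, which is the idea behind tagging events with metadata that drives routing.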

Prerequisites

You should have basic familiarity with:

Cribl Stream concepts (Sources, Destinations, Routes, Pipelines)
Cribl Lake and Cribl Search fundamentals
Log analysis and filtering expressions

If you're brand new to Cribl Lake, we recommend checking out the Cribl Lake Overview Sandbox first.

Conventions Used in This Course

Throughout this course, we'll use the following formatting:

Do This Now

These blocks contain step-by-step instructions. Follow them carefully!

Good to Know

Optional context and additional information that helps you understand the "why" behind what you're doing.

Don't Skip This

Critical information that will save you from headaches later.

Pro Tip

Advanced techniques and shortcuts for power users.

Course Philosophy

Search First, Send Second: Don't export everything and search later. Search first, export only what matters.

Tag Everything: Metadata drives intelligent routing. The more you tag, the smarter your system becomes.

Cost-Conscious Architecture: Every event has a cost. Route based on value, not volume.

Human-in-the-Loop: Automation handles the bulk work, but analysts make the critical decisions.
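To make the cost-conscious principle concrete, the arithmetic below compares sending everything to the most expensive tier with splitting the same volume across three tiers. Every number here is invented for illustration: the relative per-GB costs and the volume split are assumptions, not Cribl or vendor pricing, so substitute your own figures.

```python
# Hypothetical relative cost per GB for each tier (SIEM normalized to 1.0).
COST_PER_GB = {"tier1_siem": 1.00, "tier2_lakehouse": 0.20, "tier3_lake": 0.02}

# Hypothetical share of total volume routed to each tier; shares must sum to 1.
VOLUME_SHARE = {"tier1_siem": 0.10, "tier2_lakehouse": 0.20, "tier3_lake": 0.70}


def blended_cost(cost_per_gb: dict, volume_share: dict) -> float:
    """Average cost per GB when volume is split across tiers."""
    assert abs(sum(volume_share.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(cost_per_gb[tier] * share for tier, share in volume_share.items())


all_in_siem = COST_PER_GB["tier1_siem"]            # baseline: every GB goes to the SIEM
tiered = blended_cost(COST_PER_GB, VOLUME_SHARE)   # 0.10*1.00 + 0.20*0.20 + 0.70*0.02
reduction = 1 - tiered / all_in_siem

print(f"blended cost per GB: {tiered:.3f} (baseline {all_in_siem:.2f})")
print(f"reduction vs. all-in-SIEM: {reduction:.1%}")
```

With these made-up inputs the blended cost is about 0.154 per GB. The takeaway is the shape of the calculation, not the specific percentage: savings depend almost entirely on how small a share of events truly needs the top tier.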

Time to Get Started

Ready to build a data architecture that's both powerful and cost-effective?

Let's dive in.

Next: Setting Up Your Environment →
