Searching APIs (cont.)
Fine work. Now, to process the data, we'll create our dataset.
Generic API Dataset Creation
- Ensure the Data tab is still selected in the top navigation bar.
- Click Datasets in the left navigation bar.
- Click Add Dataset.
- In the ID field enter sbx_breachdata_<your first name>.
- In the Description field enter Search Sandbox Breach Data.
- Ensure Dataset Provider is selected in the left navigation bar.
- Click the Select a Provider dropdown and select the sbx_hibp_<your first name> provider that you just created.
- Below Enabled endpoints click Add endpoint.
- Click into the empty box and select breaches.
- Click Save.
Generic API Dataset Processing
Normally, this would be the point in our journey where we'd select datatypes to apply to our dataset. However, since this is a custom provider, there isn't an existing datatype that will work for our data. Welp, looks like we'll have to create one.
The first part of creating a datatype is to create a ruleset. If you recall, this will be a list of rules used to process the events.
- Click Settings in the top navigation bar.
- Click Datatypes.
- Click Add Ruleset.
- In the ID field enter HIBP Breaches.
- In the Description field enter: HIBP API Breaches Data (array of multi-line JSON)
Rules
Now we need to add a rule to our ruleset. If there were multiple formats or types of events in our data, we would create a rule for each, but since our data comes in a single format, we'll create just one rule.
- Click Add Rule.
- In the Name field enter Breaches v3.
Before we start editing our config options, don't you think it would be great if we could at least SEE the data? Rhetorical question. Of course!
Cribl Search allows you to load the data from the dataset that you created so that you can verify your data type settings in real time.
- Above the sample pane to the right click Upload dataset.
- Select the sbx_breachdata_<your first name> dataset that you just created.
The data returned by the API is a JSON array and looks like Jabba the Hutt (one giant messy blob), but we're about to fix that.
Event Breaker
- After the dataset loads, click the dropdown next to EVENT BREAKER SETTINGS.
- Ensure Enabled is set to Yes.
- In the Event Breaker type select JSON Array.
  Note: Notice that Cribl Search was able to automatically break the array into individual events and highlights them in the sample.
- Above the sample pane to the right click Out.
  Note: Cribl Search also shows the sample in event view.
- Set JSON extract fields to Yes.
  Note: You should now be able to see all the JSON fields parsed in the sample pane.
- In Timestamp field enter ModifiedDate.
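To make the three settings above concrete, here is a minimal sketch of what they do conceptually: break a JSON array into events, extract the JSON fields, and take the event time from ModifiedDate. This is not Cribl Search's implementation. The helper names (breakJsonArray, extractFields, setTimestamp) are hypothetical, the sample values are invented, and storing the time as epoch seconds in _time is an assumption.

```javascript
// Hypothetical raw payload: a JSON array delivered as one multi-line string.
// Field names follow the HIBP breach data; the values are invented.
const raw = `[
  {
    "Name": "ExampleOne",
    "Domain": "example.com",
    "ModifiedDate": "2020-01-02T03:04:05Z",
    "PwnCount": 1200
  },
  {
    "Name": "ExampleTwo",
    "Domain": "example.org",
    "ModifiedDate": "2021-06-07T08:09:10Z",
    "PwnCount": 34
  }
]`;

// "JSON Array" breaker: one event per array element, element text kept as _raw.
function breakJsonArray(text) {
  return JSON.parse(text).map((item) => ({ _raw: JSON.stringify(item) }));
}

// "JSON extract fields": parse _raw and copy its top-level keys onto the event.
function extractFields(event) {
  return { ...event, ...JSON.parse(event._raw) };
}

// "Timestamp field": use the named field as the event time (assumed epoch seconds).
function setTimestamp(event, field) {
  return { ...event, _time: Date.parse(event[field]) / 1000 };
}

const events = breakJsonArray(raw)
  .map(extractFields)
  .map((e) => setTimestamp(e, "ModifiedDate"));

console.log(events.length);
console.log(events[0].Name, events[0]._time);
```

Each step maps to one setting, which is why the sample pane changes after each one: first the blob splits into events, then the fields appear, then the event time follows ModifiedDate.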
Adding Fields
Great, events are broken out and parsed. Now let's add some metadata to each event because, c'mon, who doesn't love some metadata?
- Click the dropdown next to ADD FIELDS TO EVENTS.
  Note: Notice that the datatype field has been added and set for you using the name of the ruleset and the name of the rule. If you'd like, you can change that here.
- Replace the value expression of the datatype field with 'hibp_breachesv3'.
- Click Add field and enter these details:
  Name: dataSource
  Value Expression: 'hibp_breaches'
- Click Add field and enter these details:
  Name: haveibeenpwned
  Value Expression: 'https://haveibeenpwned.com/'
  Note: You should now be able to see our added fields in the sample pane.
- Click OK.
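The quotes around each value matter because value expressions are evaluated as JavaScript, so a quoted value is a string literal rather than a reference to another field. A rough sketch of the idea follows, assuming each expression is evaluated once per event and the result is attached under the given name. The addFields helper and the sample event are hypothetical and only illustrate the behavior.

```javascript
// The three name / value-expression pairs from the steps above.
// Each value is a JavaScript expression; here all are quoted string literals.
const fieldsToAdd = [
  { name: "datatype", value: "'hibp_breachesv3'" },
  { name: "dataSource", value: "'hibp_breaches'" },
  { name: "haveibeenpwned", value: "'https://haveibeenpwned.com/'" },
];

// Hypothetical helper: evaluate each expression and attach the result to a copy
// of the event, leaving the original event untouched.
function addFields(event, fields) {
  const out = { ...event };
  for (const { name, value } of fields) {
    out[name] = new Function(`return (${value});`)();
  }
  return out;
}

const event = { Name: "ExampleOne", PwnCount: 1200 }; // invented sample event
const enriched = addFields(event, fieldsToAdd);
console.log(enriched);
```

Every event processed by the rule ends up with the same three constant fields, which makes the data easy to identify later in a search.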
Filtering
Whoops! We get a message saying that a Filter Condition is recommended. This is because, without a filter condition, Cribl Search would have to compare all data from any dataset(s) attached to this ruleset against this rule.
For better performance we'll want to find something unique to all the data that this rule should be applied to and use that as our filter condition.
- In the Filter condition field enter: _raw.includes("PwnCount")
- Click OK.
- Click Save.
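As a simplified illustration of why the filter helps, the sketch below applies a rule only to events whose raw text contains PwnCount and leaves everything else alone. The rule object, the applyRule helper, and the sample events are hypothetical. How Cribl Search actually matches rules internally is not shown in this walkthrough, so treat this as a mental model only.

```javascript
// Invented sample events: one HIBP-style breach record and one unrelated log line.
const events = [
  { _raw: '{"Name":"ExampleOne","PwnCount":1200}' },
  { _raw: "127.0.0.1 - - GET /index.html 200" },
];

// The filter condition from the step above, written as a predicate on the event.
const breachesFilter = (event) => event._raw.includes("PwnCount");

// Hypothetical rule: its processing runs only when the filter matches.
const rule = {
  name: "Breaches v3",
  filter: breachesFilter,
  process: (event) => ({ ...event, datatype: "hibp_breachesv3" }),
};

// Apply the rule if the filter matches; otherwise return the event unchanged.
function applyRule(event, r) {
  return r.filter(event) ? r.process(event) : event;
}

const results = events.map((e) => applyRule(e, rule));
console.log(results.map((e) => e.datatype));
```

A cheap substring check like this lets non-matching data skip the rule entirely, which is the performance benefit the recommendation refers to.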