Searching APIs (cont.)

Fine work. Now, to process the data, we'll create our dataset.

Generic API Dataset Creation

important
  1. Ensure the Data tab is still selected in the top navigation bar.
  2. Click Datasets in the left navigation bar.
  3. Click Add Dataset.
  4. In the ID field enter sbx_breachdata_<your first name>.
  5. In the Description field enter Search Sandbox Breach Data.
  6. Ensure Dataset Provider is selected in the left navigation bar.
  7. Click the Select a Provider dropdown and select the sbx_hibp_<your first name> provider that you just created.
  8. Below Enabled endpoints click Add endpoint.
  9. Click into the empty box and select breaches.
  10. Click Save.

Generic API Dataset Processing

Normally, this would be the point in our journey where we'd select datatypes to apply to our dataset. However, since this is a custom provider, there isn't an existing datatype that will work for our data. Welp, looks like we'll have to create one.

The first part of creating a datatype is to create a ruleset. If you recall, this will be a list of rules that are used to process the events.

important
  1. Click Settings in the top navigation bar.

  2. Click Datatypes.

  3. Click Add Ruleset.

  4. In the ID field enter HIBP Breaches.

  5. In the Description field enter:

    HIBP API Breaches Data (array of multi-line JSON)

Rules

Now we need to add a rule to our ruleset. If there were multiple formats or types of events in our data, we would create a rule for each. Since our data comes in a single format, we'll create just one rule.

important
  1. Click Add Rule.
  2. In the Name field enter Breaches v3.

Before we start editing our config options, don't you think it would be great if we could at least SEE the data? Rhetorical question. Of course!

Cribl Search allows you to load the data from the dataset that you created so that you can verify your data type settings in real time.

important
  1. Above the sample pane to the right click Upload dataset.
  2. Select the sbx_breachdata_<your first name> dataset that you just created.

The data returned by the API is a JSON array and looks like Jabba the Hutt (one giant messy blob), but we're about to fix that.

Event Breaker

important
  1. After the dataset loads, click the dropdown next to EVENT BREAKER SETTINGS.

  2. Ensure Enabled is set to Yes.

  3. In the Event Breaker type select JSON Array.

    note

    Notice that Cribl Search automatically breaks the array into individual events and highlights them in the sample.

  4. Above the sample pane to the right click Out.

    note

    Cribl Search also shows the sample in event view.

  5. Set JSON extract fields to Yes.

    note

    You should now be able to see all the JSON fields parsed in the sample pane.

  6. In Timestamp field enter ModifiedDate.
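
To make the Event Breaker settings above more concrete, here is a minimal Python sketch of the same idea outside of Cribl Search: split a JSON array into one event per element, promote the JSON keys to fields, and read the event time from ModifiedDate. This is not how Cribl Search is implemented. The sample payload is made up (only the field names follow the HIBP breaches response), the function name `break_json_array` is hypothetical, and storing the results in `_raw` and `_time` is an assumption made for illustration.

```python
import json
from datetime import datetime, timezone

# Made-up sample shaped like an HIBP breaches response (values are illustrative).
raw_response = """[
  {"Name": "ExampleOne", "Domain": "example.com",
   "ModifiedDate": "2020-01-02T03:04:05Z", "PwnCount": 1200},
  {"Name": "ExampleTwo", "Domain": "example.org",
   "ModifiedDate": "2021-06-07T08:09:10Z", "PwnCount": 34}
]"""


def break_json_array(raw: str, timestamp_field: str) -> list:
    """Split a JSON array into events and extract fields plus a timestamp."""
    events = []
    for element in json.loads(raw):
        # "JSON extract fields": every key becomes a top-level field.
        event = dict(element)
        # Keep the original element text alongside the parsed fields.
        event["_raw"] = json.dumps(element)
        # "Timestamp field": parse the ISO 8601 value into epoch seconds.
        iso_value = element[timestamp_field].replace("Z", "+00:00")
        event["_time"] = datetime.fromisoformat(iso_value).timestamp()
        events.append(event)
    return events


events = break_json_array(raw_response, "ModifiedDate")
for e in events:
    when = datetime.fromtimestamp(e["_time"], tz=timezone.utc)
    print(e["Name"], e["PwnCount"], when.isoformat())
```

Each element of the array ends up as its own event, which is what the highlighted sample and the Out view show in the UI.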

Adding Fields

Great, events are broken out and parsed. Now let's add some metadata to each event because c'mon, who doesn't love some metadata?

important
  1. Click the dropdown next to ADD FIELDS TO EVENTS.

    note

    Notice that the datatype field has been added and set for you using the name of the ruleset and the name of the rule. If you'd like you can change that here.

  2. Replace the value expression of the datatype field with 'hibp_breachesv3'.

  3. Click Add field and enter these details:
    Name: dataSource
    Value Expression: 'hibp_breaches'

  4. Click Add field and enter these details:
    Name: haveibeenpwned
    Value Expression: 'https://haveibeenpwned.com/'

    note

    You should now be able to see our added fields in the sample pane.

  5. Click OK.
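
Conceptually, the steps above attach the same three constant fields to every event the rule processes. The Python sketch below illustrates that effect on a made-up event. The names `ADDED_FIELDS` and `add_fields` are hypothetical, and this is only an approximation of the behavior, not Cribl Search's own code.

```python
# The three fields configured above, each with a constant string value.
ADDED_FIELDS = {
    "datatype": "hibp_breachesv3",
    "dataSource": "hibp_breaches",
    "haveibeenpwned": "https://haveibeenpwned.com/",
}


def add_fields(event: dict) -> dict:
    """Return a copy of the event with the metadata fields attached."""
    return {**event, **ADDED_FIELDS}


# Made-up event, standing in for one parsed breach record.
event = {"Name": "ExampleOne", "PwnCount": 1200}
enriched = add_fields(event)
print(enriched)
```

Because the values are constants, every event from this rule carries identical metadata, which makes it easy to search for these events later by datatype or dataSource.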

Filtering

Whoops! We get a message saying that it is recommended to use a Filter Condition. This is because, without a filter condition, Cribl Search would have to compare all data from any dataset(s) attached to this ruleset against this rule.

For better performance we'll want to find something unique to all the data that this rule should be applied to and use that as our filter condition.

important
  1. In the Filter condition field enter:

    _raw.includes("PwnCount")
  2. Click OK.

  3. Click Save.
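
To see why the filter condition helps, here is a small Python sketch of the same check: a cheap substring test on the raw event text decides whether the rule's heavier processing runs at all. The function names and sample events are hypothetical, and the Python `in` test only mirrors what the `_raw.includes("PwnCount")` expression does. It is not a description of Cribl Search internals.

```python
def matches_filter(event: dict) -> bool:
    """Python equivalent of the filter expression _raw.includes("PwnCount")."""
    return "PwnCount" in event.get("_raw", "")


def select_for_rule(events: list) -> list:
    """Keep only the events this rule should process; skip everything else."""
    return [e for e in events if matches_filter(e)]


# Made-up events: one breach record and one unrelated log line.
sample_events = [
    {"_raw": '{"Name": "ExampleOne", "PwnCount": 1200}'},
    {"_raw": "GET /index.html 200"},
]

selected = select_for_rule(sample_events)
print(len(selected), "of", len(sample_events), "events match the filter")
```

Only events whose raw text contains PwnCount reach the rule, so unrelated data attached to the same ruleset is skipped early.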