
Because of where AppScope sits in an application, we can easily observe all the traffic going to and coming from the filesystem. This is useful for basic observability, because it lets us answer questions like: "Why is this application consuming so much filesystem I/O?" "What files does this application open?" "What configuration files does this application use?" As noted earlier, we have nginx running in the background. Let's pair the scope CLI with some basic command line utilities to answer the question "What files has this application opened?"

important

Nginx Open Files

  1. In Terminal 1, run:
    scope events --id 1 -s fs.open

This shows the last 20 fs.open events, that is, the 20 most recent files nginx has opened. We're using --id to tell scope to access session 1, which is running nginx in the background. -s tells scope to filter to the source fs.open, which contains only filesystem open events. We can scroll through the list of files that nginx has opened, but we can also pair the scope CLI with jq and classic utilities like sort and uniq to answer "What files has this application opened?" more definitively.

important

Nginx Open Files

  1. In Terminal 1, run:
    scope events --id 1 -a -j | jq -r '.data.file' | sort | uniq

A few important things to note about this slightly more complex command:

  • -a says to output all events, not just the default last 20.
  • -j outputs events as JSON.
  • jq filters down to just the file names out of those events.
  • sort and uniq help us find only the unique filenames that have been opened.
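The jq, sort, and uniq stages above can also be sketched in Python, which may help when the result needs further processing. This is a minimal sketch, not part of AppScope: it assumes each event arrives as one JSON object per line with the file name under data.file (the same field the jq filter reads), and the function name unique_files and the sample events are hypothetical.

```python
import json
from typing import Iterable, List


def unique_files(lines: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated data.file values found in JSON event lines."""
    names = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        data = event.get("data") if isinstance(event, dict) else None
        name = data.get("file") if isinstance(data, dict) else None
        if isinstance(name, str):
            names.add(name)  # the set plays the role of uniq
    return sorted(names)  # plays the role of sort


# Hypothetical sample events, shaped like the fields the jq filter relies on.
sample_events = [
    '{"data": {"file": "/etc/passwd"}}',
    '{"data": {"file": "/etc/nginx/nginx.conf"}}',
    '{"data": {"file": "/etc/passwd"}}',
    '{"data": {}}',
]

for name in unique_files(sample_events):
    print(name)
```

Unlike the jq pipeline, this sketch silently skips events that have no data.file field instead of printing null for them. To run it against real events, the same function could be given sys.stdin in place of sample_events.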

This is very powerful: we can take structured data and, using basic utilities, answer an interesting question. Instead of guessing, we know that nginx is opening SSL certificate keys, OpenSSL configurations, /etc/passwd, its own configurations, and log files. In addition, we're doing this on one of the most popular web servers, written in C, which has historically been extremely difficult to instrument. Lastly, to get this information, we only needed to prepend scope to the nginx command.

Log Files

One of the most important ways to observe an application is by reading its log files. Without AppScope, this data is usually collected by a log agent configured to tail the log files the application writes. This approach is tried and true, and it works well on millions upon millions of hosts. But a few problems with it are emerging. In containerized workloads, we can't easily have one agent collect the logs for every application, because those logs are buried inside containers.

As a result, a pattern is emerging of attaching sidecars to running containers to pick up the log files inside them. This works, but it means taking an agent that was designed to collect many log files from one OS instance and running many copies of it, one per application. Sidecars consume significant resources in containerized deployments because log agents weren't designed to scale down that far.

AppScope makes this significantly better. Because AppScope is inside the application, it sees all the bytes the application writes to the filesystem as it's writing them. With simple heuristics, we can detect that data is log data, and write it to disk in a structured way or forward it along to a logging tool. In the preview section, as we were looking at scope.yml, there was a relevant configuration:

note

This is not an action to copy/paste or execute, it's merely showing an example configuration:

    watch:
      - type: file
        name: '[\s\/\\\.]log[s]?[\/\\\.]?'
        value: .*
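To get a feel for which paths a pattern like this selects, the name expression can be tried out in Python. This is only a sketch under assumptions: it uses Python's re module rather than AppScope's own regex engine, it assumes the pattern is applied as an unanchored search over the file path, and the helper name looks_like_log and the sample paths are hypothetical.

```python
import re

# The `name` pattern copied from the example configuration above.
LOG_NAME = re.compile(r'[\s\/\\\.]log[s]?[\/\\\.]?')


def looks_like_log(path: str) -> bool:
    """Return True if the pattern matches anywhere in the path (unanchored search)."""
    return LOG_NAME.search(path) is not None


# Hypothetical sample paths, for illustration only.
for path in ["willsee.log", "wontsee.txt", "/var/log/nginx/access.log", "catalog.txt"]:
    print(f"{path}: {looks_like_log(path)}")
```

Under these assumptions, log or logs has to follow whitespace, a slash, a backslash, or a dot, so catalog.txt is not selected. The trailing separator is optional, though, so a path such as /etc/login.defs would also match in this sketch.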

The name regular expression in this configuration tells AppScope to watch for file traffic to files that contain log in the path or the filename. Let's see how this works in practice. Our environment has a simple script that demonstrates it, log.py. Let's look at log.py:

important

Simple Python Script

  1. In Terminal 1, run:
    bat log.py

You may not know Python, but the script you've just listed should be pretty easy to read. It writes data to two files. One file, wontsee.txt, does not contain log in the name, so AppScope will see the traffic but not the contents. The other, willsee.log, does contain log in the name, so AppScope will output its contents as well. Let's see it in action by running the script under scope and then looking at the events it generated.

important

Scoping Python

  1. In Terminal 1, run:
    scope python3 log.py
  2. In Terminal 1, run:
    scope events -s fs.close

Several things stand out in this scope session. First, Python opens a lot of files, and we can see that easily without doing anything to Python itself to observe it. Second, we can see that our script does open willsee.log and wontsee.txt. Now, let's look at the file event and see the contents:

important

Log Event

  1. In Terminal 1, run:
    scope events -t file

There, we can clearly see the log message written by our simple script. With libscope.so acting as the agent, we can pick up log data with essentially zero configuration. Often, as operators or security people, we don't even know where an application is writing its log data. With AppScope, we no longer need to know about every log file, because we see the data as it's being written. Our lightweight instrumentation library can also eliminate the resource waste created by sidecars.

Filesystem and log traffic is useful, but scope gets really interesting when we observe an application's interactions with other systems over the network. Next, we'll look at one of the simplest network programs, nc.