⚠️ We are currently working on improvements of this book. Watch the GitHub repository or wait for a release announcement on rOpenSci blog.

21 security

21.1 API keys and such

The configuration parameter filter_sensitive_data accepts a named list.

Each element in the list should be of the following format:

thing_to_replace_it_with = thing_to_replace

We replace all instances of thing_to_replace with thing_to_replace_it_with.

Before recording (writing to a cassette) we do the replacement and then when reading from the cassette we do the reverse replacement to get back to the real data.

The before record replacement happens in an internal function write_interactions(), while before playback replacement happens in internal function YAML$deserialize_path()


vcr_configure(
  filter_sensitive_data = list("<<<my_api_key>>>" = Sys.getenv('API_KEY'))
)

You want to make the string that replaces your sensitive string something that won’t be easily found elsewhere in the response body/headers/etc.

It’s a good idea to not in place of thing_to_replace put your actual sensitive key thing, because that defeats the purpose of trying to protect your private data. This is why we highly recommend setting your API keys as environment variables, then you can as seen above just put a call to Sys.getenv(), which we’ll use internally to get your key, find it anywhere in the HTTP responses, and replace it with your placeholder string.

The reason you want to do this is because you may on purpose or on accident push your cassettes to the public web, and when that happens you don’t want your private keys in those cassettes.

Note that the way this is implemented in vcr is not super elegant and is not general with respect to the serializer. We only support YAML serializing right now, but when we support other serializers we’ll need to change the implementation.

21.2 API keys and tests run in varied contexts

When vcr enabled tests are run in different contexts (laptops, CI services, containers, etc.), and those tests can optionally use authentication you can run into problems. An example will best illustrate the issue.

Given an R package foo, the package maintainer sets up tests with vcr. Functions in package foo can optionally use an API key supplied by the user. For the purposes of this example, the API key when given is included as a query parameter (it is not good pratice to pass API keys as query parameter, but its not uncommon).

When the maintainer runs tests locally on their own machine they can unset the environment variable that holds their API key so its not included in vcr cassettes. Additionally, when they run tests on CI systems (e.g., Travis-CI), they do not have an API key set. When tests are run in either of these two locations, the API key is not included in the requests, and thus not included in the cassettes.

Now, consider a contributor that forks the repository and as a first run through the package installs the package, then runs tests. If this contributor does not have an API key set, the tests should run fine. However, once the contributor sets their API key and attempts to run tests (or imagine another contributor that already has an API key set), the tests will fail because the URI now contains an API key as a query parameter in the URL.

There’s various combinations of the above problem.

One option for dealing with this is requiring an API key to be set. If you do this you likely want to make sure your actual key is not in the cassettes that will be pushed to the public web; see the bit about filter_sensitive_data above for that. If you do require an API key, then you can ensure that whenever tests are run an API key is required; that way there’s no way that tests will be run with AND without a key - which can lead to problems. Requiring an API key means you can’t run tests in environments where it’s not possible to safely set a key, e.g., on CRAN (in which case you must skip tests).

There’s no magic way around this problem ~ it’s a good thing to be aware of, especially if other people contribute to your package.

21.3 Other security

Let us know about any other security concerns! Surely there’s things we haven’t considered yet.