⚠️ We are currently working on improvements of this book. Watch the GitHub repository or wait for a release announcement on rOpenSci blog.

15 Caching HTTP requests

Record HTTP calls and replay them

15.1 Package documentation

Check out https://docs.ropensci.org/vcr/ for documentation on vcr functions.

15.2 Terminology

  • http: hyptertext transfer protocol
  • vcr: the name comes from the idea that we want to record something and play it back later, like a VCR
  • cassette: A thing to record HTTP interactions to. Right now the only option is file system, but in the future could be other things, e.g. a key-value store like Redis
  • Persister: defines how to save requests - currently only option is the file system
  • Serializers: defines how to serialize the HTTP response; that is, how the data is stored on whatever the persister is (right now only file system). Currently only option is YAML; other options in the future could include e.g. JSON
  • insert cassette: create a cassette (all HTTP interactions will be recorded to this cassette). once a cassette is inserted, we don’t allow insertion of additional cassettes
  • eject cassette: eject the cassette (no longer recording to that cassette). however, if any interactions were written to disk, those are still stored there
  • replay: refers to using a cached result of an http request that was recorded earlier
  • recording: this means you’ve set vcr in a mode in which we can record HTTP interactions. sometimes recording can be not possible given user configuration or otherwise

15.3 Basic usage


library(vcr)
library(crul)

cli <- crul::HttpClient$new(url = "https://api.crossref.org")
system.time(
  use_cassette(name = "helloworld", {
    cli$get("works", query = list(rows = 3))
  })
)
#>    user  system elapsed 
#>   0.081   0.006  30.217

The request gets recorded, and all subsequent requests of the same form used the cached HTTP response, and so are much faster


system.time(
  use_cassette(name = "helloworld", {
    cli$get("works", query = list(rows = 3))
  })
)
#>    user  system elapsed 
#>   0.103   0.005   0.108

Importantly, your unit test deals with the same inputs and the same outputs - but behind the scenes you use a cached HTTP resonse - thus, your tests run faster.

The cached response looks something like (condensed for brevity):

http_interactions:
- request:
    method: get
    uri: https://api.crossref.org/works?rows=3
    body:
      encoding: ''
      string: ''
    headers:
      User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.5.2
      Accept-Encoding: gzip, deflate
      Accept: application/json, text/xml, application/xml, */*
  response:
    status:
      status_code: '200'
      message: OK
      explanation: Request fulfilled, document follows
    headers:
      status: HTTP/1.1 200 OK
      content-type: application/json;charset=UTF-8
      vary: Accept
      access-control-allow-origin: '*'
      access-control-allow-headers: X-Requested-With
      content-length: '5360'
      server: http-kit
      date: Sat, 28 Apr 2018 15:12:29 GMT
      x-rate-limit-limit: '50'
      x-rate-limit-interval: 1s
      connection: close
    body:
      encoding: UTF-8
      string: '{"status":"ok","message-type":"work-list","message-version":"1.0.0","message":{"facets":{},"total-results":96454147,"items":[{"indexed":{"date-parts":[[2017,10,23]],"date-time":"2017-10-23T19:27:25Z","timestamp":1508786845330},"reference-count":0,"publisher":"Elsevier
        BV","issue":"3","license":[{"URL":"http:\/\/www.elsevier.com\/tdm\/userlicense\/1.0\/","start":{"date-parts":[[1999,10,1]],"date-time":"1999-10-01T00:00:00Z","timestamp":938736000000},"delay-in-days":0,"content-version":"tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Progress
        in Planning"],"published-print":{"date-parts":[[1999,10]]},"DOI":"10.1016\/s0305-9006(99)00007-0","type":"journal-article","created":{"date-parts":[[2002,7,26]],"date-time":"2002-07-26T00:11:38Z","timestamp":1027642298000},"page":"vii","source":"Crossref","is-referenced-by-count":0,"title":["Editorial"],"prefix":"10.1016","volume":"52","member":"78","container-title":["Progress
        in Planning"],"link":[{"URL":"http:\/\/api.elsevier.com\/content\/article\/PII:S0305900699000070?httpAccept=text\/xml","content-type":"text\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/api.elsevier.com\/content\/article\/PII:S0305900699000070?httpAccept=text\/plain","content-type":"text\/plain","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2015,9,9]],"date-time":"2015-09-09T06:07:58Z","timestamp":1441778878000},"score":1.0,"issued":{"date-parts":[[1999,10]]},"references-count":0,"alternative-id":["S0305900699000070"],"URL":"http:\/\/dx.doi.org\/10.1016\/s0305-9006(99)00007-0","ISSN":["0305-9006"],"issn-type":[{"value":"0305-9006","type":"print"}],"subject":["Geography,
        Planning and Development"]},{"indexed":{"date-parts":[[2017,10,23]],"date-time":"2017-10-23T19:27:25Z","timestamp":1508786845347},"reference-count":0,"publisher":"Elsevier
        BV","issue":"4","license":[{"URL":"http:\/\/www.elsevier.com\/tdm\/userlicense\/1.0\/","start":{"date-parts":[[1998,12,1]],"date-time":"1998-12-01T00:00:00Z","timestamp":912470400000},"delay-in-days":0,"content-version":"tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Progress
        in Planning"],"published-print":{"date-parts":[[1998,12]]},"DOI":"10.1016\/s0305-9006(98)00020-8","type":"journal-article","created":{"date-parts":[[2002,7,26]],"date-time":"2002-07-26T00:11:38Z","timestamp":1027642298000},"page":"VI","source":"Crossref","is-referenced-by-count":0,"title":["Preface"],"prefix":"10.1016","volume":"50","author":[{"given":"Maurice","family":"Yeats","affiliation":[]}],"member":"78","container-title":["Progress
        in Planning"],"link":[{"URL":"http:\/\/api.elsevier.com\/content\/article\/PII:S0305900698000208?httpAccept=text\/xml","content-type":"text\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/api.elsevier.com\/content\/article\/PII:S0305900698000208?httpAccept=text\/plain","content-type":"text\/plain","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2015,9,9]],"date-time":"2015-09-09T06:07:58Z","timestamp":1441778878000},"score":1.0,"issued":{"date-parts":[[1998,12]]},"references-count":0,"alternative-id":["S0305900698000208"],"URL":"http:\/\/dx.doi.org\/10.1016\/s0305-9006(98)00020-8","ISSN":["0305-9006"],"issn-type":[{"value":"0305-9006","type":"print"}],"subject":["Geography,
        Planning and Development"]},{"indexed":{"date-parts":[[2017,10,23]],"date-time":"2017-10-23T19:27:25Z","timestamp":1508786845389},"reference-count":24,"publisher":"Elsevier
        BV","issue":"1","license":[{"URL":"http:\/\/www.elsevier.com\/tdm\/userlicense\/1.0\/","start":{"date-parts":[[1999,1,1]],"date-time":"1999-01-01T00:00:00Z","timestamp":915148800000},"delay-in-days":0,"content-version":"tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Powder
        Technology"],"published-print":{"date-parts":[[1999,1]]},"DOI":"10.1016\/s0032-5910(98)00119-3","type":"journal-article","created":{"date-parts":[[2002,7,26]],"date-time":"2002-07-26T00:11:20Z","timestamp":1027642280000},"page":"17-30","source":"Crossref","is-referenced-by-count":0,"title":["Load
        diversion by embedding in crushed salt"],"prefix":"10.1016","volume":"101","author":[{"given":"W.","family":"Feuser","affiliation":[]},{"given":"H.","family":"Vijgen","affiliation":[]},{"given":"E.","family":"Barnert","affiliation":[]}],"member":"78","container-title":["Powder
        Technology"],"link":[{"URL":"http:\/\/api.elsevier.com\/content\/article\/PII:S0032591098001193?httpAccept=text\/xml","content-type":"text\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/api.elsevier.com\/content\/article\/PII:S0032591098001193?httpAccept=text\/plain","content-type":"text\/plain","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2017,6,15]],"date-time":"2017-06-15T02:22:06Z","timestamp":1497493326000},"score":1.0,"issued":{"date-parts":[[1999,1]]},"references-count":24,"alternative-id":["S0032591098001193"],"URL":"http:\/\/dx.doi.org\/10.1016\/s0032-5910(98)00119-3","ISSN":["0032-5910"],"issn-type":[{"value":"0032-5910","type":"print"}],"subject":["General
        Chemical Engineering"]}],"items-per-page":3,"query":{"start-index":0,"search-terms":null}}}'
  recorded_at: 2018-04-28 15:12:29 GMT
  recorded_with: vcr/0.0.8.9521, webmockr/0.2.2.9119, crul/0.5.2

All components of both the request and response are preserved, so that the HTTP client (in this case crul) can reconstruct its own response just as it would if it wasn’t using vcr.

15.4 vcr enabled testing

15.4.1 check vs. test

TLDR: Run devtools::test() before running devtools::check()

When running tests or checks of your whole package, note that you’ll get different results with devtools::check() vs. devtools::test(). This arises because devtools::check() runs in a temporary directory and files created (vcr cassettes) are only in that temporary directory and thus don’t persist after devtools::check() exits.

However, devtools::test() does not run in a temporary directory, so files created (vcr cassettes) are in whatever directory you’re running it in.

Alternatively, you can run devtools::test_file() to create your vcr cassettes.

The same goes for using the RStudio IDE buttons/keyboard shortcuts.

15.4.2 CRAN

There is no one right answer to how to manage your tests for CRAN. The following is a discussion of the various considerations - which should give the reader enough information to make an educated decision.

You can run vcr enabled tests on CRAN. CRAN is okay with files associated with tests, and so in general you can run your vcr enabled tests on CRAN just as you run them locally.

If you do run your vcr enabled tests on CRAN be aware of a few things:

  • If your vcr enabled tests require any environment variables or R options, they won’t be available on CRAN. In these cases you likely want to skip these tests.
  • If your vcr enabled tests have cassettes with sensitive information in them, you probably do not want to have those cassettes on the internet, in which case you won’t be running vcr enabled tests on CRAN either. In the case of sensitive information, you’ll want to read the gitignore cassettes section

If you are worried at all about problems with vcr enabled tests on CRAN you can use testthat::skip_on_cran() to skip specific tests.

15.4.3 CI sites: Travis, Appveyor, etc.

Similar considerations are involved when dealing with Travis, Appveyor, Circle-CI, etc.

There are a few differences. With CI services you can manage environment variables so you can run tests that require API keys, etc. In addition, if tests fail you aren’t threatened with your package being taken down.