2 HTTP in R 101
2.1 What is HTTP?
HTTP stands for HyperText Transfer Protocol, but you were probably not just looking for the expansion of the abbreviation. HTTP is a way for you to exchange information with a remote server. In your package, if information is going back and forth between the R session and the internet, you are using some sort of HTTP tooling. Your package makes requests and receives responses.
2.1.1 HTTP requests
The HTTP request is what your package makes.
It has a method (are you fetching information via GET? are you sending information via POST?), different parts of a URL (domain, endpoint, query string), and headers (containing e.g. your secret identifiers).
It can contain a body, for instance you might be sending data as JSON.
In that case, one of the headers will describe the content.
How do you know what request to make from your package? Hopefully you are interacting with a well-documented web resource that explains which methods are associated with which endpoints.
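To make the parts of a request concrete, here is a minimal sketch that takes a request apart without sending anything over the network. The URL `https://api.example.org/v1/items` and the body fields are made up for illustration; `httr::parse_url()`, `httr::add_headers()` and `jsonlite::toJSON()` are used only to show what each piece looks like.

```r
# A hypothetical URL: api.example.org is a placeholder, not a real API.
url <- "https://api.example.org/v1/items?page=2&per_page=10"

# Split the URL into its components (nothing is sent).
parts <- httr::parse_url(url)
parts$hostname  # the domain
parts$path      # the endpoint
parts$query     # the query string, as a named list

# Headers are name-value pairs; this one describes a JSON body.
headers <- httr::add_headers(`Content-Type` = "application/json")

# A body serialized as JSON (hypothetical fields).
body <- jsonlite::toJSON(list(title = "hello", done = FALSE), auto_unbox = TRUE)
body
```

In a real package, the documentation of the web API would tell you which of these pieces a given endpoint expects.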
2.1.2 HTTP responses
The HTTP response is what the remote server provides, and what your package parses. A response has a status code indicating whether the request succeeded, response headers, and (optionally) a response body.
Hopefully the documentation of the web API you are working with shows good examples of responses. In any case you’ll find yourself experimenting with different requests to see what the response “looks like.”
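As a small illustration of status codes, the sketch below looks up what a code means without contacting any server. It relies on `httr::http_status()` accepting a bare number; the helper `is_success()` is a hypothetical name introduced here, not part of httr.

```r
# http_status() accepts a bare status code as well as a response object,
# so this runs without making a request.
ok <- httr::http_status(200)
not_found <- httr::http_status(404)

ok$message
not_found$reason

# Hypothetical helper: status codes from 200 to 299 indicate success.
is_success <- function(code) {
  code >= 200 && code < 300
}
is_success(200)
is_success(404)
```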
2.1.3 More resources about HTTP
How do you get started with interacting with HTTP in R?
2.1.3.1 General HTTP resources
- Mozilla Developer Network docs about HTTP (recommended in the zine mentioned below)
- (not free) Julia Evans’ Zine “HTTP: Learn your browser’s language!”
- The docs of the web API you are aiming to work with, and a search engine to understand the words that are new.
2.2 HTTP requests in R: what package?
In R, to interact with web resources, it is recommended to use curl, or one of its higher-level interfaces: httr (pronounced hitter or h-t-t-r) or crul.
Do not use RCurl, because it is not actively maintained!
When writing a package interacting with web resources, you will probably use either httr or crul.
httr is the more popular and the older of the two, and supports OAuth. The httr docs feature a vignette called Best practices for API packages.
crul does not support OAuth but it uses an object-oriented interface, which you might like. crul has a set of clients, or ways to perform requests, that might be handy. crul also has a vignette about API package best practices.
Below we will try to programmatically access the status of GitHub, the open-source platform provided by the company of the same name. We will access the same information with httr and crul. If you opt for the low-level curl, feel free to contribute an example.
github_url <- "https://kctbh9vrtdwd.statuspage.io/api/v2/status.json"
The URL above leaves no doubt as to what format the data is provided in: JSON!
Let’s first use httr.
response <- httr::GET(github_url)
# Check the response status
httr::http_status(response)
## $category
## [1] "Success"
##
## $reason
## [1] "OK"
##
## $message
## [1] "Success: (200) OK"
# Or in a package you'd just write
httr::stop_for_status(response)
# Parse the content
httr::content(response)
## $page
## $page$id
## [1] "kctbh9vrtdwd"
##
## $page$name
## [1] "GitHub"
##
## $page$url
## [1] "https://www.githubstatus.com"
##
## $page$time_zone
## [1] "Etc/UTC"
##
## $page$updated_at
## [1] "2020-11-23T08:18:02.507Z"
##
##
## $status
## $status$indicator
## [1] "none"
##
## $status$description
## [1] "All Systems Operational"
# In case you wonder, the format was obtained from a header
httr::headers(response)$`content-type`
## [1] "application/json; charset=utf-8"
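The content-type header above carries two pieces of information, the media type and the character encoding. As a rough sketch, `httr::parse_media()` can split such a value; the header string is copied from the output above so that no request is needed, and the `is_json` check is a hypothetical example of how a package might use the result.

```r
# The header value shown above, copied as a string so no request is needed.
content_type <- "application/json; charset=utf-8"

media <- httr::parse_media(content_type)
media$complete        # the media type
media$params$charset  # the declared character encoding

# A hypothetical check a package could run before parsing a body as JSON.
is_json <- identical(media$complete, "application/json")
is_json
```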
Now, the same with crul.
# Create a client and get a response
client <- crul::HttpClient$new(github_url)
response <- client$get()
# Check the response status
response$status_http()
## <Status code: 200>
## Message: OK
## Explanation: Request fulfilled, document follows
# Or in a package you'd just write
response$raise_for_status()
# Parse the content
response$parse()
## No encoding supplied: defaulting to UTF-8.
## [1] "{\"page\":{\"id\":\"kctbh9vrtdwd\",\"name\":\"GitHub\",\"url\":\"https://www.githubstatus.com\",\"time_zone\":\"Etc/UTC\",\"updated_at\":\"2020-11-23T08:18:02.507Z\"},\"status\":{\"indicator\":\"none\",\"description\":\"All Systems Operational\"}}"
jsonlite::fromJSON(response$parse())
## No encoding supplied: defaulting to UTF-8.
## $page
## $page$id
## [1] "kctbh9vrtdwd"
##
## $page$name
## [1] "GitHub"
##
## $page$url
## [1] "https://www.githubstatus.com"
##
## $page$time_zone
## [1] "Etc/UTC"
##
## $page$updated_at
## [1] "2020-11-23T08:18:02.507Z"
##
##
## $status
## $status$indicator
## [1] "none"
##
## $status$description
## [1] "All Systems Operational"
Hopefully these very short snippets give you an idea of what syntax to expect when choosing one of those packages.
Note that the choice of a package will constrain the HTTP testing tools you can use. However, the general ideas will remain the same. You could switch your package backend from, say, crul to httr without changing your tests, as long as your tests do not depend too much on the specifics of the backend's internals.
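One way to picture such a backend switch is sketched below, under the assumption that the package keeps fetching and parsing separate. All function names (`parse_status()`, `fetch_status_httr()`, `fetch_status_crul()`, `status_indicator()`) are hypothetical; the sample JSON is a shortened copy of the response body shown earlier. Only the two fetch functions know which HTTP package is in use, and they are defined here but not called, so the snippet runs offline.

```r
# Backend-independent step: extract the indicator from the JSON text.
parse_status <- function(json_text) {
  jsonlite::fromJSON(json_text)$status$indicator
}

# Backend 1: httr. Returns the response body as text.
fetch_status_httr <- function(url) {
  response <- httr::GET(url)
  httr::stop_for_status(response)
  httr::content(response, as = "text", encoding = "UTF-8")
}

# Backend 2: crul. Passing the encoding to parse() avoids the
# "No encoding supplied" message.
fetch_status_crul <- function(url) {
  client <- crul::HttpClient$new(url)
  response <- client$get()
  response$raise_for_status()
  response$parse("UTF-8")
}

# The user-facing function does not care which backend fetched the text.
status_indicator <- function(url, fetch = fetch_status_httr) {
  parse_status(fetch(url))
}

# Offline check with a shortened copy of the response body shown earlier.
sample_json <- '{"page":{"id":"kctbh9vrtdwd","name":"GitHub"},"status":{"indicator":"none","description":"All Systems Operational"}}'
parse_status(sample_json)
```

A test that only checks the value returned by `status_indicator()` would stay the same whether `fetch` is the httr or the crul version.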