HTTP Request Log Shipping on VIP Go

VIP’s Log Shipping feature automatically saves your HTTP request logs to an Amazon Web Services (AWS) S3 bucket at 5-minute intervals. The logs are then available to your team and contractors for storage, processing, or analysis. Logs are an important asset for understanding how your system is used, diagnosing connectivity issues, tuning performance, identifying usage patterns, and analysing service interruptions.

Currently, Log Shipping is only available for HTTP (web) request logs.

Requirements

You will need:

  • An AWS S3 bucket (make a note of the bucket name and region)
  • Permission to create or update the bucket policy for that bucket

Configuration

  1. Get the name of your AWS bucket and its region.
  2. Enter them into the dashboard under Settings > Log Shipping.
  3. The dashboard will generate a configuration in JSON format that you need to paste into your AWS bucket policy. For the desired bucket, navigate to “Permissions,” then select “Bucket Policy,” and save the JSON there.
  4. Once the configuration information is entered into the dashboard, a test file named vip-go-test-file.txt is uploaded to the bucket as part of the verification process. This file will always be present in a site’s configured bucket and path, alongside the date folders that contain the logs themselves.

The path used to write to the bucket is [bucket]/[app_name]/[app_environment], e.g. my-log-bucket/my-app/production. This means that you can use the same bucket for more than one app or environment, should you choose to do so.
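As a minimal sketch of the path layout described above, the following Python helpers build the full path and the matching key prefix inside the bucket. The function names are hypothetical, and treating `[app_name]/[app_environment]/` as the object key prefix is an assumption drawn from the path format rather than a documented guarantee.

```python
def log_path(bucket: str, app_name: str, app_environment: str) -> str:
    """Return the [bucket]/[app_name]/[app_environment] path used for log shipping."""
    parts = [bucket, app_name, app_environment]
    if any(not part or "/" in part for part in parts):
        raise ValueError("each path component must be non-empty and contain no '/'")
    return "/".join(parts)


def key_prefix(app_name: str, app_environment: str) -> str:
    """Assumed key prefix inside the bucket for one app environment."""
    return f"{app_name}/{app_environment}/"


print(log_path("my-log-bucket", "my-app", "production"))
# prints: my-log-bucket/my-app/production
```

Because the prefix differs per app and environment, one bucket can hold logs for several environments without their objects colliding.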

Objects are written to the specified S3 bucket with the bucket-owner-full-control canned ACL.

Restricting access by IP range

If you want to restrict access to your AWS S3 bucket by IP range, ensure your bucket access policy accounts for the dynamic IP ranges published at https://go-vip.net/ip-ranges.json. Because these IP ranges are subject to change, you will need to implement a system that automatically updates the access policy.
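One possible building block for such an auto-update system is sketched below in Python. It assumes you have already downloaded ip-ranges.json and extracted a plain list of CIDR strings from it; the layout of that file is not assumed here, and the function names are hypothetical. The `IpAddress` operator and `aws:SourceIp` key are standard IAM policy condition elements, but where the condition belongs depends on the policy the dashboard generated for you.

```python
import ipaddress


def _normalise(cidrs):
    """Validate CIDR strings and return them as a set of canonical networks."""
    # ip_network() raises ValueError on malformed input, so bad data
    # never reaches the bucket policy.
    return {str(ipaddress.ip_network(cidr, strict=False)) for cidr in cidrs}


def source_ip_condition(cidrs):
    """Build a policy Condition block that matches only the given CIDR ranges."""
    return {"IpAddress": {"aws:SourceIp": sorted(_normalise(cidrs))}}


def ranges_changed(previous, current):
    """Return True when the published ranges differ from those last applied."""
    return _normalise(previous) != _normalise(current)
```

A scheduled job could call `ranges_changed` with the last applied list and the freshly fetched one, and rewrite the bucket policy only when it returns True.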

Log contents

The logs are written as a series of gzipped JSON files. Here is a sample record:

{
  "client_site_id": "000",
  "remote_user": "",
  "request_url": "/",
  "wplogin": "-",
  "timestamp": "19/May/2020:17:03:58 +0000",
  "request_type": "GET",
  "scheme": "https",
  "http_referer": "https://example/",
  "http_x_forwarded_for": "",
  "true_client_ip": "",
  "remote_addr": "REDACTED",
  "tls_version": "TLSv1.3",
  "content_type": "text/html; charset=UTF-8",
  "upstream_country_code": "GB",
  "sent_cache_control": "max-age=300, must-revalidate",
  "timestamp_iso8601": "2020-05-19T17:03:58+00:00",
  "sent_vary": "Accept-Encoding",
  "sent_x_cache": "hit",
  "request_time": "0.001",
  "http_host": "example.com",
  "http_accept_language": "en-US,en;q=0.9",
  "http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
  "http_version": "HTTP/2.0",
  "body_bytes_sent": "8981",
  "status": "200"
}
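A minimal Python sketch for reading a downloaded log file follows. It assumes each gzipped file holds one JSON record per line, which this page does not state explicitly, so check a real file before relying on it. The function name `iter_records` is hypothetical.

```python
import gzip
import json


def iter_records(path):
    """Yield one dict per log record from a gzipped log file.

    Assumes one JSON object per line; blank lines are skipped.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Yielding records one at a time keeps memory use flat, which matters when many 5-minute files are processed in sequence.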

Description of fields

body_bytes_sent—total number of bytes sent to the client

client_site_id—an internal ID unique to this environment

content_type—the media type of the resource, e.g. text/html; charset=UTF-8

http_host—the domain, e.g. example.com

http_accept_language—the contents of the Accept-Language request HTTP header

http_user_agent—the contents of the User-Agent request header

http_version—HTTP protocol version

http_referer—the Referer request header, if available, containing the purported address of the previous web page from which a link to the currently requested page was followed

http_x_forwarded_for—the contents of the X-Forwarded-For request header, which is used to record a client’s originating IP address

remote_user—the username if the request was authenticated with HTTP Basic Authentication (we don’t log the password)

request_url—the path of the resource that was fetched, excluding elements recorded in other fields, e.g. the protocol (http:// or https://, see scheme) and the domain (e.g. example.com, see http_host)

request_time—the time taken for the request

request_type—the HTTP method

sent_cache_control—the contents of the Cache-Control HTTP response header

sent_x_cache—a header from the VIP platform indicating whether the response was from a cache hit, miss, or pass

scheme—either http or https

sent_vary—the contents of the Vary HTTP response header, e.g. Accept-Encoding; note that we do not allow free use of the Vary header

status—the HTTP response status code, e.g. 200, 404, etc.

timestamp—UTC date and time of request

timestamp_iso8601—UTC date and time of request in ISO format

true_client_ip—a request header commonly set by reverse proxies, including Cloudflare, to indicate the remote address of the client they are forwarding requests for; there is no formally agreed specification for this header (see also http_x_forwarded_for and the VIP Reverse Proxies documentation)

remote_addr—IP address of the client making the request (see also true_client_ip and http_x_forwarded_for)

tls_version—TLS version used by the client

upstream_country_code—the country of the request, e.g. “US”, “GB”, etc.; all requests are geocoded by country at the edge of the VIP CDN using the incoming IP address

wplogin—the login name (i.e. user_login) of the authenticated WordPress user, if any; for requests with no authenticated WordPress user, this field contains -
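Every value in the sample record is a JSON string, so numeric and date fields usually need converting before analysis. The Python sketch below shows one way to do that for a handful of fields; the function name `parse_record` and the choice of fields are illustrative assumptions, not part of the log format.

```python
from datetime import datetime


def parse_record(record):
    """Return a copy of `record` with selected string fields converted to native types."""
    parsed = dict(record)
    parsed["status"] = int(record["status"])
    parsed["body_bytes_sent"] = int(record["body_bytes_sent"])
    parsed["request_time"] = float(record["request_time"])
    # Both timestamp fields describe the same instant in different formats.
    parsed["timestamp"] = datetime.strptime(record["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    parsed["timestamp_iso8601"] = datetime.fromisoformat(record["timestamp_iso8601"])
    # wplogin is "-" when no WordPress user is authenticated.
    parsed["wplogin"] = None if record["wplogin"] == "-" else record["wplogin"]
    return parsed
```

Note that `%b` matches month abbreviations in the current locale, so this sketch assumes an English or C locale.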

Using your log data

The JSON-formatted log files can be read individually by humans, but to make full use of your logs you will need to ingest them into another service. Here are some examples of platforms that will help you make the most of your data, depending on your use case:

ELK (Elasticsearch, Logstash, Kibana) will help you filter and view your logs

Splunk will help you search, monitor, and analyse the data from your logs

Datadog will help you understand development issues within your logs

Botify will help you understand SEO issues revealed by your log data
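Before setting up one of these platforms, a quick local summary can confirm that the shipped logs contain what you expect. The Python sketch below counts responses by status and by sent_x_cache value across an iterable of record dicts; the function name `summarise` and the shape of its return value are hypothetical.

```python
from collections import Counter


def summarise(records):
    """Count responses by status code and by sent_x_cache value."""
    statuses = Counter()
    cache = Counter()
    for record in records:
        statuses[record.get("status", "unknown")] += 1
        cache[record.get("sent_x_cache", "unknown")] += 1
    total = sum(cache.values())
    # Share of responses served from cache; 0.0 when there are no records.
    hit_ratio = cache["hit"] / total if total else 0.0
    return {"statuses": dict(statuses), "cache": dict(cache), "hit_ratio": hit_ratio}
```

The status values stay as strings here because that is how they appear in the log records.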

Last updated: October 08, 2020