HTTP request log shipping
VIP’s Log Shipping feature automatically saves HTTP request logs to an Amazon Web Services (AWS) S3 bucket at 5-minute intervals. The logs are then available to your team and contractors for storage, processing, or analysis. Logs are an important asset for understanding how your system is used, diagnosing connectivity issues, tuning performance, identifying usage patterns, and analyzing service interruptions.
Currently we only provide Log Shipping for your HTTP (web) request logs.
Requirements
You will need:
- An AWS S3 bucket (make a note of the bucket name and region)
- Access to create/update the AWS Bucket Policy configuration for the bucket
Configuration
- Get the name of your AWS bucket and its region.
- Enter them into the dashboard under Settings > Log Shipping.
- The dashboard will generate a configuration in JSON format that you need to paste into your AWS Bucket Policy configuration. For the desired bucket, navigate to “Permissions,” then select “Bucket Policy,” and save the JSON there.
- Once the configuration information is entered into the dashboard, a test file named vip-go-test-file.txt is uploaded to the bucket as part of the verification process. This file will always be present in a site’s configured bucket and path, alongside the date folders that contain the logs themselves.
The path used to write to the bucket is [bucket]/[app_name]/[app_environment] (e.g. my-log-bucket/my-app/production). This means you can use the same bucket for more than one app or environment, should you choose to do so.
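As a rough illustration of that layout, the sketch below builds the key prefix for one app environment and splits an object key back into its parts. The function names log_prefix and split_key are hypothetical, and the layout below the environment level (the date folders) is not specified here, so the remainder is treated as an opaque string.

```python
def log_prefix(app_name, app_environment):
    """Build the key prefix under which one app environment's objects are written."""
    return f"{app_name}/{app_environment}/"


def split_key(key):
    """Split an object key into (app_name, app_environment, remainder).

    Raises ValueError if the key has fewer than three path segments.
    The remainder is returned untouched because its layout is an unknown here.
    """
    app_name, app_environment, remainder = key.split("/", 2)
    return app_name, app_environment, remainder


if __name__ == "__main__":
    print(log_prefix("my-app", "production"))
    print(split_key("my-app/production/vip-go-test-file.txt"))
```

Filtering a shared bucket by such a prefix is one way to keep several apps or environments apart when they write to the same bucket.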
Objects are written to the specified S3 bucket with the bucket-owner-full-control canned ACL.
Restricting access by IP range
If you want to restrict access to your AWS S3 bucket via IP range, ensure your bucket access policy accounts for the dynamic IP range accessible at https://go-vip.net/ip-ranges.json. You will need to implement a system to auto-update the access policy, as the IP ranges are subject to change.
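One possible building block for such an auto-update system is sketched below. It assumes you have already extracted a list of CIDR ranges from the published file (its exact JSON layout is not described here, so parsing it is left out), and it attaches them to a copy of an Allow statement using the AWS IpAddress condition operator and the aws:SourceIp condition key. The function name with_ip_restriction and the sample statement are hypothetical placeholders, not the policy that the dashboard generates.

```python
import copy
import ipaddress


def with_ip_restriction(statement, cidrs):
    """Return a copy of a policy statement limited to the given CIDR ranges."""
    # Validate and normalize every range; malformed input raises ValueError.
    normalized = sorted({str(ipaddress.ip_network(c, strict=False)) for c in cidrs})
    if not normalized:
        # An empty list would block every request, so fail loudly instead.
        raise ValueError("refusing to build a condition from an empty IP list")
    restricted = copy.deepcopy(statement)  # leave the caller's statement untouched
    condition = restricted.setdefault("Condition", {})
    condition.setdefault("IpAddress", {})["aws:SourceIp"] = normalized
    return restricted


if __name__ == "__main__":
    # Placeholder statement for illustration only.
    example = {
        "Effect": "Allow",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::my-log-bucket/*",
    }
    print(with_ip_restriction(example, ["192.0.2.0/24", "198.51.100.7/32"]))
```

Running this on a schedule and writing the result back to the bucket policy would still need to be built around it, and should be tested carefully, since an out-of-date range list could block log delivery.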
Log contents
The log files are written as a series of gzipped JSON files. Here is a sample record:
{
"client_site_id": "000",
"remote_user": "",
"request_url": "/",
"wplogin": "-",
"timestamp": "19/May/2020:17:03:58 +0000",
"request_type": "GET",
"scheme": "https",
"http_referer": "https://example/",
"http_x_forwarded_for": "",
"true_client_ip": "",
"remote_addr": "REDACTED",
"tls_version": "TLSv1.3",
"content_type": "text/html; charset=UTF-8",
"upstream_country_code": "GB",
"sent_cache_control": "max-age=300, must-revalidate",
"timestamp_iso8601": "2020-05-19T17:03:58+00:00",
"sent_vary": "Accept-Encoding",
"sent_x_cache": "hit",
"request_time": "0.001",
"http_host": "example.com",
"http_accept_language": "en-US,en;q=0.9",
"http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
"http_version": "HTTP/2.0",
"body_bytes_sent": "8981",
"status": "200"
}
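To read records like this one from a downloaded log file, something along the following lines may work. It is a minimal sketch that assumes each gzipped file holds one JSON object per line; if your files turn out to be laid out differently (for example, as a single JSON array), the parsing step would need to change. The function name read_records is hypothetical.

```python
import gzip
import json


def read_records(path):
    """Yield one dict per log record from a gzipped log file.

    Assumption: the file is newline-delimited JSON (one record per line).
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Because the function is a generator, large files can be processed record by record without loading the whole file into memory.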
Description of fields
- body_bytes_sent: total number of bytes sent to the client
- client_site_id: an internal ID unique to this environment
- content_type: the media type of the resource, e.g. text/html; charset=UTF-8
- http_host: the domain, e.g. example.com
- http_accept_language: the contents of the Accept-Language HTTP request header
- http_user_agent: the contents of the User-Agent request header
- http_version: HTTP protocol version
- http_referer: the Referer request header, if available, containing the purported address of the web page from which a link to the currently requested page was followed
- http_x_forwarded_for: a request header used to log a client’s originating IP address
- remote_user: the username if the request was authenticated with HTTP Basic Authentication (we don’t log the password)
- request_url: the path of the resource that was fetched, not including elements that are recorded elsewhere, such as the protocol (e.g. http://, see scheme) and the domain (e.g. example.com, see http_host)
- request_time: the time taken for the request
- request_type: the HTTP method
- sent_cache_control: the contents of the Cache-Control HTTP response header
- sent_x_cache: a header from the VIP platform indicating whether the response was a cache hit, miss, or pass
- scheme: either http or https
- sent_vary: the contents of the Vary HTTP response header, e.g. Accept-Encoding; note that we do not allow free use of the Vary header
- status: the HTTP response status code, e.g. 200, 404, etc.
- timestamp: UTC date and time of the request
- timestamp_iso8601: UTC date and time of the request in ISO 8601 format
- true_client_ip: a request header commonly set by reverse proxies, including Cloudflare, to indicate the remote address of the client they are forwarding requests for (see also: http_x_forwarded_for)
- remote_addr: IP address of the client making the request (see also: true_client_ip and http_x_forwarded_for)
- tls_version: TLS version used by the client
- upstream_country_code: the country of the request; all requests are geocoded by country at the edge of the VIP CDN using the incoming IP address, e.g. “US”, “GB”, etc.
- wplogin: the login name (i.e. user_login) of the authenticated WordPress user, if any; for requests with no authenticated WordPress user, this field contains a single hyphen (-)
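Every value in the sample record is a string, so some conversion is usually needed before analysis. The sketch below shows one way to do that for a handful of fields. It assumes those fields are always present and formatted as in the sample record, which may not hold for every request, and the function name parse_record is hypothetical.

```python
from datetime import datetime


def parse_record(record):
    """Return a copy of a log record with common fields converted to Python types."""
    parsed = dict(record)
    parsed["status"] = int(record["status"])
    parsed["body_bytes_sent"] = int(record["body_bytes_sent"])
    parsed["request_time"] = float(record["request_time"])
    # e.g. "19/May/2020:17:03:58 +0000"; %b expects English month names,
    # so this relies on the default C/English locale.
    parsed["timestamp"] = datetime.strptime(
        record["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    # e.g. "2020-05-19T17:03:58+00:00"
    parsed["timestamp_iso8601"] = datetime.fromisoformat(record["timestamp_iso8601"])
    # A single hyphen means there was no authenticated WordPress user.
    parsed["wplogin"] = None if record["wplogin"] == "-" else record["wplogin"]
    return parsed
```

Both timestamp fields describe the same instant, so in practice parsing only timestamp_iso8601 is likely sufficient.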
Using your log data
The JSON-formatted log files are individually human-readable, but to make full use of your logs you will need to ingest them into another service. We have a tutorial on how to analyze the access log data from our Log Shipping tool using GoAccess.
Here are some other platforms that can help you make the most of your data, depending on your use cases:
- ELK (Elasticsearch, Logstash, Kibana) will help you filter and view your logs
- Splunk will help you search, monitor, and analyze the data from your logs
- Datadog will help you understand development issues within your logs
- Botify will help you understand SEO issues revealed by your log data
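Before committing to one of these platforms, a quick local summary can help confirm that the shipped data looks as expected. The following is a minimal sketch that counts responses by status code and by the sent_x_cache value; it takes any iterable of record dictionaries shaped like the sample record above, and the function name summarize is hypothetical.

```python
from collections import Counter


def summarize(records):
    """Count records by status code and by cache result (hit, miss, or pass)."""
    records = list(records)
    by_status = Counter(r.get("status", "") for r in records)
    by_cache = Counter(r.get("sent_x_cache", "") for r in records)
    total = len(records)
    return {
        "total": total,
        "by_status": dict(by_status),
        "by_cache": dict(by_cache),
        # Share of responses served as cache hits; 0.0 when there are no records.
        "hit_ratio": by_cache["hit"] / total if total else 0.0,
    }
```

This is only meant for spot checks on a few files; the platforms listed above are better suited to continuous analysis.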