Block unwanted requests to a site
The VIP_Request_Block utility class can be used to block unwanted origin requests to a site, such as those from a bot crawling the site or a malicious IP address. A 403 HTTP Response Status Code is returned for requests that are blocked by VIP_Request_Block.
Considerations
- Requests blocked via VIP_Request_Block are blocked at the origin, not the edge (load balancer). If a request is served from the cache at the edge, it does not reach the origin and cannot be blocked by this class.
- The VIP_Request_Block utility class is only effective for requests that require WordPress to be loaded (e.g. example.com/blog/post-name). The utility class is ineffective for direct file requests (e.g. example.com/wp-includes/blocks/index.php).
- Requests that are blocked by VIP_Request_Block are chargeable requests.
- Code to block a request should be added to vip-config/vip-config.php. This ensures that on VIP Platform environments, requests that are intended to be blocked are blocked early via wp-config.php before WordPress Core is loaded.
- When developing on a local development environment, the load order might differ from VIP Platform environments. It is therefore recommended to wrap statements in an if ( class_exists( 'VIP_Request_Block' ) ) check to avoid errors.
- Be careful not to block legitimate traffic (e.g., Googlebot, a reverse proxy, or a CDN). Always take time to confirm that an IP address, User-Agent, or HTTP header is suspicious before blocking it.
Block by IP
Use the VIP_Request_Block::ip() method to block single IP addresses. IP ranges cannot be blocked with this method. If VIP_Request_Block::ip() is called with an invalid IP address, an error will be logged to the application’s Runtime Logs.
The whois terminal command can be used to query an IP address in order to make a more educated decision about which IP addresses are suitable to be blocked.
Use caution to avoid blocking a reverse proxy IP instead of the client’s IP. Blocking a reverse proxy IP will result in legitimate traffic being blocked.
// Example VIP_Request_Block::ip( string $value );
if ( class_exists( 'VIP_Request_Block' ) ) {
    VIP_Request_Block::ip( '13.37.13.37' );
    VIP_Request_Block::ip( '13.37.13.38' );
}
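Because VIP_Request_Block::ip() does not accept ranges, one possible workaround is to pair a custom range check with VIP_Request_Block::block_and_log(), which is described under "Block by other logic". The following is a minimal sketch under stated assumptions, not a method provided by the class: the helper name my_ip_in_cidr() is hypothetical, only IPv4 is handled, a 64-bit PHP build is assumed, and the use of $_SERVER['REMOTE_ADDR'] as the client IP is an assumption that should be confirmed for the environment (it could hold a reverse proxy IP).

```php
// Hypothetical helper: returns true when an IPv4 address falls inside a CIDR range.
function my_ip_in_cidr( string $ip, string $cidr ): bool {
    if ( false === strpos( $cidr, '/' ) ) {
        return false;
    }
    list( $subnet, $bits ) = explode( '/', $cidr, 2 );
    $ip_long     = ip2long( $ip );
    $subnet_long = ip2long( $subnet );
    if ( false === $ip_long || false === $subnet_long || ! ctype_digit( $bits ) || (int) $bits > 32 ) {
        return false;
    }
    $bits = (int) $bits;
    // Build the network mask; a /0 range matches every address.
    $mask = 0 === $bits ? 0 : ( ~0 << ( 32 - $bits ) ) & 0xFFFFFFFF;
    return ( $ip_long & $mask ) === ( $subnet_long & $mask );
}

// Assumption: REMOTE_ADDR holds the client IP; confirm this before relying on it.
$my_client_ip = $_SERVER['REMOTE_ADDR'] ?? '';
if ( class_exists( 'VIP_Request_Block' ) && my_ip_in_cidr( $my_client_ip, '13.37.13.0/24' ) ) {
    VIP_Request_Block::block_and_log( $my_client_ip, 'ip-range' );
}
```

Keeping the range check in a separate function makes it possible to verify the logic against known addresses before it is used to block live traffic.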
Block by User-Agent
It can be useful to block by User-Agent in cases where a tool or bot sends suspicious requests from various IP addresses while retaining the same User-Agent. WordPress VIP provides two User-Agent blocking methods for WordPress applications.
When blocking by User-Agent, take care not to block common browser User-Agents, or User-Agents that look similar to them. It is a best practice to search recent HTTP Request Logs to validate that a full or partial match will only impact the intended requests. Blocking requests from common browser User-Agents can result in blocking real user requests.
Block by full User-Agent match
The VIP_Request_Block::ua() method can be used when the full text of an unwanted User-Agent is known. Matching is case-sensitive.
// Example VIP_Request_Block::ua( string $user_agent );
if ( class_exists( 'VIP_Request_Block' ) ) {
    VIP_Request_Block::ua( 'SuspiciousBot/1.1' );
    VIP_Request_Block::ua( 'AnotherBot/2.1' );
}
Block by partial User-Agent match
The VIP_Request_Block::ua_partial_match() method can be used to block requests from any User-Agent that contains a given substring. This method does not support regular expressions, and the string matching is case-sensitive.
// Example VIP_Request_Block::ua_partial_match( string $user_agent_substring );
// Will match and block for:
// - SuspiciousBot/1.1
// - SomewhatSuspiciousBot/1.8 - https://example.com/robot-policy
if ( class_exists( 'VIP_Request_Block' ) ) {
    VIP_Request_Block::ua_partial_match( 'SuspiciousBot/' );
}
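Before deploying a partial-match rule, it can help to preview which User-Agents it would affect, for example against values copied from recent HTTP Request Logs. The sketch below imitates the documented behavior (plain, case-sensitive substring matching with no regular expressions) using strpos(). It is an illustration only and not the implementation used by VIP_Request_Block; the function name and the last two sample User-Agents are hypothetical.

```php
// Hypothetical helper: lists which sample User-Agents contain the substring.
// Uses a plain, case-sensitive substring check to mirror the documented behavior.
function my_preview_ua_partial_match( string $needle, array $user_agents ): array {
    return array_values(
        array_filter(
            $user_agents,
            function ( $user_agent ) use ( $needle ) {
                return false !== strpos( $user_agent, $needle );
            }
        )
    );
}

$my_sample_user_agents = array(
    'SuspiciousBot/1.1',
    'SomewhatSuspiciousBot/1.8 - https://example.com/robot-policy',
    'suspiciousbot/1.1', // Differs in case, so it is not matched.
    'Mozilla/5.0 (X11; Linux x86_64)',
);

// Holds the first two sample values only.
$my_matches = my_preview_ua_partial_match( 'SuspiciousBot/', $my_sample_user_agents );
```

If the preview returns User-Agents that belong to real browsers, the substring is likely too broad and should be narrowed before it is used for blocking.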
Block by partial User-Agent match for AI crawlers
To block requests by artificial intelligence (AI) crawlers for a site’s content, a disallow rule should be set in a site’s robots.txt. An additional measure can be put in place to block AI crawlers by their User-Agents with the VIP_Request_Block::ua_partial_match() method.
This code example demonstrates a custom function that blocks requests from User-Agents of 4 well-known AI crawlers (e.g. OpenAI’s GPTBot) without blocking requests to the site’s robots.txt:
function my_block_ai_user_agents() {
    if ( ! class_exists( 'VIP_Request_Block' ) ) {
        return;
    }
    // Do not block access to robots.txt
    if ( isset( $_SERVER['REQUEST_URI'] )
        && true === str_contains( $_SERVER['REQUEST_URI'], '/robots.txt' ) ) {
        return;
    }
    // OpenAI GPTBot crawler (https://platform.openai.com/docs/gptbot)
    VIP_Request_Block::ua_partial_match( 'GPTBot/' );
    // OpenAI ChatGPT service (https://platform.openai.com/docs/plugins/bot)
    VIP_Request_Block::ua_partial_match( 'ChatGPT-User/' );
    // Common Crawl crawler (https://commoncrawl.org/faq)
    VIP_Request_Block::ua_partial_match( 'CCBot/' );
    // Google Bard / Gemini crawler (https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers)
    VIP_Request_Block::ua( 'Google-Extended' );
}
my_block_ai_user_agents();
Only 4 AI crawlers are included in this code example, though far more exist. Customers should research which AI crawler User-Agents should be disallowed for their site and include them in a modified version of this code example.
Block by HTTP header
When blocking by HTTP header:
- $header is case-insensitive, but $value is case-sensitive.
- Format the $header string value as it appears when returned by a cURL request or in a browser’s inspector tool console (e.g. Chrome’s Developer Tools).
In this code example, the $header string value is X-My-Header. Because $header is case-insensitive, this value could also be formatted as x-my-header.
// Example VIP_Request_Block::header( string $header, string $value );
if ( class_exists( 'VIP_Request_Block' ) ) {
    VIP_Request_Block::header( 'X-My-Header', 'my-header-value' );
}
Block by an X-ASN HTTP header
An X-ASN header is included in requests made to a site on the VIP Platform, and can be used by PHP. Use this header to block requests by Autonomous System Number (ASN) of an IP address. The ASN value for an IP address can be retrieved with a tool such as MXToolbox.
In this example, IP addresses belonging to the ASN 1234 will be blocked at origin:
// Example VIP_Request_Block::header( 'x-asn', $blocked_asn_number );
if ( class_exists( 'VIP_Request_Block' ) ) {
    VIP_Request_Block::header( 'x-asn', '1234' );
}
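When more than one ASN needs to be blocked, the same call can be repeated for each value. The sketch below keeps the ASNs in an array so that the list is easy to review; the variable names and the second ASN (5678) are hypothetical placeholders, not recommendations.

```php
// Hypothetical list of ASNs to block; replace with values confirmed for the site.
$my_blocked_asns = array( '1234', '5678' );

if ( class_exists( 'VIP_Request_Block' ) ) {
    foreach ( $my_blocked_asns as $my_asn ) {
        // Each ASN is passed as a string because header values are compared as text.
        VIP_Request_Block::header( 'x-asn', $my_asn );
    }
}
```

Blocking an entire ASN can affect many unrelated IP addresses, so each entry should be checked against recent logs before it is added.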
Block by other logic
If the above methods are insufficient, a request can be blocked based on custom logic with the VIP_Request_Block::block_and_log() method. This method blocks a request and logs a message to the error_log. The condition in which a given request will be blocked must be defined using custom logic.
Error output can be retrieved with Runtime Logs in the VIP Dashboard or with VIP-CLI.
The following is an example of blocking a request to a specific URL:
// Example VIP_Request_Block::block_and_log( string $value, string $criteria );
if ( isset( $_SERVER['REQUEST_URI'] ) && '/disallowed-path' === $_SERVER['REQUEST_URI'] ) {
    if ( class_exists( 'VIP_Request_Block' ) ) {
        VIP_Request_Block::block_and_log( '/disallowed-path', 'REQUEST_URI' );
    }
}
In the example above, requests made to /disallowed-path will produce the following message in the error_log:
VIP Request Block: request was blocked based on "REQUEST_URI" with value of "/disallowed-path"
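The exact comparison in the example above only matches when REQUEST_URI is precisely /disallowed-path, so a request that carries a query string or targets a child path is not blocked. The following sketch shows one way the custom logic could be extended to compare the path component only and to include child paths. The helper name my_path_is_disallowed() and the criteria label are hypothetical.

```php
// Hypothetical helper: true when the request path equals or sits under a prefix.
function my_path_is_disallowed( string $request_uri, string $prefix ): bool {
    // REQUEST_URI can include a query string, so compare the path component only.
    $path = parse_url( $request_uri, PHP_URL_PATH );
    if ( ! is_string( $path ) ) {
        return false;
    }
    $prefix = rtrim( $prefix, '/' );
    // Match the prefix itself or anything below it, but not look-alike paths.
    return $path === $prefix || 0 === strpos( $path, $prefix . '/' );
}

$my_request_uri = $_SERVER['REQUEST_URI'] ?? '';
if ( class_exists( 'VIP_Request_Block' ) && my_path_is_disallowed( $my_request_uri, '/disallowed-path' ) ) {
    VIP_Request_Block::block_and_log( $my_request_uri, 'REQUEST_URI prefix' );
}
```

Appending a slash to the prefix before the substring check keeps a path such as /disallowed-pathological from being blocked by accident.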
Logs
By default, requests that are blocked with the VIP_Request_Block class are logged in the environment’s Runtime Logs. Logging can be disabled for requests blocked by VIP_Request_Block methods with the disable_logging() method.
Call disable_logging() just before calling the VIP_Request_Block method(s) that should not be logged. To re-enable logging for subsequent methods, call enable_logging() and then call the methods that should have logging enabled.
In this code example, logging is disabled for blocked requests from the IP address 13.37.13.37 and enabled for requests from the User-Agent SuspiciousBot/1.1:
// logging for blocked requests is enabled by default
if ( class_exists( 'VIP_Request_Block' ) ) {
    // disable logging
    VIP_Request_Block::disable_logging();
    VIP_Request_Block::ip( '13.37.13.37' );
    // enable logging
    VIP_Request_Block::enable_logging();
    VIP_Request_Block::ua( 'SuspiciousBot/1.1' );
}
Last updated: May 22, 2024