Block unwanted requests to a site

The VIP_Request_Block utility class can be used to block unwanted origin requests to a site, such as those from a bot crawling the site or from a malicious IP address. Requests that are blocked by VIP_Request_Block receive a 403 HTTP response status code.

Considerations

  • Requests blocked via VIP_Request_Block are blocked at the origin, not the edge (load balancer). If a request is served from the cache at the edge, it does not reach the origin and cannot be blocked by this class.
  • The VIP_Request_Block utility class is only effective for requests that require WordPress to be loaded (e.g. example.com/blog/post-name). The utility class is ineffective for direct file requests (e.g. example.com/wp-includes/blocks/index.php).
  • Requests that are blocked by VIP_Request_Block are chargeable requests.
  • Code to block a request should be added to vip-config/vip-config.php so that unwanted requests are blocked as early as possible.
  • Be careful not to block legitimate traffic (e.g., Googlebot, a reverse proxy, or a CDN). Always take time to confirm that an IP address, User-Agent, or HTTP header is suspicious before blocking it.
  • The VIP_Request_Block class is loaded very early via wp-config.php on VIP Platform environments, before WordPress Core is loaded. When developing on a local development environment the load order might differ from the VIP Platform. It is therefore recommended to wrap statements in an if ( class_exists( 'VIP_Request_Block' ) ) check to avoid errors.

Block by IP

Use the VIP_Request_Block::ip() method to block single IP addresses. IP ranges cannot be blocked with this method. If VIP_Request_Block::ip() is called with an invalid IP address, an error will be logged to the application’s Runtime Logs.

The whois terminal command can be used to query an IP address in order to make a more educated decision about which IP addresses are suitable to be blocked.

Use caution to avoid blocking a reverse proxy IP instead of the client’s IP. Blocking a reverse proxy IP will result in legitimate traffic being blocked.

vip-config/vip-config.php
// Example VIP_Request_Block::ip( string $value );
if ( class_exists( 'VIP_Request_Block' ) ) {
   VIP_Request_Block::ip( '13.37.13.37' );
   VIP_Request_Block::ip( '13.37.13.38' );
}
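
When several addresses need to be blocked, they can be kept in one list and checked before being passed to VIP_Request_Block::ip(). The following is a minimal sketch, not an official pattern: my_filter_valid_ips() and $my_blocked_ips are hypothetical names, and the addresses are illustrative. It relies on PHP's built-in filter_var() with FILTER_VALIDATE_IP, which rejects ranges and CIDR notation, so entries that the method cannot block are dropped before they produce a logged error.

```php
// Hypothetical helper: keeps only syntactically valid single IP addresses.
function my_filter_valid_ips( array $candidates ): array {
	$valid = array();
	foreach ( $candidates as $candidate ) {
		// filter_var() returns false for ranges, CIDR notation, and malformed values.
		if ( false !== filter_var( $candidate, FILTER_VALIDATE_IP ) ) {
			$valid[] = $candidate;
		}
	}
	return $valid;
}

// Illustrative addresses only. The CIDR entry is dropped by the filter.
$my_blocked_ips = my_filter_valid_ips(
	array( '13.37.13.37', '13.37.13.38', '13.37.13.0/24' )
);

if ( class_exists( 'VIP_Request_Block' ) ) {
	foreach ( $my_blocked_ips as $my_ip ) {
		VIP_Request_Block::ip( $my_ip );
	}
}
```

Only plain PHP functions are used here because WordPress Core is not yet loaded when vip-config/vip-config.php runs.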

Block by User-Agent

It can be useful to block by User-Agent in cases where a tool or bot sends suspicious requests from various IP addresses while retaining the same User-Agent. WordPress VIP provides two User-Agent blocking methods for WordPress applications.

When blocking by User-Agent, take care not to block common browser User-Agents, or User-Agents that look similar to them, because doing so can block requests from real users. It is a best practice to search recent HTTP Request Logs to validate that a full or partial match will only affect the intended requests.

Block by full User-Agent match

The VIP_Request_Block::ua() method can be used when the full text of an unwanted User-Agent is known. The matching is case-sensitive.

vip-config/vip-config.php
// Example VIP_Request_Block::ua( string $user_agent );
if ( class_exists( 'VIP_Request_Block' ) ) {
   VIP_Request_Block::ua( 'SuspiciousBot/1.1' );
   VIP_Request_Block::ua( 'AnotherBot/2.1' );
}

Block by partial User-Agent match

The VIP_Request_Block::ua_partial_match() method can be used to block requests from any User-Agent that contains a given substring. This method does not support regular expressions, and the string matching is case-sensitive.

vip-config/vip-config.php
// Example VIP_Request_Block::ua_partial_match( string $user_agent_substring );
// Will match and block for:
// 	- SuspiciousBot/1.1
// 	- SomewhatSuspiciousBot/1.8 - https://example.com/robot-policy
if ( class_exists( 'VIP_Request_Block' ) ) {
   VIP_Request_Block::ua_partial_match( 'SuspiciousBot/' );
}
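
Before deploying a partial match, it can help to preview which User-Agents a substring would affect. The sketch below is meant to be run locally, not added to vip-config/vip-config.php. It assumes the matching behaves like a plain case-sensitive substring test, as described above; my_ua_would_match() is a hypothetical helper and not part of VIP_Request_Block, and the sample User-Agents are illustrative (in practice they would come from recent HTTP Request Logs).

```php
// Hypothetical helper: approximates a case-sensitive substring match.
function my_ua_would_match( string $user_agent, string $substring ): bool {
	// strpos() is case-sensitive and does not interpret regular expressions.
	return false !== strpos( $user_agent, $substring );
}

// Illustrative sample values.
$my_sample_user_agents = array(
	'SuspiciousBot/1.1',
	'SomewhatSuspiciousBot/1.8 - https://example.com/robot-policy',
	'suspiciousbot/1.1',
	'Mozilla/5.0 (X11; Linux x86_64)',
);

foreach ( $my_sample_user_agents as $my_ua ) {
	$my_result = my_ua_would_match( $my_ua, 'SuspiciousBot/' ) ? 'blocked' : 'allowed';
	echo $my_result . ': ' . $my_ua . "\n";
}
```

With these samples, only the first two User-Agents are reported as blocked. The lowercase variant is not matched because the comparison is case-sensitive.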

Block by partial User-Agent match for AI crawlers

To block requests by artificial intelligence (AI) crawlers for a site’s content, a disallow rule should be set in a site’s robots.txt. An additional measure can be put in place to block AI crawlers by their User-Agents with the VIP_Request_Block::ua_partial_match() method.

This code example demonstrates a custom function that blocks requests from the User-Agents of four well-known AI crawlers (e.g. OpenAI’s GPTBot) without blocking requests to the site’s robots.txt:

function my_block_ai_user_agents() {
	if ( ! class_exists( 'VIP_Request_Block' ) ) {
		return;
	}

	// Do not block access to robots.txt
	if ( isset( $_SERVER['REQUEST_URI'] )
		&& true === str_contains( $_SERVER['REQUEST_URI'], '/robots.txt' ) ) {
		return;
	}

	// OpenAI GPTBot crawler (https://platform.openai.com/docs/gptbot)
	VIP_Request_Block::ua_partial_match( 'GPTBot/' );

	// OpenAI ChatGPT service (https://platform.openai.com/docs/plugins/bot)
	VIP_Request_Block::ua_partial_match( 'ChatGPT-User/' );

	// Common Crawl crawler (https://commoncrawl.org/faq)
	VIP_Request_Block::ua_partial_match( 'CCBot/' );

	// Google Bard / Gemini crawler (https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers)
	VIP_Request_Block::ua( 'Google-Extended' );
}

my_block_ai_user_agents();

Only four AI crawlers are included in this code example, though far more exist. Customers should research which AI crawler User-Agents should be disallowed for their site and include them in a modified version of this code example.

Block by HTTP header

When blocking by HTTP header:

  • $header is case-insensitive, but $value is case-sensitive.
  • Format the $header string value as it appears when returned by a cURL request or in a browser’s inspector tool console (e.g. Chrome’s Developer Tools).

In this code example, the $header string value is X-My-Header. Because $header is case-insensitive, this value could also be formatted as x-my-header.

vip-config/vip-config.php
// Example VIP_Request_Block::header( string $header, string $value );
if ( class_exists( 'VIP_Request_Block' ) ) {
   VIP_Request_Block::header( 'X-My-Header', 'my-header-value' );
}
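
To confirm the exact, case-sensitive value of a header before blocking on it, the header can be inspected from PHP. PHP exposes request headers in $_SERVER under keys that are prefixed with HTTP_, uppercased, and have hyphens replaced by underscores. The sketch below shows that mapping; my_header_to_server_key() is a hypothetical helper, and whether VIP_Request_Block performs the same conversion internally is an assumption.

```php
// Hypothetical helper: maps a header name to the $_SERVER key PHP uses for it.
function my_header_to_server_key( string $header ): string {
	return 'HTTP_' . strtoupper( str_replace( '-', '_', $header ) );
}

// Both spellings resolve to the same key: HTTP_X_MY_HEADER.
$my_key = my_header_to_server_key( 'x-my-header' );

// Temporary debugging aid: log the value exactly as PHP received it.
$my_value = isset( $_SERVER[ $my_key ] ) ? $_SERVER[ $my_key ] : '(not set)';
error_log( 'X-My-Header value: ' . $my_value );
```

Logging like this is intended for short-term investigation only and should be removed once the value to block has been confirmed.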

Block by an X-ASN HTTP header

An X-ASN header is included in requests made to a site on the VIP Platform and is available to PHP code. Use this header to block requests by the Autonomous System Number (ASN) of the requesting IP address. The ASN value for an IP address can be retrieved with a tool such as MXToolbox.

In this example, IP addresses belonging to the ASN 1234 will be blocked at origin:

// Example VIP_Request_Block::header( 'x-asn', $blocked_asn_number );
if ( class_exists( 'VIP_Request_Block' ) ) {
   VIP_Request_Block::header( 'x-asn', '1234' );
}
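
If more than one ASN needs to be blocked, the same call can be repeated for each value. This is a minimal sketch that assumes repeated calls to VIP_Request_Block::header() behave like the repeated ip() and ua() calls shown earlier; $my_blocked_asns is a hypothetical variable and the ASN values are illustrative.

```php
// Illustrative ASN values only; replace with ASNs confirmed to be unwanted.
$my_blocked_asns = array( '1234', '5678' );

if ( class_exists( 'VIP_Request_Block' ) ) {
	foreach ( $my_blocked_asns as $my_asn ) {
		// header() expects string arguments, so the ASNs are kept as strings.
		VIP_Request_Block::header( 'x-asn', $my_asn );
	}
}
```

Blocking an entire ASN affects every IP address that belongs to it, so the considerations about legitimate traffic apply even more strongly here.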

Block by other logic

If the above methods are insufficient, a request can be blocked based on custom logic with the VIP_Request_Block::block_and_log() method. This method blocks the request and logs a message to the error_log. The method does not evaluate any condition itself, so the condition under which a request is blocked must be defined in custom code around the call.

Error output can be retrieved with VIP-CLI Runtime Logs.

The following is an example of blocking a request to a specific URL:

vip-config/vip-config.php
// Example VIP_Request_Block::block_and_log( string $value, string $criteria );
if ( isset( $_SERVER['REQUEST_URI'] ) && '/disallowed-path' === $_SERVER['REQUEST_URI'] ) {
   if ( class_exists( 'VIP_Request_Block' ) ) {
     VIP_Request_Block::block_and_log( '/disallowed-path', 'REQUEST_URI' );
   }
}

In the example above, requests made to /disallowed-path will produce the following message in the error_log:

VIP Request Block: request was blocked based on "REQUEST_URI" with value of "/disallowed-path"
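
The strict comparison in the example above only matches when REQUEST_URI is exactly /disallowed-path, so a request such as /disallowed-path?utm_source=x would not be blocked. The following variation is a sketch under that observation: it compares only the path component by using PHP's built-in parse_url(). my_request_path() is a hypothetical helper. The WordPress function wp_parse_url() is avoided because WordPress Core is not yet loaded at this point.

```php
// Hypothetical helper: returns the path component of a request URI, or '' if none.
function my_request_path( string $request_uri ): string {
	// With PHP_URL_PATH, parse_url() returns only the path, without the query string.
	$path = parse_url( $request_uri, PHP_URL_PATH );
	return is_string( $path ) ? $path : '';
}

if ( isset( $_SERVER['REQUEST_URI'] )
	&& '/disallowed-path' === my_request_path( $_SERVER['REQUEST_URI'] ) ) {
	if ( class_exists( 'VIP_Request_Block' ) ) {
		VIP_Request_Block::block_and_log( '/disallowed-path', 'REQUEST_URI' );
	}
}
```

A trailing slash (/disallowed-path/) is still treated as a different path in this sketch. If it should also be blocked, the condition needs to account for it explicitly.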

Last updated: March 01, 2024

Relevant to

  • WordPress