robots.txt
WordPress environments that are accessible at a convenience domain have a hard-coded /robots.txt output that returns:
User-agent: *
Disallow: /
Requests to any URL on the environment will also return an x-robots-tag: noindex, nofollow header. These settings are intended to prevent search engines from indexing content hosted on non-production sites or unlaunched production sites.
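To confirm this behavior, the response headers can be inspected with any HTTP client. A quick check with curl (the URL below is a placeholder; substitute a URL on the environment):

curl -sI https://example.com/ | grep -i x-robots-tag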
Limitations
To modify the output of /robots.txt, the environment (production or non-production) must be accessible at a custom domain that is set as the primary domain. Replace the convenience domain with a custom primary domain by completing the steps to launch a WordPress single site, or by launching the main site (ID 1) of a WordPress multisite.
Modify the robots.txt file
To modify the /robots.txt file, hook into the do_robotstxt action, or filter the output by hooking into the robots_txt filter. In this code example, a specific directory is disallowed for all user agents:
function my_disallow_directory() {
	// Output additional rules that exclude a specific directory from crawling.
	echo 'User-agent: *' . PHP_EOL;
	echo 'Disallow: /path/to/your/directory/' . PHP_EOL;
}
add_action( 'do_robotstxt', 'my_disallow_directory' );
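Alternatively, the robots_txt filter can be used to append rules to the output generated by WordPress. The filter receives the generated output and the value of the site's blog_public setting. A minimal sketch, using the same placeholder directory path as above; the function name is illustrative:

function my_disallow_directory_filter( $output, $public ) {
	// Append an additional Disallow rule to the generated robots.txt output.
	$output .= 'Disallow: /path/to/your/directory/' . PHP_EOL;
	return $output;
}
add_filter( 'robots_txt', 'my_disallow_directory_filter', 10, 2 );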
Caching
The /robots.txt file is cached for long periods of time. To force the cache to clear after any changes are made to the file, go to Settings > Reading within WP-Admin and toggle the Search engine visibility setting, saving the changes each time the setting is changed.

The page cache for the /robots.txt file can also be flushed using the wp vip cache purge-url WP-CLI command, which is available on WordPress environments.
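For example, to purge the cached /robots.txt (example.com is a placeholder for the environment's primary domain):

wp vip cache purge-url https://example.com/robots.txt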
Last updated: August 03, 2023