Write custom WP-CLI commands at scale

On the VIP Platform, WP-CLI commands run in a container that is separate from a site’s web application but shares its Memcached and database containers. Custom WP-CLI commands therefore need safeguards that keep them from inadvertently degrading a site’s performance.

Avoid memory exhaustion

The WPCOM_VIP_CLI_Command class, included in VIP MU-plugins, provides helper functions for custom WP-CLI commands that need to run over large datasets. The helper functions can be used by extending this class.

If a custom WP-CLI command processes a large amount of data on a launched site, use these helper functions so that the command does not exhaust memory or overload the database.

  • WPCOM_VIP_CLI_Command::vip_inmemory_cleanup() resets the in-memory local WordPress object cache, from the global $wp_object_cache, without affecting Memcached, and resets the in-memory database query log.
    • Consider using this to clear memory after having processed 50-100 posts to avoid interruptions, especially when using get_posts() or WP_Query.
  • WPCOM_VIP_CLI_Command::start_bulk_operation() defers term counting, so that expensive term counting is not triggered on each individual update to the dataset.
    • This is important when the command is issuing many writes or changes to the database.
  • WPCOM_VIP_CLI_Command::end_bulk_operation() restores and triggers term counting. This should be used as a companion to start_bulk_operation() where the pair of functions bookends the command.
  • Use sleep() in between updating records or batches of records to help with loads associated with cache re-validation and data replication.
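
The way these helpers fit together can be sketched as a small wrapper. This is a minimal illustration, not part of VIP MU-plugins: run_bulk() and its parameters are hypothetical names. Plain callables stand in for start_bulk_operation(), end_bulk_operation(), and vip_inmemory_cleanup() so that the sketch runs without WordPress.

```php
<?php
/**
 * Hypothetical helper: process a list of IDs in chunks between a start/end pair.
 *
 * In a real command, $start, $end, and $cleanup would wrap
 * $this->start_bulk_operation(), $this->end_bulk_operation(), and
 * $this->vip_inmemory_cleanup(). Callables are an assumption made
 * here to keep the example self-contained.
 */
function run_bulk( array $ids, callable $process, callable $start, callable $end, callable $cleanup, int $chunk_size = 100, int $pause_seconds = 0 ): int {
	$processed = 0;

	$start();
	try {
		foreach ( array_chunk( $ids, $chunk_size ) as $chunk ) {
			foreach ( $chunk as $id ) {
				$process( $id );
				$processed++;
			}

			// Free memory after each chunk, then optionally pause so that
			// cache re-validation and replication can catch up.
			$cleanup();
			if ( $pause_seconds > 0 ) {
				sleep( $pause_seconds );
			}
		}
	} finally {
		// Runs even if $process throws an exception, so the end step is not skipped.
		$end();
	}

	return $processed;
}
```

Inside a class that extends WPCOM_VIP_CLI_Command, the callables could be passed as array( $this, 'start_bulk_operation' ) and so on. Note that finally does not run when the script calls exit(), so it only guards against exceptions thrown during processing.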

Paginate results and process in batches

A query with no LIMIT can time out and fail, especially if it takes longer than 30 seconds to run. Instead, use smaller queries and page through the results. For example:

class Test_CLI_Command extends WPCOM_VIP_CLI_Command {
	/**
	 * Publishes all pending posts once they have had their metakeys updated.
	 *
	 * Takes a metakey (required) and post category (optional).
	 *
	 * @subcommand update-metakey
	 * @synopsis --meta-key=<meta-key> [--category=<category>] [--dry-run]
	 */
	public function update_metakey( $args, $assoc_args ) {
		// Disable term counting, Elasticsearch indexing, and PushPress.
		$this->start_bulk_operation();

		$posts_per_page = 100;
		$paged = 1;
		$count = 0;

		// Meta key is required, otherwise an error will be returned.
		if ( isset( $assoc_args['meta-key'] ) ) {
			$meta_key = $assoc_args['meta-key'];
		} else {
			// Caution: calling WP_CLI::error stops the execution of the command. Use it only in case you want to stop the execution. Otherwise, use WP_CLI::warning or WP_CLI::line for non-blocking errors.
			WP_CLI::error( 'Must have --meta-key attached.' );
		}

		// Category value is optional.
		if ( isset( $assoc_args['category'] ) ) {
			$cat = $assoc_args['category'];
		} else {
			$cat = '';
		}

		// If --dry-run is not set, then it will default to true. Must set --dry-run explicitly to false to run this command.
		if ( isset( $assoc_args['dry-run'] ) ) {
			// Passing `--dry-run=false` to the command leads to the `false` value being set to string `'false'`, but casting `'false'` to bool produces `true`. Thus the special handling.
			if ( 'false' === $assoc_args['dry-run'] ) {
				$dry_run = false;
			} else {
				$dry_run = (bool) $assoc_args['dry-run'];
			}
		} else {
			$dry_run = true;
		}

		if ( $dry_run ) {
			WP_CLI::line( 'Running in dry-run mode.' );
		} else {
			WP_CLI::line( 'We\'re doing it live!' );
		}

		do {

			$posts = get_posts(
				array(
					'posts_per_page'   => $posts_per_page,
					'paged'            => $paged,
					'category'         => $cat,
					'post_status'      => 'pending',
					'suppress_filters' => false,
				)
			);

			foreach ( $posts as $post ) {
				if ( ! $dry_run ) {
					update_post_meta( $post->ID, $meta_key, 'true' );
					wp_update_post( array( 'ID' => $post->ID, 'post_status' => 'publish' ) );
				}
				$count++;
			}

			// Pause.
			WP_CLI::line( 'Pausing for a breath...' );
			sleep( 3 );

			// Free up memory.
			$this->vip_inmemory_cleanup();

			/*
			 * Decide whether to advance $paged. If the command changes a value that the query filters on (post_status in this example), processed posts drop out of the results, so keep querying the first page on every iteration.
			 * If nothing the query filters on changes, as in dry-run mode here, advance $paged to walk through all the posts; otherwise the loop would never end.
			 */
			if ( $dry_run ) {
				$paged++;
			}
		} while ( count( $posts ) );

		if ( false === $dry_run ) {
			WP_CLI::success( sprintf( '%d posts have successfully been published and had their metakeys updated.', $count ) );
		} else {
			WP_CLI::success( sprintf( '%d posts will be published and have their metakeys updated.', $count ) );
		}

		// Trigger a term count as well as trigger bulk indexing of Elasticsearch site.
		$this->end_bulk_operation();
	}

	/**
	 * Updates terms in the given taxonomy by removing the "test-" prefix.
	 *
	 * Takes a taxonomy (required).
	 *
	 * @subcommand update-terms
	 * @synopsis --taxonomy=<taxonomy> [--dry-run]
	 */
	public function update_terms( $args, $assoc_args ) {
		$count = 0;

		// Disable term counting, Elasticsearch indexing, and PushPress.
		$this->start_bulk_operation();

		// Taxonomy value is required, otherwise an error will be returned.
		if ( isset( $assoc_args['taxonomy'] ) ) {
			$taxonomy = $assoc_args['taxonomy'];
		} else {
			WP_CLI::error( 'Must have a --taxonomy attached.' );
		}

		if ( isset( $assoc_args['dry-run'] ) ) {
			if ( 'false' === $assoc_args['dry-run'] ) {
				$dry_run = false;
			} else {
				$dry_run = (bool) $assoc_args['dry-run'];
			}
		} else {
			$dry_run = true;
		}

		if ( $dry_run ) {
			WP_CLI::line( 'Running in dry-run mode.' );
		} else {
			WP_CLI::line( 'We\'re doing it live!' );
		}

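		$terms = get_terms( array( 'taxonomy' => $taxonomy ) );

		// get_terms() returns a WP_Error for an invalid taxonomy.
		if ( is_wp_error( $terms ) ) {
			WP_CLI::error( $terms->get_error_message() );
		}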

		foreach ( $terms as $term ) {
			if ( ! $dry_run ) {
				$args = array(
					'name' => str_replace( 'test ', '', $term->name ),
					'slug' => str_replace( 'test-', '', $term->slug ),
				);
				wp_update_term( $term->term_id, $term->taxonomy, $args );
			}
			$count++;
		}

		// Trigger a term count as well as trigger bulk indexing of Elasticsearch site.
		$this->end_bulk_operation();

		if ( false === $dry_run ) {
			WP_CLI::success( sprintf( '%d terms were updated.', $count ) );
		} else {
			WP_CLI::success( sprintf( '%d terms will be updated.', $count ) );
		}
	}
}

WP_CLI::add_command( 'test-command', 'Test_CLI_Command' );
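
The comment about $paged in update_metakey can be illustrated with a plain-PHP simulation that runs without WordPress. The functions fetch_page() and drain() are hypothetical stand-ins for the paged query and the command loop. Processing an item removes it from the "pending" list, just as publishing a post removes it from a post_status=pending query.

```php
<?php
// Hypothetical stand-in for a paged query over the items that are still pending.
function fetch_page( array $pending, int $paged, int $per_page ): array {
	return array_slice( $pending, ( $paged - 1 ) * $per_page, $per_page );
}

// Process every page; each processed item drops out of the pending list.
function drain( array $pending, int $per_page, bool $advance_page ): array {
	$done  = array();
	$paged = 1;

	while ( true ) {
		$page = fetch_page( $pending, $paged, $per_page );
		if ( empty( $page ) ) {
			break;
		}

		foreach ( $page as $id ) {
			$done[]  = $id;
			// Remove the item, as a status change would remove it from the query.
			$pending = array_values( array_diff( $pending, array( $id ) ) );
		}

		if ( $advance_page ) {
			$paged++;
		}
	}

	return $done;
}

// Re-querying page 1 reaches every item.
var_dump( drain( range( 1, 6 ), 2, false ) === array( 1, 2, 3, 4, 5, 6 ) ); // bool(true)

// Advancing the page while the result set shrinks skips items 3 and 4.
var_dump( drain( range( 1, 6 ), 2, true ) === array( 1, 2, 5, 6 ) ); // bool(true)
```

In this simulation, advancing the page while the results shrink silently skips a third of the items. That is why the page number should only advance when the command leaves the queried values unchanged.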

Design for restarts

All CLI commands should be designed to handle a restart. This is particularly true for CLI commands that process a large number of posts or perform other long-running actions.

A CLI command could stop mid-execution for a variety of reasons. By default, the platform infrastructure will restart an interrupted CLI command as soon as possible. However, unless logic is built into a command’s code to restart from the point at which it stopped, the command will restart from the beginning.

Commands should either be idempotent (meaning they can safely be run multiple times) or have an option built in that allows the command to start from a specific point by using offset and limit arguments.
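
One possible shape for the offset and limit approach is sketched below. It is an illustration under stated assumptions, not a VIP or WP-CLI API: process_from_offset() and its parameters are hypothetical, and $fetch_ids is assumed to return IDs in a fixed order (for example, ascending ID) that the command itself does not change.

```php
<?php
/**
 * Hypothetical sketch: walk an ordered list of IDs starting at $offset.
 *
 * $fetch_ids( $offset, $count ) returns up to $count IDs from that offset;
 * an empty array signals the end. A $limit of 0 means "no limit".
 * Returns the offset from which a later run can resume.
 */
function process_from_offset( callable $fetch_ids, callable $process, int $offset = 0, int $limit = 0, int $batch_size = 100 ): int {
	$end = $limit > 0 ? $offset + $limit : PHP_INT_MAX;

	while ( $offset < $end ) {
		$ids = $fetch_ids( $offset, min( $batch_size, $end - $offset ) );
		if ( empty( $ids ) ) {
			break;
		}

		foreach ( $ids as $id ) {
			$process( $id );
		}

		$offset += count( $ids );

		// Report progress so that an interrupted run can be restarted from here.
		fwrite( STDERR, "Processed up to offset {$offset}\n" );
	}

	return $offset;
}
```

In a real command the offset and limit might come from hypothetical --offset and --limit arguments, and the progress line would typically go through WP_CLI::line(). Offset-based resumption is only reliable when processing does not remove items from the queried set or reorder it. When it does, making each step idempotent is the safer choice.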

Last updated: December 26, 2023

Relevant to

  • WordPress