Jetpack Search

The easiest way to integrate Jetpack Search is to enable the module.

How to activate the Search module

Enable the Search module via code:

add_filter( 'jetpack_active_modules', 'x_enable_jetpack_search_module', 9999 );

function x_enable_jetpack_search_module( $modules ) {
    if ( ! in_array( 'search', $modules, true ) ) {
        $modules[] = 'search';
    }
    return $modules;
}

Note

This code must be loaded before the plugins_loaded hook fires. We recommend placing it in client-mu-plugins.

Your WordPress VIP site is now running Jetpack Search using the Elasticsearch global index. The global index reduces overall complexity and resource usage, requires minimal setup, and needs little ongoing maintenance. Activating the module with the code above is the minimum required to get Jetpack Search working on a VIP site. Please see below for additional search features and modifications.

If you are not seeing any search results after activating the Jetpack Search module, please contact VIP Support.

Custom Index

Caution

Custom Jetpack Search indexes on the WordPress VIP platform will be deprecated in the near future.

We recommend using Enterprise Search instead.

A VIP Custom Index contains additional content such as meta data. A site needs a VIP Custom Index when any of the following criteria are met:

Activating the search module to use a Custom Index requires two steps:

  1. Add the JETPACK_SEARCH_VIP_INDEX constant and set it to true.
  2. Enable the Search module via code.

define( 'JETPACK_SEARCH_VIP_INDEX', true );

add_filter( 'jetpack_active_modules', 'x_enable_jetpack_search_module', 9999 );

function x_enable_jetpack_search_module( $modules ) {
    if ( ! in_array( 'search', $modules, true ) ) {
        $modules[] = 'search';
    }
    return $modules;
}

Note

This code must be loaded before the plugins_loaded hook fires. We recommend placing it in client-mu-plugins.

Once you have added the JETPACK_SEARCH_VIP_INDEX constant and enabled the Search module via code, please contact VIP Support and request a custom index be created for the site.

Please see below for additional search features and modifications.

Filtering Jetpack Search

Jetpack Search provides powerful filtering capabilities to allow users to refine the results across a number of criteria, like post type, category, and date. These are also sometimes referred to as Aggregations (and in earlier versions as Facets).

To enable filters on Jetpack Search, you need to define which filters you want to enable and then integrate the filtering into your site.

To define the filters, call set_filters() on the Jetpack_Search instance:

add_action( 'init', 'x_init_jetpack_search_filters' );
function x_init_jetpack_search_filters() {
    Jetpack_Search::instance()->set_filters( [
        'Content Type' => [
            'type'     => 'post_type',
            'count'    => 10,
        ],
        'Categories' => [
            'type'     => 'taxonomy',
            'taxonomy' => 'category',
            'count'    => 10,
        ],
        'Tags' => [
            'type'     => 'taxonomy',
            'taxonomy' => 'post_tag',
            'count'    => 10,
        ],
        'Year' => [
            'type'     => 'date_histogram',
            'field'    => 'post_date',
            'interval' => 'year',
            'count'    => 10,
        ],
        'Month' => [
            'type'     => 'date_histogram',
            'field'    => 'post_date',
            'interval' => 'month',
            'count'    => 10,
        ],
    ] );
}

To integrate with your site, you can use the “Search Filters” widget included in Jetpack. Alternatively, for more control, you can integrate the filters yourself. See the code for the widget as a guideline on how to do so.
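
If a site uses several taxonomies, the filter definitions can be generated instead of written out by hand. The following is a minimal sketch, assuming the same array shape passed to set_filters() above. The helper name x_build_taxonomy_filters() and the "location" taxonomy are hypothetical and only serve as illustration.

```php
<?php
/**
 * Build taxonomy filter definitions in the array shape used by set_filters().
 *
 * Hypothetical helper, not part of Jetpack.
 *
 * @param array $taxonomies Map of filter label => taxonomy slug.
 * @param int   $count      Value for the 'count' key of every filter.
 * @return array Filter definitions keyed by label.
 */
function x_build_taxonomy_filters( array $taxonomies, $count = 10 ) {
    $filters = [];

    foreach ( $taxonomies as $label => $taxonomy ) {
        $filters[ $label ] = [
            'type'     => 'taxonomy',
            'taxonomy' => $taxonomy,
            'count'    => (int) $count,
        ];
    }

    return $filters;
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
    add_action( 'init', function () {
        if ( ! class_exists( 'Jetpack_Search' ) ) {
            return;
        }

        Jetpack_Search::instance()->set_filters(
            x_build_taxonomy_filters( [
                'Categories' => 'category',
                'Tags'       => 'post_tag',
                'Locations'  => 'location', // Hypothetical custom taxonomy.
            ] )
        );
    } );
}
```

Keeping the array construction in a plain function means it can be checked without loading WordPress or Jetpack.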

Modifying queries

You can modify the Elasticsearch queries generated by the plugin by hooking into the jetpack_search_es_query_args filter and adjusting the args as needed.

For example, here’s an easy way to limit search results to a specific date range:

add_filter( 'jetpack_search_es_query_args', function( $args, $query ) {
    $args['filter']['range']['date'] = array(
        'gte' => '2016-01-01',
        'lte' => '2016-12-31',
    );

    return $args;
}, 10, 2 );
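
When the date range comes from user input or configuration rather than hard-coded strings, it may help to validate it before touching the query. This is a sketch under that assumption. The helper x_add_date_range_filter() is a hypothetical name, and it writes to the same $args['filter']['range']['date'] key as the example above.

```php
<?php
/**
 * Add a date range to the query args, but only if both dates are valid.
 *
 * Hypothetical helper, not part of Jetpack.
 *
 * @param array  $args Elasticsearch query args.
 * @param string $from Start date, formatted as YYYY-MM-DD.
 * @param string $to   End date, formatted as YYYY-MM-DD.
 * @return array Query args, unchanged when the range is invalid.
 */
function x_add_date_range_filter( array $args, $from, $to ) {
    $format = 'Y-m-d';

    foreach ( [ $from, $to ] as $date ) {
        $parsed = DateTime::createFromFormat( $format, (string) $date );

        // Reject unparsable strings and impossible dates such as 2016-02-30.
        if ( ! $parsed || $parsed->format( $format ) !== $date ) {
            return $args;
        }
    }

    // YYYY-MM-DD strings sort chronologically, so a string comparison is enough.
    if ( $from > $to ) {
        return $args;
    }

    $args['filter']['range']['date'] = [
        'gte' => $from,
        'lte' => $to,
    ];

    return $args;
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_search_es_query_args', function ( $args, $query ) {
        return x_add_date_range_filter( $args, '2016-01-01', '2016-12-31' );
    }, 10, 2 );
}
```

Returning the args untouched on bad input keeps the search working without the restriction instead of sending a malformed range to Elasticsearch.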

Searching through and indexing post meta

By default, only a small subset of post metadata is synced and indexed. To index and query additional metadata, add the meta keys to the sync allowlist:

add_filter( 'jetpack_sync_post_meta_whitelist', function( $keys ) {
    // Any keys
    $keys[] = 'vip_post_subtitle';
    return $keys;
} );

After the addition, the site’s content will need to be re-synced and re-indexed. This can be done from the WordPress.com Dashboard under Manage > Settings > Site Tools > Manage your connection > initiate a sync manually.

You can then do a term query against meta.[NAME].value where [NAME] is the meta key you want to search through.

Note

If you add the jetpack_search_es_wp_query_args filter to your code and set query_fields, you must include every field you want searched, because your list replaces the defaults. For example, to keep searching title, content, and author, they must be included in the array or they will no longer be searched.

add_filter( 'jetpack_search_es_wp_query_args', function( $args, $query ) {
    $args['query_fields'] = [
        'title',
        'content',
        'author',
        'tag',
        'category',
        'meta.vip_post_subtitle.value',

        // You can also search across taxonomy fields.
        // e.g. searching through terms in a "location" taxonomy
        // 'taxonomy.location.name',
    ];

    return $args;
}, 10, 2 );

You can also filter based on metadata. Here’s an example that excludes sponsored posts from search results:

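add_filter( 'jetpack_search_es_query_args', function ( $es_args, $query ) {
	// Exclude any post whose is_sponsored_post meta value is 1.
	$filter = [
		'not' => [
			'term' => [
				'meta.is_sponsored_post.value' => 1,
			],
		],
	];

	if ( ! isset( $es_args['filter'] ) ) {
		// No filter yet: use ours directly.
		$es_args['filter'] = $filter;
	} elseif ( ! isset( $es_args['filter']['and'] ) ) {
		// A single filter exists: wrap it and ours in an "and" clause.
		$es_args['filter'] = [
			'and' => [ $es_args['filter'], $filter ],
		];
	} else {
		// An "and" clause already exists: append to it.
		$es_args['filter']['and'][] = $filter;
	}

	return $es_args;
}, 10, 2 );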
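
When several callbacks each add a filter, every one of them has to cope with whatever is already in $es_args['filter']. A minimal sketch of a reusable helper follows, assuming the same term and "and" filter syntax as the example above. The function names x_meta_term_filter() and x_append_es_filter(), the x_region meta key, and the value "emea" are all hypothetical. The meta key would also need to be allowed and synced as described earlier.

```php
<?php
/**
 * Build a term filter against an indexed post meta key.
 *
 * Hypothetical helper, not part of Jetpack.
 *
 * @param string $meta_key Meta key, e.g. 'x_region'.
 * @param mixed  $value    Exact value to match.
 * @return array Term filter targeting meta.[NAME].value.
 */
function x_meta_term_filter( $meta_key, $value ) {
    return [
        'term' => [
            'meta.' . $meta_key . '.value' => $value,
        ],
    ];
}

/**
 * Combine a new filter with any filter already present in the query args.
 *
 * Hypothetical helper, not part of Jetpack.
 *
 * @param array $es_args Elasticsearch query args.
 * @param array $filter  Filter clause to add.
 * @return array Query args with the filter merged in.
 */
function x_append_es_filter( array $es_args, array $filter ) {
    if ( empty( $es_args['filter'] ) ) {
        // Nothing to combine with.
        $es_args['filter'] = $filter;
    } elseif ( isset( $es_args['filter']['and'] ) ) {
        // Extend the existing "and" clause.
        $es_args['filter']['and'][] = $filter;
    } else {
        // Wrap the existing single filter together with the new one.
        $es_args['filter'] = [
            'and' => [ $es_args['filter'], $filter ],
        ];
    }

    return $es_args;
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_search_es_query_args', function ( $es_args, $query ) {
        return x_append_es_filter( $es_args, x_meta_term_filter( 'x_region', 'emea' ) );
    }, 10, 2 );
}
```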

Complex data types

While it is possible to allow additional postmeta, there are two types of data that are not indexed:

  1. “private” postmeta (i.e. keys prefixed with _)
  2. non-primitive data types like arrays and objects

All content in the default index is meant to be public, so private metadata is ignored by default. Serialized versions of objects and arrays tend to include additional metadata which doesn’t make sense to index as-is.

To work around these restrictions, you can extract out searchable pieces of content into separate post meta entries on add/update (and remove on delete). The only use for these entries is for indexing and querying.

For example, let’s say we have a post meta entry (x_content_blocks) which stores an object with various content strings and other metadata used to build out our page. As-is, the content within this object will not be searchable. However, we can extract out all the searchable content any time this post meta entry is added or updated. The extracted content can be combined together into a single string and inserted into a new post meta entry called es_x_content_blocks. Similarly, we can remove the meta entry when all the content blocks are deleted.

We can then allow the meta key, proceed with a full re-sync and index, and then be able to search through this field.

Here’s some sample code that you can use as a head start:

add_action( 'updated_post_meta', 'x_maybe_require_search_extraction', 10, 4 );
add_action( 'added_post_meta', 'x_maybe_require_search_extraction', 10, 4 );
add_action( 'deleted_post_meta', 'x_maybe_require_search_extraction', 10, 4 );

function x_maybe_require_search_extraction( $meta_id, $post_id, $meta_key, $meta_value ) {
    if ( ! in_array( $meta_key, [ /* list of content keys go here */ ], true ) ) {
        return;
    }

    global $x_require_search_extraction;
    if ( ! isset( $x_require_search_extraction ) ) {
        $x_require_search_extraction = [];
    }

    if ( ! in_array( $post_id, $x_require_search_extraction, true ) ) {
        $x_require_search_extraction[] = $post_id;
    }
}

add_action( 'shutdown', 'x_extract_searchable_content' );
function x_extract_searchable_content() {
    global $x_require_search_extraction;

    if ( empty( $x_require_search_extraction ) ) {
        return;
    }

    foreach ( $x_require_search_extraction as $post_id ) {
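        // Grab all post meta containing our content.
        // Extract the searchable text into a single string.
        // Save it with update_post_meta( ... ), or remove it with
        // delete_post_meta( ... ) when no content is left.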
    }
}
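
The body of the loop in x_extract_searchable_content() depends on how the content is stored. As one possible approach, the sketch below walks a nested array or object and joins every string value it finds into a single string. It assumes that every string in x_content_blocks is searchable text, which may not hold for real data (IDs, URLs, or type names stored as strings would be included too). The names x_flatten_searchable_strings() and x_update_extracted_meta() are hypothetical.

```php
<?php
/**
 * Collect all string values from a nested array/object into one string.
 *
 * Hypothetical helper. Assumes every string value is searchable content.
 *
 * @param mixed $value Array, object, or scalar to flatten.
 * @return string Space-separated text with HTML tags stripped.
 */
function x_flatten_searchable_strings( $value ) {
    if ( is_object( $value ) ) {
        // Treat public properties like array entries.
        $value = get_object_vars( $value );
    }

    if ( is_array( $value ) ) {
        $parts = [];
        foreach ( $value as $item ) {
            $text = x_flatten_searchable_strings( $item );
            if ( '' !== $text ) {
                $parts[] = $text;
            }
        }
        return implode( ' ', $parts );
    }

    if ( is_string( $value ) ) {
        return trim( strip_tags( $value ) );
    }

    // Numbers, booleans, and null are skipped.
    return '';
}

/**
 * Refresh the extracted meta entry for one post.
 *
 * Intended to be called from the loop in x_extract_searchable_content().
 * Uses WordPress functions, so it can only be called inside WordPress.
 *
 * @param int $post_id Post ID.
 */
function x_update_extracted_meta( $post_id ) {
    $blocks = get_post_meta( $post_id, 'x_content_blocks', true );
    $text   = x_flatten_searchable_strings( $blocks );

    if ( '' === $text ) {
        // No searchable content left: remove the extracted entry.
        delete_post_meta( $post_id, 'es_x_content_blocks' );
    } else {
        update_post_meta( $post_id, 'es_x_content_blocks', $text );
    }
}
```

Writing es_x_content_blocks triggers the updated_post_meta or added_post_meta action again, so es_x_content_blocks should stay out of the list of content keys checked in x_maybe_require_search_extraction().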

Changing sort order

By default, search results will be returned ordered by relevancy with some weight given to recency. You can change this sort order by hooking into the jetpack_search_es_query_args filter and modifying the ES query.

This snippet orders the results by modified date, with the most recently modified posts first:

add_filter( 'jetpack_search_es_query_args', function( $args, $query ) {
    // Re-sort the results based on their modified date
    $args['sort'] = [
        [
            'modified' => [
                'order' => 'desc'
            ],
        ],
    ];

    return $args;
}, 10, 2 );

You can modify the sort order in any way you choose, including weighting various fields depending on your requirements. For more details on sorting, check out the Elasticsearch documentation.
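
Sorting on more than one field follows the same pattern, with each entry in the sort array acting as a tie-breaker for the one before it. The sketch below builds such an array from a simple field => direction map. The helper name x_build_es_sort() is hypothetical, the date field is assumed to be the same one used in the date range example, and _score is Elasticsearch's relevance score.

```php
<?php
/**
 * Build an Elasticsearch sort array from a field => direction map.
 *
 * Hypothetical helper, not part of Jetpack.
 *
 * @param array $fields Map of field name => 'asc' or 'desc'.
 * @return array Sort clauses in priority order.
 */
function x_build_es_sort( array $fields ) {
    $sort = [];

    foreach ( $fields as $field => $order ) {
        // Anything other than "asc" falls back to "desc".
        $order  = 'asc' === strtolower( (string) $order ) ? 'asc' : 'desc';
        $sort[] = [
            $field => [
                'order' => $order,
            ],
        ];
    }

    return $sort;
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_search_es_query_args', function ( $args, $query ) {
        // Newest posts first, then by relevance when dates are equal.
        $args['sort'] = x_build_es_sort( [
            'date'   => 'desc',
            '_score' => 'desc',
        ] );

        return $args;
    }, 10, 2 );
}
```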

Last updated: June 30, 2021