Integrating Jetpack Search

The easiest way to integrate Jetpack Search is to enable the module.

How to activate the Search module

Activating the search module requires two steps:

  1. Add the JETPACK_SEARCH_VIP_INDEX constant and set it to true.
  2. Enable the Search module via code.

define( 'JETPACK_SEARCH_VIP_INDEX', true );

add_filter( 'jetpack_active_modules', 'x_enable_jetpack_search_module', 9999 );

function x_enable_jetpack_search_module( $modules ) {
    if ( ! in_array( 'search', $modules, true ) ) {
        $modules[] = 'search';
    }
    return $modules;
}

Note

This code must be loaded before the plugins_loaded hook fires. We recommend placing it in client-mu-plugins.

Filtering Search

Jetpack Search provides powerful filtering capabilities to allow users to refine the results across a number of criteria, like post type, category, and date. These are also sometimes referred to as Aggregations (and in earlier versions as Facets).

To enable filters on Jetpack Search, you need to define which filters you want to enable and then integrate the filtering into your site.

To define the filters, use Jetpack_Search::set_filters():

add_action( 'init', 'x_init_jetpack_search_filters' );
function x_init_jetpack_search_filters() {
    Jetpack_Search::instance()->set_filters( [
        'Content Type' => [
            'type'     => 'post_type',
            'count'    => 10,
        ],
        'Categories' => [
            'type'     => 'taxonomy',
            'taxonomy' => 'category',
            'count'    => 10,
        ],
        'Tags' => [
            'type'     => 'taxonomy',
            'taxonomy' => 'post_tag',
            'count'    => 10,
        ],
        'Year' => [
            'type'     => 'date_histogram',
            'field'    => 'post_date',
            'interval' => 'year',
            'count'    => 10,
        ],
        'Month' => [
            'type'     => 'date_histogram',
            'field'    => 'post_date',
            'interval' => 'month',
            'count'    => 10,
        ],
    ] );
}

To integrate with your site, you can use the “Search Filters” widget included in Jetpack. Alternatively, for more control, you can integrate the filters yourself. See the code for the widget as a guideline on how to do so.
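If you render the filters yourself, the markup can be as simple as a list of links per filter. The following is a minimal sketch, not the widget's actual implementation. It assumes that Jetpack_Search::instance()->get_filters() returns an array keyed by filter label, where each entry has a 'buckets' list whose items carry 'name', 'count', and 'url' keys; verify that shape against the widget code before relying on it. The function name x_render_search_filters is hypothetical, and in a theme you would normally escape with esc_html() and esc_url() instead of htmlspecialchars().

```php
<?php
/**
 * Render filter data as a heading and a list of links per filter.
 *
 * ASSUMPTION: $filters maps a label to [ 'buckets' => [ [ 'name', 'count', 'url' ], ... ] ].
 */
function x_render_search_filters( array $filters ) {
    $html = '';

    foreach ( $filters as $label => $filter ) {
        if ( empty( $filter['buckets'] ) ) {
            continue; // Skip filters that have no buckets to show.
        }

        $html .= '<h4>' . htmlspecialchars( $label ) . '</h4><ul>';
        foreach ( $filter['buckets'] as $bucket ) {
            $html .= sprintf(
                '<li><a href="%s">%s</a> (%d)</li>',
                htmlspecialchars( $bucket['url'] ),
                htmlspecialchars( $bucket['name'] ),
                $bucket['count']
            );
        }
        $html .= '</ul>';
    }

    return $html;
}

// In a search template, only when Jetpack Search is loaded.
if ( class_exists( 'Jetpack_Search' ) ) {
    echo x_render_search_filters( Jetpack_Search::instance()->get_filters() );
}
```

Keeping the rendering in a function that only takes an array makes it easy to test with sample data before wiring it to live search results.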

Modifying Queries

You can modify the Elasticsearch queries generated by the plugin by hooking into the jetpack_search_es_query_args filter and adjusting the args as needed.

For example, here’s an easy way to limit search results to a specific date range:

add_filter( 'jetpack_search_es_query_args', function( $args, $query ) {
    $args['filter']['range']['date'] = array(
        'gte' => '2016-01-01',
        'lte' => '2016-12-31',
    );

    return $args;
}, 10, 2 );
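A fixed range like the one above goes stale over time. As a sketch of one alternative, the range can be computed relative to the current date. The helper names x_rolling_date_range and x_limit_search_to_last_30_days are hypothetical, and the 30-day window is only an example.

```php
<?php
/**
 * Build a 'gte'/'lte' date range covering the last $days days.
 *
 * $now is a Unix timestamp; it defaults to the current time and is
 * only a parameter so the function can be tested with a fixed date.
 */
function x_rolling_date_range( $days, $now = null ) {
    $now = null === $now ? time() : $now;

    return [
        'gte' => gmdate( 'Y-m-d', $now - $days * 86400 ), // 86400 seconds per day.
        'lte' => gmdate( 'Y-m-d', $now ),
    ];
}

/**
 * Limit search results to posts dated within the last 30 days.
 */
function x_limit_search_to_last_30_days( $args ) {
    $args['filter']['range']['date'] = x_rolling_date_range( 30 );
    return $args;
}

// Register the filter only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_search_es_query_args', 'x_limit_search_to_last_30_days' );
}
```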

Searching Through And Indexing Post Meta

By default, only a small subset of post metadata is synced and indexed. To index and query additional metadata, add the meta keys to the sync allowlist:

add_filter( 'jetpack_sync_post_meta_whitelist', function( $keys ) {
    // Add each meta key that should be synced and indexed.
    $keys[] = 'vip_post_subtitle';
    return $keys;
} );

After the addition, the site’s content will need to be re-synced and re-indexed. This can be done from the WordPress.com Dashboard under Manage > Settings > Site Tools > Manage your connection > initiate a sync manually.

You can then do a term query against meta.[NAME].value where [NAME] is the meta key you want to search through.
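As an illustration of such a term query, here is a minimal sketch that restricts results to posts with a given value in the vip_post_subtitle meta key. The helper names and the value 'exclusive' are hypothetical. How exactly a term matches depends on how the field is analyzed in the index, so treat this as a starting point and test it against your own data.

```php
<?php
/**
 * Build a term clause for an indexed meta key, following the
 * meta.[NAME].value pattern.
 */
function x_meta_term_clause( $meta_key, $value ) {
    return [ 'term' => [ 'meta.' . $meta_key . '.value' => $value ] ];
}

/**
 * Restrict search results to posts whose subtitle matches 'exclusive'.
 */
function x_filter_by_subtitle( $es_args ) {
    $clause = x_meta_term_clause( 'vip_post_subtitle', 'exclusive' );

    if ( empty( $es_args['filter'] ) ) {
        $es_args['filter'] = $clause;
    } else {
        // Keep any existing filter by combining both under "and".
        $es_args['filter'] = [ 'and' => [ $es_args['filter'], $clause ] ];
    }

    return $es_args;
}

// Register the filter only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_search_es_query_args', 'x_filter_by_subtitle' );
}
```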

Note

If you set query_fields through the jetpack_search_es_wp_query_args filter, you must list every field you want searched, because the array replaces the default fields. For example, to keep searching title, content, and author, include them in the array; otherwise they will be left out.

add_filter( 'jetpack_search_es_wp_query_args', function( $args, $query ) {
    $args['query_fields'] = [
        'title',
        'content',
        'author',
        'tag',
        'category',
        'meta.vip_post_subtitle.value',

        // You can also search across taxonomy fields.
        // e.g. searching through terms in a "location" taxonomy
        // 'taxonomy.location.name',
    ];

    return $args;
}, 10, 2 );

You can also filter based on metadata. Here’s an example that excludes sponsored posts from search results:

add_filter( 'jetpack_search_es_query_args', function ( $es_args, $query ) {
	$filter = [
		'not' => [
			'term' => [
				'meta.is_sponsored_post.value' => 1,
			],
		],
	];

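	if ( ! isset( $es_args['filter'] ) ) {
		// No filter yet: use ours directly.
		$es_args['filter'] = $filter;
	} elseif ( ! isset( $es_args['filter']['and'] ) ) {
		// A single filter exists: combine it with ours under "and".
		$es_args['filter'] = [ 'and' => [ $es_args['filter'], $filter ] ];
	} else {
		// An "and" list already exists: append to it.
		$es_args['filter']['and'][] = $filter;
	}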

	return $es_args;
}, 10, 2 );

Complex Data Types

While it is possible to allow additional postmeta, there are two types of data that are not indexed:

  1. “private” postmeta (i.e. keys prefixed with _)
  2. non-primitive data types like arrays and objects

All content in the default index is meant to be public, so private metadata is ignored by default. Serialized versions of objects and arrays tend to include additional metadata which doesn’t make sense to index as-is.

To work around these restrictions, you can extract out searchable pieces of content into separate post meta entries on add/update (and remove on delete). The only use for these entries is for indexing and querying.

For example, let’s say we have a post meta entry (x_content_blocks) which stores an object with various content strings and other metadata used to build out our page. As-is, the content within this object will not be searchable. However, we can extract out all the searchable content any time this post meta entry is added or updated. The extracted content can be combined together into a single string and inserted into a new post meta entry called es_x_content_blocks. Similarly, we can remove the meta entry when all the content blocks are deleted.

We can then add the new meta key to the allowlist, run a full re-sync and re-index, and search through this field.

Here’s some sample code that you can use as a starting point:

add_action( 'updated_post_meta', 'x_maybe_require_search_extraction', 10, 4 );
add_action( 'added_post_meta', 'x_maybe_require_search_extraction', 10, 4 );
add_action( 'deleted_post_meta', 'x_maybe_require_search_extraction', 10, 4 );

function x_maybe_require_search_extraction( $meta_id, $post_id, $meta_key, $meta_value ) {
    if ( ! in_array( $meta_key, [ /* list of content keys go here */ ], true ) ) {
        return;
    }

    global $x_require_search_extraction;
    if ( ! isset( $x_require_search_extraction ) ) {
        $x_require_search_extraction = [];
    }

    if ( ! in_array( $post_id, $x_require_search_extraction, true ) ) {
        $x_require_search_extraction[] = $post_id;
    }
}

add_action( 'shutdown', 'x_extract_searchable_content' );
function x_extract_searchable_content() {
    global $x_require_search_extraction;

    if ( empty( $x_require_search_extraction ) ) {
        return;
    }

    foreach ( $x_require_search_extraction as $post_id ) {
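        // Fetch all post meta containing our content with get_post_meta().
        // Extract the searchable text into a single string.
        // Save it with update_post_meta( $post_id, 'es_x_content_blocks', ... ),
        // or remove it with delete_post_meta() if no content is left.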
    }
}
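The extraction step itself depends on how your data is structured. As a rough sketch, assuming the stored value is a nested array or object of plain values, the strings can be gathered recursively and joined. The function names are hypothetical, and this version collects every string it finds, including non-content values such as block types, so in practice you would likely restrict it to known content keys.

```php
<?php
/**
 * Recursively collect every non-empty string found in an array or object.
 */
function x_collect_strings( $data ) {
    if ( is_string( $data ) ) {
        $text = trim( strip_tags( $data ) ); // Drop markup and surrounding whitespace.
        return '' === $text ? [] : [ $text ];
    }

    if ( is_object( $data ) ) {
        $data = get_object_vars( $data ); // Treat public properties like array items.
    }

    if ( ! is_array( $data ) ) {
        return []; // Numbers, booleans, and null carry no searchable text.
    }

    $strings = [];
    foreach ( $data as $value ) {
        $strings = array_merge( $strings, x_collect_strings( $value ) );
    }

    return $strings;
}

/**
 * Combine all strings from a content structure into one searchable string.
 */
function x_build_searchable_text( $blocks ) {
    return implode( ' ', x_collect_strings( $blocks ) );
}
```

Inside the loop in x_extract_searchable_content(), the result could be passed to update_post_meta() for the es_x_content_blocks key, or the key could be removed with delete_post_meta() when the string is empty.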

Changing Sort Order

By default, search results will be returned ordered by relevancy with some weight given to recency. You can change this sort order by hooking into the jetpack_search_es_query_args filter and modifying the ES query.

This snippet orders the results by modified date, most recently modified first:

add_filter( 'jetpack_search_es_query_args', function( $args, $query ) {
    // Re-sort the results based on their modified date
    $args['sort'] = [
        [
            'modified' => [
                'order' => 'desc'
            ],
        ],
    ];

    return $args;
}, 10, 2 );

You can modify the sort order in any way you choose, including weighting various fields depending on your requirements. For more details on sorting, check out the Elasticsearch documentation.
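Elasticsearch accepts a list of sort clauses, so a secondary key can act as a tie-breaker. The sketch below sorts by post date and falls back to the relevance score when dates are identical. The function name is hypothetical, and the date field name is assumed to match the one used in the date range example earlier.

```php
<?php
/**
 * Sort by post date (newest first), then by relevance score.
 */
function x_sort_newest_then_relevant( $args ) {
    $args['sort'] = [
        [ 'date' => [ 'order' => 'desc' ] ],   // Primary key: newest posts first.
        [ '_score' => [ 'order' => 'desc' ] ], // Tie-breaker: most relevant first.
    ];

    return $args;
}

// Register the filter only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_search_es_query_args', 'x_sort_newest_then_relevant' );
}
```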

Last updated: October 20, 2020