Rate limiting is a measure we’ve put in place to protect the search services from spikes in requests that may cause instability. It works like other resource limits, except that it is layered: an application-level rate limit can be configured, provided it doesn’t exceed a maximum.
If Elasticsearch (ES) queries are being rate limited, a portion of the requests will go to the database instead. Depending on the nature of the queries, this may not be optimal.
Rate limiting can easily occur when many pages are making the same queries directly to Elasticsearch rather than making use of caching.
Addressing rate limiting when it happens
The rate limit is a protective measure, so when it triggers, it generally suggests that the number of queries could be reduced. Ideally, normal peak traffic on a site does not come close to the per-application default limit.
The first step in managing a rate limit situation is to look at the queries for opportunities to reduce their frequency. This may be achieved through higher-level changes, such as ensuring that each page is cached in the page cache at the edge, or by normalizing and object caching the search results themselves.
A standard recommendation for reducing the frequency of ES requests is to implement caching layers. Use the WordPress object cache to store successful results for a period of time, and serve those instead of making a new request to VIP Search each time. We have sample code that works well with both DB queries and ES queries, and there are many good techniques that can be used to provide caching and invalidation.
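As a rough illustration of the object cache approach, the sketch below wraps a search query in `wp_cache_get()` and `wp_cache_set()`. It is not the sample code mentioned above. The function name `example_get_cached_search_ids()`, the cache group `example_search`, and the 300-second lifetime are all hypothetical, and whether the underlying query is actually routed to Elasticsearch depends on how the site is configured.

```php
<?php
// Sketch only: assumes a WordPress environment where wp_cache_get(),
// wp_cache_set(), and WP_Query are available.
// The function name, cache group, and TTL are hypothetical.
function example_get_cached_search_ids( string $term, int $ttl = 300 ): array {
    // Normalize the term so equivalent searches share a single cache entry.
    $normalized = strtolower( trim( $term ) );
    $cache_key  = 'search_' . md5( $normalized );
    $group      = 'example_search';

    $found  = false;
    $cached = wp_cache_get( $cache_key, $group, false, $found );
    if ( $found && is_array( $cached ) ) {
        return $cached; // Served from the object cache; no new query is run.
    }

    $query = new WP_Query(
        array(
            's'              => $normalized,
            'posts_per_page' => 10,
            'fields'         => 'ids', // Cache post IDs only, not full post objects.
            'no_found_rows'  => true,  // Skip the total count; it is not needed here.
        )
    );
    $ids = $query->posts;

    // This sketch cannot tell a failed request from an empty result, so it
    // caches both. Real code may want to detect failures and skip caching them.
    wp_cache_set( $cache_key, $ids, $group, $ttl );

    return $ids;
}
```

A short lifetime keeps results reasonably fresh without explicit invalidation. Longer lifetimes usually need an invalidation step, for example when a post is published or updated.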
Because WP_Query works with the object cache to cache the results of each query, and Enterprise Search intercepts only the MySQL portion, you should find that repeated queries within a short period of time hit memcached rather than ES. This is normal: it supports scaling and avoids the repeated Elasticsearch calls that trigger rate limiting.
However, if there are query parameters that are constantly changing, especially date or time parameters, you may find that the Debug Bar or Search Dev Tools (coming soon) shows ES calls every time you reload a page. Because the parameters are combined in a hash to access the cache, a changing time value will miss the cache and result in a new query every time. This can trigger rate limits. You can use the Search Dev Tools (coming soon) to see the actual parameters being sent to ES and confirm whether they are changing.
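The effect can be seen with a simplified stand-in for the cache key. The sketch below hashes the serialized query arguments. WordPress builds its real key differently, so `example_cache_key()`, `example_event_args()`, the `event` post type, and the `start_time` meta key are hypothetical and only illustrate the principle.

```php
<?php
// Hypothetical stand-in for a cache key: a hash of the serialized arguments.
// WordPress builds its real key differently; this only illustrates the effect.
function example_cache_key( array $args ): string {
    return md5( serialize( $args ) );
}

// Hypothetical query arguments that embed a second-precision timestamp.
function example_event_args( int $timestamp ): array {
    return array(
        'post_type'  => 'event',
        'meta_query' => array(
            array(
                'key'     => 'start_time',
                'value'   => $timestamp,
                'compare' => '>=',
                'type'    => 'NUMERIC',
            ),
        ),
    );
}

$first_load  = example_cache_key( example_event_args( 1700000000 ) );
$second_load = example_cache_key( example_event_args( 1700000001 ) ); // One second later.

// The keys differ, so the second page load cannot reuse the first result.
var_dump( $first_load === $second_load ); // bool(false)
```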
Regardless of whether queries are sent to Elasticsearch or MySQL, continually changing query arguments result in excess resource use and contribute to poor performance. This can be resolved by making changes to the PHP code.
One way to account for this is to round the dates to a day, hour, or minute boundary, choosing the direction that still includes every item you need, so the query arguments stay the same for that whole period. Your results may then contain more items than needed. For example, if you have an events page and want to show only events starting in the future, round down to the start of the current hour (or day), and then, as the page is built, use the actual start times from the post data or post metadata to decide whether to show each event.
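A minimal sketch of the events example follows, in plain PHP so it can be read without a WordPress install. The helper names and the event array shape (a `start` Unix timestamp per event) are hypothetical. In a real template, the rounded value would go into the query arguments and the filter would use start times read from post metadata.

```php
<?php
// Round a Unix timestamp down to the start of its hour (3600-second buckets).
// Buckets align with UTC hours; use 86400 instead for UTC days.
function example_round_down_to_hour( int $timestamp ): int {
    return $timestamp - ( $timestamp % 3600 );
}

// Keep only events whose actual start time is not in the past.
// Each event is a hypothetical array with a 'start' Unix timestamp.
function example_filter_future_events( array $events, int $now ): array {
    return array_values(
        array_filter(
            $events,
            function ( array $event ) use ( $now ): bool {
                return $event['start'] >= $now;
            }
        )
    );
}

$now       = 1700001234;
$query_min = example_round_down_to_hour( $now ); // Identical for every request in this hour.

// Pretend the query returned everything starting at or after $query_min.
$events = array(
    array( 'title' => 'Already started', 'start' => 1699999500 ),
    array( 'title' => 'Still upcoming', 'start' => 1700005000 ),
);

// Apply the precise cutoff in PHP while building the page.
$visible = example_filter_future_events( $events, $now );
```

Because `$query_min` changes only once per hour, every request within that hour sends identical arguments and can share one cached result. The precise cutoff is then applied cheaply in PHP.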
Avoid using very explicit times (down to the minute or second) in your query arguments.