Insights & Metrics

The Insights & Metrics panel, located in the application view of the VIP Dashboard, provides insights into the performance, health, and usage of an application.

Access

Prerequisites

To access the Insights & Metrics panel, a user must have at minimum an Org member role or an App read role for that application.

To access the Insights & Metrics panel:

  1. Navigate to the VIP Dashboard for an application.
  2. Select “Performance” from the sidebar navigation at the left of the screen.
  3. Select “Insights & Metrics” from the submenu.

The Insights & Metrics panel is environment-specific (e.g., production, develop). A different environment type can be selected from the dropdown at the upper left of the VIP Dashboard application view.

Interacting with the displayed data

For all data displayed throughout all views of the Insights & Metrics panel:

  • Timestamps are formatted in Coordinated Universal Time (UTC).
  • Users can interact with available settings to modify the time frame and format of the data that is displayed.
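Because timestamps are shown in UTC, it can help to convert a reading to a local offset when correlating it with other records. The following is a minimal sketch only: the `YYYY-MM-DD HH:MM` input format and the `utc_to_offset` helper are assumptions made for illustration, not the dashboard's documented format or an API it provides.

```python
from datetime import datetime, timedelta, timezone


def utc_to_offset(utc_text: str, offset_hours: float) -> datetime:
    """Convert a UTC timestamp string to a fixed UTC offset.

    The "YYYY-MM-DD HH:MM" input format is an assumption for this sketch.
    """
    naive = datetime.strptime(utc_text, "%Y-%m-%d %H:%M")
    aware_utc = naive.replace(tzinfo=timezone.utc)
    target = timezone(timedelta(hours=offset_hours))
    return aware_utc.astimezone(target)


# 14:30 UTC viewed at UTC-5
local = utc_to_offset("2024-01-05 14:30", -5)
print(local.isoformat())
```

A fixed offset is used here to keep the sketch dependency-free; a named time zone would be needed to account for daylight saving time.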

Display data for a specific time period

By default, data from the most recent “3 days” (72 hours) is displayed. The user can decrease that window of time to as little as “30 minutes”, and increase it to as much as “30 days”. These options can be selected from the dropdown menu labeled “Time” located in the upper right of the Insights & Metrics panel.

Display data as a chart or a table

All data in the Insights & Metrics panel can be displayed in either a chart format or a table format. The display options for “Chart” or “Table” can be selected from the dropdown menu labeled “Show” located in the upper right of the Insights & Metrics panel.

Data Series legend and toggles

When data is displayed in chart format, a legend is located at the bottom left of the chart. The legend provides a reference for identifying the types of data that are included in the chart, as well as providing the ability to toggle those data types on and off.

For example, an “HTTP Origin Response Codes” chart displays a Data Series legend item that has a green square to the left of “2xx”. This indicates that data in the chart represented by a green line correlates with “2xx” responses.

The user can select and toggle a Data Series legend item to omit or include its correlating data in the chart.

When a data type is omitted, its icon will have a colored outline and no fill.

Example screenshot of a Data Series legend for an “HTTP Origin Response Codes” chart where the data for “2xx” is omitted

Event Types legend and toggles

When data is displayed in chart format, events on an environment related to code deployments and updates to software versions are indicated by Event Types.

Within the chart, a user can hover over an Event Type icon to access more information about the event. The hover action reveals the event’s date, timestamp, and description, and a link to more detailed information.

If Event Types occur within the time period being viewed for a chart, a legend for those Event Types will be located below the chart. A user can select and toggle an Event Type in the legend to omit or include it within the chart.

When an Event Type is omitted, its icon will have a colored outline but no fill.

Example screenshot of data displayed in chart format with event markers occurring within the displayed time range

Link to New Relic dashboard

For environments that have New Relic enabled, a button labeled “View Application in New Relic” will be displayed in the upper right of the panel. Select the button to open the environment’s New Relic dashboard and investigate response times for individual URL routes in more depth. New Relic can also be used to identify anomalies such as slow queries, slow remote requests, or generally slow URL routes.

HTTP

Insights into the performance of responses to HTTP requests that are made to an environment. The data provides insights into the response times for the requests, and the types and amounts of HTTP response status codes that are returned by the origin servers and edge cache servers.

  • HTTP 2xx responses (e.g., 200, 201, 202, 203, 204) indicate that the request succeeded.
  • HTTP 3xx responses (e.g., 300, 301, 302, 303, 304) indicate that the request was redirected.
  • HTTP 4xx responses (e.g., 400, 401, 402, 403, 404) indicate an error on the client side. Most of the time, they do not indicate critical issues affecting a site. For example, if a user requests a page that does not exist, they will see a 404 Not Found error. Users who request access to an application behind IP restriction will see a 403 Forbidden status if their IP address is not allowed. In some cases, a pattern of 404 responses may indicate a missing URL that needs to be redirected.
  • HTTP 5xx responses (e.g., 500, 501, 502, 503, 504) indicate an error on the server side. For example, a 500 response could result if an application’s server encounters a PHP fatal error and can no longer respond.
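The grouping of status codes into classes described above can be sketched as follows. This is an illustration under stated assumptions: `status_class`, `count_by_class`, and the sample codes are hypothetical, and the dashboard's own aggregation is not documented here.

```python
from collections import Counter
from typing import Iterable


def status_class(code: int) -> str:
    """Map an HTTP status code to its class label, e.g. 404 -> '4xx'."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return f"{code // 100}xx"


def count_by_class(codes: Iterable[int]) -> Counter:
    """Count how many responses fall into each status class."""
    return Counter(status_class(code) for code in codes)


# Hypothetical sample of response codes, for illustration only.
sample = [200, 200, 301, 404, 403, 500, 503, 204]
counts = count_by_class(sample)
print(dict(counts))
```

Grouping by class rather than by individual code makes a rising share of 5xx responses easier to see.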

Possible causes for an increase in 5xx status responses from an application:

  • An unexpected strain on the servers, caused by recently deployed code.
  • A sudden, substantial increase in site traffic.
  • A large number of site requests that are bypassing the page cache. Uncached SQL queries can overload the primary database, resulting in 503 responses being returned.
  • Uncached WordPress functions.
  • Plugins that generate inefficient SQL queries or trigger SQL-intensive cron tasks that are not designed for a site that is running at enterprise scale.

HTTP Origin Response Time

The data in “HTTP Origin Response Time” indicates the duration of time it takes for the origin server to generate a response and send it back to the page cache. P50, P75, and P95 are percentile metrics that show how response times are distributed across the requests made to the same environment.

For example, P75 indicates that 75% of the requests on the environment had a response time lower than the P75 value, while the remaining 25% had a higher response time. This data represents the rate of change in the response times over time.
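To make the percentile idea concrete, the sketch below computes P50, P75, and P95 from a list of response times using the nearest-rank method. The method, the `percentile` helper, and the sample values are assumptions for illustration; the exact calculation used by the panel is not stated in this document.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range 1..100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical origin response times in milliseconds.
response_times_ms = [
    120, 95, 310, 150, 101, 480, 135, 2200, 160, 140,
    115, 98, 175, 260, 130, 145, 105, 190, 125, 900,
]

for p in (50, 75, 95):
    print(f"P{p}: {percentile(response_times_ms, p)} ms")
```

In this sample the two slowest requests have little effect on P50 but pull P95 up sharply, which is why the higher percentiles are useful for spotting slow outliers.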

HTTP Origin Response Codes

Data in “HTTP Origin Response Codes” indicates the types of HTTP response status codes with which the origin server responds to an HTTP request, and the rate at which they are returned.

HTTP Edge Response Codes

Data in “HTTP Edge Response Codes” indicates the types of HTTP response status codes with which the edge cache (CDN) responds to an HTTP request, and the rate at which they are returned.

Resource Usage

Insights into the quality of performance of an environment’s infrastructure-level resources.

PHP Process Status

WordPress only

The number of PHP processes that are currently active versus those that are idle. If the number of active processes is consistently higher than the number of idle processes, it may indicate that the application is under heavy load.

Database

All WordPress environments and qualifying Node.js environments

Insights into the size and health of an environment’s database, and the efficiency of the queries made to the database by the application code.

Database Queries by Type

The data in “Database Queries by Type” represents the rate of change in the number of database queries that are made by the PHP application. High query counts can indicate inefficient queries or a high number of requests being made to the application.

Database Slow Queries

The data in “Database Slow Queries” represents the rate of change in the number of database queries that require more than 750 ms to complete. Spikes (or high readings) can indicate inefficient queries or a high number of requests being made to the application.
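As a rough illustration of the 750 ms threshold, the sketch below filters a list of query durations down to those that count as slow. The `slow_queries` helper and the sample durations are hypothetical; only the threshold value comes from the text above.

```python
from typing import List, Sequence

# Threshold taken from the description above: queries that need
# more than 750 ms to complete are treated as slow.
SLOW_QUERY_THRESHOLD_MS = 750


def slow_queries(
    durations_ms: Sequence[float],
    threshold_ms: float = SLOW_QUERY_THRESHOLD_MS,
) -> List[float]:
    """Return the durations that are strictly greater than the threshold."""
    return [d for d in durations_ms if d > threshold_ms]


# Hypothetical query durations in milliseconds.
durations = [12, 750, 751, 40, 1830, 5, 902]
slow = slow_queries(durations)
print(len(slow), slow)
```

The comparison is strict, so a query that takes exactly 750 ms is not counted in this sketch.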

Underlying causes for slow queries might not be immediately apparent, and more in-depth debugging may be needed. To investigate a slow query:

Database Size

The value displayed in this tab represents the total size on disk of an environment’s database tables. This value does not include Elasticsearch indexes or binlogs (which are used for replication).

Larger database sizes can cause database sync and database backup processes to require a longer amount of time to complete.

Cache

Insights into an application’s utilization of the VIP infrastructure’s caching layers. It is important that a WordPress or Node.js application uses the caching layers as efficiently as possible, even during periods of normal traffic, so that it is more likely to scale smoothly and quickly during a high traffic event.

Page Cache Hit Rate

VIP’s page cache is the first layer of caching encountered by each request made to a WordPress or Node.js environment.

The data in “Page Cache Hit Rate” indicates the rate of change in the page cache hit rate over time for an environment. The page cache hit rate represents the percentage of attempts to fetch data from the page cache where the data is successfully retrieved.
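The hit rate described above is a simple ratio, sketched below. The `hit_rate_percent` helper and the hit and miss counts are hypothetical values chosen for illustration, not figures or an API exposed by the panel.

```python
from typing import Optional


def hit_rate_percent(hits: int, misses: int) -> Optional[float]:
    """Percentage of cache lookups that were served from the cache.

    Returns None when there were no lookups, since the rate is undefined.
    """
    if hits < 0 or misses < 0:
        raise ValueError("hits and misses must be non-negative")
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total


# Hypothetical counts: 940 lookups served from cache, 60 passed to origin.
print(hit_rate_percent(940, 60))
```

With these counts the result is 94.0, meaning the remaining 6% of lookups had to be answered by the origin servers.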

A low cache hit rate is likely to result in more requests being made to the origin server, which can negatively affect application performance.

Object Cache Hit Rate

All WordPress environments and qualifying Node.js environments

The WordPress object cache is the second layer of caching encountered by requests to a WordPress environment that pass through the page cache and are routed to the origin servers.

The data in “Object Cache Hit Rate” indicates the percentage of object cache requests where the requested data was successfully retrieved, without needing to retrieve it directly from the database.

The object cache panel in Query Monitor can be used to investigate and improve how efficiently an application uses the object cache.

Object Cache Queries by Type

All WordPress environments and qualifying Node.js environments

The data in “Object Cache Queries by Type” indicates the number of object cache queries that are being made by the PHP application, grouped by query type (i.e. Get, Set, Delete). High query counts can indicate inefficient queries or a high number of requests being made to the application.

Last updated: January 05, 2024

Relevant to

  • Node.js
  • WordPress