Designing a Dashboard System That Accelerates Performance

Performance Dashboard gives you visibility into the performance of the entire Google Cloud network, as well as the performance of your project's resources.

In the project performance view, Performance Dashboard shows packet loss and latency metrics (in summary charts and heatmap views) for zones where you have Compute Engine virtual machine (VM) instances. It provides current data, as well as metrics for the past six weeks. For example, suppose your project has a Virtual Private Cloud (VPC) network with VMs in zones A and B. In that case, Performance Dashboard provides packet loss and latency data between those two zones. To view project metrics, click View project performance at the top of the Performance Dashboard page. For more examples and details about what is measured, see Metrics.

In the Google Cloud performance view, Performance Dashboard shows packet loss and latency metrics across all of Google Cloud. These metrics can help you understand whether issues evident in the per-project dashboard are unique to your project. The global metrics can also help you plan future deployments. To view global metrics, click View performance for all of Google Cloud at the top of the Performance Dashboard page.

With these performance-monitoring capabilities, you can distinguish between a problem in your application and a problem in the underlying Google Cloud network. You can also debug historical network performance problems.

Performance Dashboard also exports data to Cloud Monitoring. You can use Monitoring to query the data and get access to additional information. For details, see Viewing monitoring metrics.
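
Because the data is exported to Cloud Monitoring, you can also query it programmatically. The following sketch uses the Cloud Monitoring Python client to list recent time series; the metric type ("networking.googleapis.com/vm_flow/rtt") and the project ID are assumptions for illustration, so check Viewing monitoring metrics for the metric types available to your project.

```python
# A minimal sketch of querying Performance Dashboard data that has been
# exported to Cloud Monitoring, using the google-cloud-monitoring client.
# The metric type and project ID below are assumptions for illustration;
# see "Viewing monitoring metrics" for the metric types in your project.
import time

from google.cloud import monitoring_v3

project_id = "my-project"  # hypothetical project ID
client = monitoring_v3.MetricServiceClient()

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "end_time": {"seconds": now},
        "start_time": {"seconds": now - 3600},  # the last hour
    }
)

results = client.list_time_series(
    request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type = "networking.googleapis.com/vm_flow/rtt"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    # Each series carries metric/resource labels (for example, zone
    # information) and a list of timestamped points.
    print(series.metric.labels, len(series.points))
```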

Google Cloud performance view

In this view, Performance Dashboard presents the overall Google Cloud metrics for each possible Google Cloud zone pair. It displays the current status of the Google Cloud network and lets you compare performance across all of Google Cloud to the performance observed in your projects.

The Google Cloud performance view shows time series data for up to 50 zone pairs for the selected time window, which by default is one hour.

You can view network performance for any Google Cloud zone pair, even if your project is not deployed in those zones. You can view the performance at both the region level and the zone level. A summary time series chart shows up to 50 zone pairs with the highest aggregated VM-to-VM packet loss or latency across all of Google Cloud.
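
As an illustration of that selection, the following sketch picks the 50 highest-value zone pairs from an invented mapping of zone pairs to aggregated latency. Performance Dashboard performs this ranking for you; the sketch only mirrors the idea.

```python
# Illustration of the "top 50 zone pairs" selection described above,
# applied to an invented mapping of zone pairs to an aggregated value
# (here, latency in milliseconds).
import heapq

aggregated_latency_ms = {
    ("us-central1-a", "europe-west1-b"): 104.2,
    ("us-central1-a", "us-central1-b"): 0.4,
    ("asia-east1-a", "us-west1-a"): 118.9,
    # ...one entry per zone pair
}

top_pairs = heapq.nlargest(50, aggregated_latency_ms.items(), key=lambda kv: kv[1])
for (zone_a, zone_b), latency in top_pairs:
    print(f"{zone_a} <-> {zone_b}: {latency} ms")
```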

To view the Google Cloud performance metrics in your project's Performance Dashboard, hold the pointer over specific zone pairs. To view the performance of the entire Google Cloud network, click View performance for all of Google Cloud.

Metrics

Performance Dashboard provides two kinds of metrics: packet loss and latency (round-trip time, or RTT). To get packet loss metrics for your project, you need a sufficient number of VMs in the project. To get latency metrics, you need a sufficient amount of traffic. Other than that, Performance Dashboard requires no setup.

The following sections describe both metrics in more detail.

Packet loss

Packet loss metrics show the results of active probing between the following:

  • VMs within a single VPC network.

  • VMs in peered VPC networks, when at least one of the networks is in your project. If the peered networks are in different projects, packet loss is visible in the destination project.

  • VMs in a Shared VPC network that is used by your project. Packet loss between two projects that use a Shared VPC network is visible in the destination service project.

For example, suppose project A includes two VPC networks: network A, which has VMs only in zone A, and network M, which has VMs only in zone M. If those two networks are peered, project A's Performance Dashboard shows packet loss data for the A/M zone pair. If the networks are not peered, Performance Dashboard does not show packet loss data for that zone pair.

On the other hand, suppose that these two networks are not in the same project. That is, suppose network A is part of project A, and network M is part of project M. When the networks are peered, project M's Performance Dashboard shows packet loss data for situations where zone M is the destination zone. Conversely, when zone A is the destination zone, the packet loss data is visible only to project A. If the networks are not peered, neither project's Performance Dashboard shows packet loss data for the zone pair.
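
The visibility rules above can also be restated in code. The following sketch is only an illustration of those rules; the function and parameter names are hypothetical.

```python
# Illustrative restatement of the packet loss visibility rules described
# above. The function and parameter names are hypothetical.
def packet_loss_visible(
    networks_peered: bool,
    source_project: str,
    destination_project: str,
    viewing_project: str,
) -> bool:
    """Return True if the viewing project's dashboard shows the data."""
    if not networks_peered:
        # Without peering, no packet loss data is shown for the zone pair.
        return False
    if source_project == destination_project:
        # Both networks in one project: that project sees the data.
        return viewing_project == source_project
    # Peered networks in different projects: the data is visible in the
    # destination project.
    return viewing_project == destination_project

# Example from the text: networks in project A and project M are peered,
# and traffic is destined for zone M (project M).
assert packet_loss_visible(True, "project-A", "project-M", "project-M")
assert not packet_loss_visible(True, "project-A", "project-M", "project-A")
```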

The data gathered through all probes is aggregated in Performance Dashboard. That is, Performance Dashboard does not let you isolate data about intra-project packet loss versus other types (such as packet loss related to a peered VPC network in another project). However, you can use Monitoring to drill down and see results that are more granular. For details, see Viewing monitoring metrics.

Performance Dashboard does not send probes over Cloud VPN connections.

Methodology

Performance Dashboard runs workers on the physical hosts that house your VMs. These workers send and receive probe packets that travel over the same network as your traffic. Because the workers run on the physical hosts and not on your VMs, they do not consume VM resources, and the probe traffic is not visible on your VMs.

The probes cover the entire mesh of VMs that can communicate with each other, which is not necessarily the same as your traffic pattern. Therefore, you might see indications of packet loss in Performance Dashboard, but no evidence of packet loss in your application.

For all probed VMs, Performance Dashboard tries to reach each VM by using both its internal IP address and its external IP address (if one exists). The probes do not leave Google Cloud, but by using external IP addresses, Performance Dashboard can cover part of the path that would be used by external traffic, such as traffic coming from the internet.

Packet loss for internal IP addresses is measured by using UDP packets, and packet loss for external IP addresses is measured by using TCP packets.

Metric availability and confidence levels

Performance Dashboard probes a subset of all VM-VM pairs in the network. The data gathered is then used to estimate the packet loss that you might experience. Google's confidence in the data depends on the probing rate, and the probing rate depends on the number of VMs that you have in each zone, as well as the number of zones where you have VMs deployed. For example, having 10 VMs in two zones generates more confidence than having 10 VMs in 10 zones.

All VMs, including those created by Google Kubernetes Engine (GKE), count toward the total number of VMs.

The varying levels of confidence are described in the following list. Lower levels of confidence are flagged in the heatmap with an asterisk or N/A.

  • 95% confidence: requires 10 VMs x the number of zones in the project. For example, if you have 12 zones in your project, you must have 120 VMs in each zone. Shown on the heatmap as a measurement without any additional notations.

  • 90% confidence: requires 2.5 VMs x the number of zones in the project. For example, if you have 12 zones in your project, you must have 30 VMs in each zone. Shown on the heatmap as a measurement without any additional notations.

  • Low confidence: shown on the heatmap as a measurement with an asterisk.

  • Not enough probes to have meaningful data: shown on the heatmap as N/A.

The Google Cloud packet loss metrics are always available. The per-project packet loss metrics are available only if there are more than 200 probes per minute for the zone pair. An asterisk (*) is displayed if there are 200 to 400 probes per minute.
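
The following sketch restates the thresholds from the preceding list and the probe-rate rule as simple checks. The function names are hypothetical; the thresholds come from this section.

```python
# A sketch that applies the confidence thresholds from the preceding list
# and the probe-rate rule above. The function names are hypothetical.
def confidence_level(vms_in_zone: int, zones_in_project: int) -> str:
    """Classify confidence for a zone based on its VM count."""
    if vms_in_zone >= 10 * zones_in_project:
        return "95% confidence"
    if vms_in_zone >= 2.5 * zones_in_project:
        return "90% confidence"
    return "low confidence or N/A"

def packet_loss_display(probes_per_minute: int) -> str:
    """Describe how per-project packet loss appears for a zone pair."""
    if probes_per_minute <= 200:
        return "not available"
    if probes_per_minute <= 400:
        return "shown with an asterisk (*)"
    return "shown"

print(confidence_level(vms_in_zone=30, zones_in_project=12))  # 90% confidence
print(packet_loss_display(probes_per_minute=350))             # shown with an asterisk (*)
```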

Latency

Latency metrics are measured by using actual customer traffic between the following:

  • VMs within a single VPC network.

  • VMs in peered VPC networks, if both networks are in the same project.

Additionally, the Performance Dashboard for a service project that uses a Shared VPC network shows data only for the zones within that service project. That is, suppose a VM in zone A in service project A uses the host project's Shared VPC network to communicate with a VM in zone B in service project B. Measurements of that traffic are not available to either service project or to the host project.

Performance Dashboard does not show latency data for the following:

  • Traffic between peered VPC networks, if the networks are in different projects.

  • Traffic sent through Cloud VPN connections.

Methodology

Latency is measured by using TCP packets.

Based on a sample of your actual traffic, latency is calculated as the time that elapses between sending a TCP sequence number (SEQ) and receiving the corresponding ACK. This interval includes the network RTT and TCP stack-related delay. The UI shows latency as the median of all relevant measurements.
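
The following sketch illustrates that calculation on invented timestamp samples: it computes the SEQ-to-ACK elapsed time for each sampled segment and reports the median, mirroring what the UI shows.

```python
# Illustration of the latency calculation described above: the elapsed
# time between sending a TCP SEQ and receiving the corresponding ACK,
# reported as the median over all sampled measurements. The timestamps
# are invented; Performance Dashboard derives them from real traffic.
from statistics import median

# (seq_sent_at, ack_received_at) in seconds, one tuple per sampled segment
samples = [
    (100.000, 100.032),
    (101.500, 101.529),
    (102.250, 102.301),
]

rtts_ms = [(ack - seq) * 1000.0 for seq, ack in samples]
print(f"median latency: {median(rtts_ms):.1f} ms")
```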

The latency metric is based on the same data source and sampling methodology as VPC Flow Logs.

Metric availability

The Google Cloud latency metric is always available. The per-project latency metric is available only if TCP traffic is around 1000 packets per minute or higher.

Metrics summary table

The following list summarizes the probing methods and protocols used for reporting packet loss and latency metrics.

  • Probing method: packet loss uses active probing (synthetic VM traffic); latency uses passive probing (actual VM traffic).

  • Protocol: packet loss uses UDP (internal IP addresses) and TCP (external IP addresses); latency uses TCP (internal and external IP addresses).

Although the preceding list references external IP addresses, Performance Dashboard shows data only for internal traffic. To find aggregated data about traffic that your VMs receive from external sources, you can use Cloud Monitoring.

Permissions

To access Performance Dashboard data, either through the Cloud Console or through Monitoring, you must have the monitoring.timeSeries.list permission. This permission is included in the following Monitoring roles.

  • Monitoring Viewer: roles/monitoring.viewer

  • Monitoring Editor: roles/monitoring.editor

  • Monitoring Admin: roles/monitoring.admin

For information about other roles that include the monitoring.timeSeries.list permission, see Understanding roles.
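
If you want to verify programmatically that your credentials include this permission, one option is the Resource Manager testIamPermissions method, sketched below with a hypothetical project ID.

```python
# A sketch that checks whether the caller holds monitoring.timeSeries.list
# on a project, using the Resource Manager testIamPermissions method.
# The project ID is hypothetical.
from google.cloud import resourcemanager_v3
from google.iam.v1 import iam_policy_pb2

project_id = "my-project"  # hypothetical project ID
client = resourcemanager_v3.ProjectsClient()

response = client.test_iam_permissions(
    request=iam_policy_pb2.TestIamPermissionsRequest(
        resource=f"projects/{project_id}",
        permissions=["monitoring.timeSeries.list"],
    )
)

if "monitoring.timeSeries.list" in response.permissions:
    print("Caller can read Performance Dashboard data.")
else:
    print("Caller is missing monitoring.timeSeries.list.")
```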

Data visualization when viewing historical data

When you view data for a time period of one day or more, the chart provides additional data in a lighter color (a halo) around the primary data. The lighter color that surrounds the line shows the range of values, from the lowest to the highest, that were aggregated to draw the primary line.

Figure: Aggregated historical data.
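
The following sketch shows one way such a halo can be derived: fine-grained samples are grouped into coarser buckets, an aggregate draws the primary line, and the per-bucket minimum and maximum bound the lighter band. The sample data, bucket size, and use of the mean as the aggregate are assumptions for illustration.

```python
# Sketch of the aggregation behind the halo: fine-grained samples are
# grouped into coarser buckets; an aggregate (here, the mean, as an
# assumption) draws the primary line, and the per-bucket minimum and
# maximum bound the lighter band. The data and bucket size are invented.
import pandas as pd

samples = pd.Series(
    [31.0, 29.5, 35.2, 30.1, 28.7, 33.9],
    index=pd.date_range("2024-01-01", periods=6, freq="10min"),
)

buckets = samples.resample("30min")
summary = pd.DataFrame(
    {
        "line": buckets.mean(),  # the primary line
        "low": buckets.min(),    # lower edge of the halo
        "high": buckets.max(),   # upper edge of the halo
    }
)
print(summary)
```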

What's next

  • View Performance Dashboard use cases

  • View Performance Dashboard metrics in the Cloud Console

  • View Performance Dashboard metrics in Cloud Monitoring

  • Troubleshoot Performance Dashboard

  • Learn more about packet sampling

Source: https://cloud.google.com/network-intelligence-center/docs/performance-dashboard/concepts/overview