The SIOS PERC Dashboard™ is designed to give the user a single point of view into their environment. It consists of the PERC (Performance, Efficiency, Reliability, Capacity) Issues section and the PERC Matrix section. The dashboard gives a high-level snapshot of performance, resource utilization, service levels, and availability across compute, storage, and network. It can also serve as a starting point for diagnosing any problem or anomaly in the infrastructure. The user can drill down into individual components and panels within the dashboard for a more detailed view, and can change the time selection filter to show issues, statistics, and trends for the selected time period.

The PERC Dashboard displays information at the aggregate level and highlights specific problems or warnings in VM(s), host(s), and network or storage components. Leveraging the unique machine learning algorithm for continuous optimization, it captures current states, displays historical data, forecasts key trends, and helps diagnose and identify specific problems in the infrastructure.

PERC Overview Section

Performance

  • Issues In Progress
    • Critical – Displays the total count of all the Critical level Performance issues that are in progress for the time period selected.
    • Warning – Displays the total count of all the Warning level Performance issues that are in progress for the time period selected.
    • Information – Displays the total count of all the Informational level Performance issues that are in progress for the time period selected.
  • All Performance Issues – The All Performance Issues chart shows the trend of all issues which started in the current or previous time period, and which were in progress during the time period selected. The Total Issues number shows the total number of issues that were in progress during the time period selected. The percent change number shows how much the total number of issues has changed from the previous time period.

Efficiency

  • Issues In Progress
    • Critical
    • Warning
    • Information – This is the cumulative number of Efficiency > Idle VMs and VM Snapshots. When it is selected it will drill down to the PERC Issue List.
  • Example: The number of Storage Acceleration Candidates for the Environment based on the time filter selected. When it is selected it will drill down to the Storage Acceleration Dashboard.
  • All Efficiency Issues – The All Efficiency Issues chart shows the trend of all issues which started in the current or previous time period, and which were in progress during the time period selected. The Total Issues number shows the total number of issues that were in progress during the time period selected. The percent change number shows how much the total number of issues has changed from the previous time period.

Reliability

  • Issues In Progress
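    • Critical – Displays the total count of all the Critical level Reliability issues that are in progress for the time period selected.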
    • Warning – Displays the total count of all the Warning level Reliability issues that are in progress for the time period selected.
    • Information – Displays the total count of all the Informational level Reliability issues that are in progress for the time period selected.
  • All Reliability Issues – The All Reliability Issues chart shows the trend of all issues which started in the current or previous time period, and which were in progress during the time period selected. The Total Issues number shows the total number of issues that were in progress during the time period selected. The percent change number shows how much the total number of issues has changed from the previous time period.

Capacity

  • Issues In Progress
    • Critical – Displays the total count of all the Critical level Capacity issues that are in progress for the time period selected.
    • Warning – Displays the total count of all the Warning level Capacity issues that are in progress for the time period selected.
    • Information – Displays the total count of all the Informational level Capacity issues that are in progress for the time period selected.
  • All Capacity Issues – The All Capacity Issues chart shows the trend of all issues which started in the current or previous time period, and which were in progress during the time period selected. The Total Issues number shows the total number of issues that were in progress during the time period selected. The percent change number shows how much the total number of issues has changed from the previous time period.
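
The Total Issues and percent change figures described for each PERC category can be illustrated with a minimal Python sketch. The function names, the (started, ended) tuple layout, and the handling of a zero previous count are assumptions made for illustration; they are not taken from the product.

```python
from datetime import datetime


def count_in_progress(issues, period_start, period_end):
    """Count issues whose lifetime overlaps [period_start, period_end).

    Each issue is a hypothetical (started, ended) tuple; ended is None
    while the issue is still open.
    """
    return sum(
        1
        for started, ended in issues
        if started < period_end and (ended is None or ended > period_start)
    )


def percent_change(current, previous):
    """Percent change versus the previous period.

    Returns None when the previous value is 0, because the ratio is
    undefined (an assumption; the dashboard may display this differently).
    """
    if previous == 0:
        return None
    return 100.0 * (current - previous) / previous
```

Under these assumptions, an issue that started in the previous period and is still open counts toward the selected period, which matches the "started in the current or previous time period" wording above.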

PERC Matrix Section

Display by Environment

Users can select all Environments or a specific Environment to display.

Display Time Range

Users can select from the following time ranges to display.

  • Last 24 hours
  • Last 7 days
  • Last 30 days

The data displayed is derived from the real-time stats collected. The data is then sub-sampled/averaged out to display the charts as well as derive any trends and metrics.
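
As a rough illustration of the sub-sampling/averaging step, the sketch below reduces a series of raw samples to one averaged point per bucket. The function name and the fixed bucket size are hypothetical; the actual interval the dashboard uses for each time range is not stated here.

```python
def downsample_mean(samples, bucket_size):
    """Average consecutive samples in fixed-size buckets.

    The last bucket may be shorter than bucket_size; it is averaged over
    the samples it actually contains.
    """
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    averaged = []
    for i in range(0, len(samples), bucket_size):
        bucket = samples[i:i + bucket_size]
        averaged.append(sum(bucket) / len(bucket))
    return averaged
```

A longer time range would typically use a larger bucket so that the chart keeps a similar number of points.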

Dashboard Components

The PERC Dashboard has four columns (Performance, Efficiency, Reliability, Capacity), each of which is divided into three sections (Compute, Storage, Network), as shown in the diagram below.

In general, the columns are designed to indicate data for the following:

  • Performance – This column contains information such as utilization rates, latency, I/O, and throughput. Most of the data is averaged out across the specific filter level set by the user.
  • Efficiency – This section shows waste, density, and other utilizations that help indicate the overall efficiency of the system. This is also aggregated at the specific filter level set by the user.
  • Reliability – This section contains information around service levels and availability. The data is aggregated across the specific filter level set by the user.
  • Capacity – This is the aggregate capacity (compute, storage, network) at the specific filter level set by the user.

These sections contain panels which consist of trend lines and numbers which outline specific data and trends (as shown below).

A user can drill down on the panels for a more detailed view including graphs and other useful information.

Each section is described in more detail below.

Compute Section

Compute Performance

  • Host CPU Utilization – The Host CPU Utilization chart shows the average CPU utilization across the environments selected. The Variance shows the variance of CPU utilization averaged across the hosts in the environments selected. The percent change shows how much the CPU utilization variance has changed from the previous time period.

  • CPU Ready Time – The CPU Ready Time chart shows the average number of milliseconds (ms) the VMs were ready but were not scheduled to run across the environments selected. The ms number shows the average time the VMs were ready. The percent change shows how much the average ready time changed from the previous time period.

  • Host Memory Utilization – The Host Memory Utilization chart shows the average host memory utilization across the environments selected. The Variance number shows the variance of memory utilization averaged across the hosts in the environments selected. The percent change shows how much the memory utilization variance has changed from the previous time period.

  • Memory Ballooning – The Memory Ballooning chart shows how much virtual memory (RAM) is being reclaimed from the VMs to address over-allocation across the environments selected. The MB number shows the average amount of reclaimed virtual memory. The percent change shows how much the average reclaimed virtual memory changed from the previous time period.

  • Memory Swapping – The Memory Swapping chart shows how much physical host memory (RAM) is being overutilized across the environments selected. The MB number shows the average physical host memory overutilization. The percent change shows how much the average physical host memory overutilization changed from the previous time period.
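
The average and Variance figures for host CPU and memory utilization can be sketched as follows. This is one possible reading, assuming the variance is taken across per-host averages and is a population variance; the function name and input layout are hypothetical.

```python
from statistics import mean, pvariance


def utilization_summary(per_host_samples):
    """Return (average utilization, variance across hosts).

    per_host_samples maps a host name to its utilization samples (percent)
    for the selected time period. Each host is first reduced to its own
    average; the variance is then computed across those host averages.
    """
    host_means = [mean(samples) for samples in per_host_samples.values()]
    return mean(host_means), pvariance(host_means)
```

A low variance suggests load is spread evenly across hosts, while a high variance points to some hosts carrying much more load than others.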

Compute Efficiency

  • Waste – Compute Cost – The Waste – Compute Cost chart shows the potential monthly savings that can be achieved by reducing compute waste across the environments selected. The savings is calculated based on the time period selected and the Cost policy (See Manage > Policies: Cost). The Cost number shows the current potential savings that can be achieved. The percent change shows how much the potential savings changed from the previous time period.

  • Waste – vCPUs – The Waste – vCPUs chart shows the number of idle virtual CPUs that can be reclaimed across the environments selected. The idle virtual CPUs are identified based on the time period selected, the Idle VMs Policy (See Manage > Policies: Idle VMs), and the Oversized VMs Policy (See Manage > Policies: Oversized VMs). The vCPUs number shows how many idle virtual CPUs can be reclaimed. The percent change shows how much the number of idle virtual CPUs changed from the previous time period.

  • Waste – vMemory – The Waste – vMemory chart shows the amount of idle virtual memory that can be reclaimed across the environments selected. The idle memory is identified based on the time period selected, the Idle VMs Policy (See Manage > Policies: Idle VMs), and the Oversized VMs Policy (See Manage > Policies: Oversized VMs). The GB number shows the current amount of idle virtual memory that can be reclaimed. The percent change shows how much the amount of idle virtual memory changed from the previous time period.

  • Avg VMs per Host – The Avg VMs per Host chart shows the average number of virtual machines per host across the environments selected. The Avg VMs number shows the average number of virtual machines per host. The percent change shows how much the average number of virtual machines per host has changed from the previous time period.

Compute Reliability

  • Host Uptime – The Host Uptime chart shows the percentage of time that the hosts were up across the environments selected. The Uptime (%) number shows the average uptime. The percent change shows how much the average uptime changed from the previous time period.

  • Live Migrations – The Live Migrations chart shows the number of live migrations that occurred across the environments selected. The Total Number shows the total number of live migrations that occurred in the time period selected. The percent change shows how much the total number of live migrations changed from the previous time period.

  • Host Failure Tolerance – The Host Failure Tolerance chart shows the total number of hosts, across all clusters in the indicated environment(s), that can fail and still leave those clusters with enough resources to maintain their current VM workloads. The Total Number shows the total number of host failures that can be sustained across all clusters in the indicated environment(s) in the time period selected. The percentage change shows how much the total host failure tolerance changed from the previous time period.
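
To make the Host Failure Tolerance idea concrete, here is a simplified sketch that considers a single resource dimension and assumes the largest hosts fail first (a worst-case assumption). The function names and inputs are hypothetical, and the product's actual calculation may take more factors into account.

```python
def cluster_failure_tolerance(host_capacities, current_load):
    """Number of hosts that can fail while the rest still cover the load.

    host_capacities: capacity per host in one resource unit (e.g. GHz).
    current_load: total demand of the cluster's VMs in the same unit.
    """
    remaining = sorted(host_capacities)  # ascending, so pop() removes the largest
    tolerated = 0
    while remaining:
        remaining.pop()  # simulate losing the largest remaining host
        if remaining and sum(remaining) >= current_load:
            tolerated += 1
        else:
            break
    return tolerated


def total_failure_tolerance(clusters):
    """Sum the tolerance over (host_capacities, current_load) pairs."""
    return sum(cluster_failure_tolerance(caps, load) for caps, load in clusters)
```

For example, a cluster of four equal hosts running at 45% of total capacity can lose two hosts under this model, since the remaining two still provide 50% of the original capacity.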

Compute Capacity – This section summarizes the aggregate compute capacity for the currently filtered level.

  • Total CPU Utilization – The Total CPU Utilization chart shows the amount of CPU (GHz) utilized across the environments selected. The Percent number shows the current percentage of CPU utilized. The percent change shows how much the percentage of CPU utilized changed from the previous time period.

  • Total Memory Utilization – The Total Memory Utilization chart shows the amount of physical memory (GB) utilized across the environments selected. The Percent number shows the current percentage of physical memory utilized. The percent change shows how much the percentage of physical memory utilized changed from the previous time period.

Storage Section

Storage Performance

  • Average Throughput – The Average Throughput chart shows the average storage throughput (KBps) achieved across the environments selected. The KBps number shows the average storage throughput. The percent change shows how much the average storage throughput changed from the previous time period.

  • Average IOPS – The Average IOPS chart shows the average storage IOPS across the environments selected. The IOPS number shows the average storage IOPS (I/O Operations Per Second). The percent change shows how much the average storage IOPS changed from the previous time period.

  • Average Latency – The Average Latency chart shows the average storage latency across the environments selected. The ms number shows the average storage latency. The percent change shows how much the average storage latency changed from the previous time period.

Storage Efficiency

  • Waste – Storage Cost – The Waste – Storage Cost chart shows the potential monthly savings that can be achieved by reducing storage waste across the environments selected. The savings is calculated based on the Cost policy (See Manage > Policies: Cost). The Cost number shows the current potential savings that can be achieved. The percent change shows how much the potential savings changed from the previous time period.

  • Waste – Storage Space – The Waste – Storage Space chart shows the amount of storage that can be reclaimed by deleting unused snapshots across the environments selected. The GB number shows the current amount of storage that can be reclaimed. The percent change shows how much the amount of storage that can be reclaimed changed from the previous time period.

  • Storage Acceleration Candidates – The Storage Acceleration Candidates chart shows the number of VMs identified as storage acceleration candidates across the environments selected. The VMs were identified based on the time period selected and the Storage Acceleration policy (See Manage > Policies: Storage Acceleration). The VMs number shows the current number of VM candidates identified. The percent change shows how much the number of candidates changed from the previous time period.

Storage Reliability

Storage Capacity

  • Total Storage Utilization – The Total Storage Utilization chart shows the amount of storage (GB) utilized across the environments selected. The Percent number shows the current percentage of storage utilized. The percent change shows how much the percentage of storage utilized changed from the previous time period.

  • # of Days To Out Of Capacity – The # of Days To Out Of Capacity chart shows the minimum number of days in which the storage capacity threshold is forecast to be reached across all datastores for the environments selected. The # of Days value shows the most recent minimum forecast for storage capacity. The percent change shows how much the minimum forecast changed from the previous time period.
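
The "minimum forecast across all datastores" can be illustrated with a simple linear extrapolation. This sketch swaps in an ordinary least-squares trend line in place of the product's forecasting method, which is not described here; the 90% threshold, function names, and input layout are all assumptions.

```python
def days_until_threshold(daily_used_gb, capacity_gb, threshold=0.9):
    """Estimate days until usage reaches threshold * capacity.

    daily_used_gb: one usage sample per day, oldest first.
    Returns 0.0 if already at the threshold, or None when there is too
    little data or usage is not growing.
    """
    n = len(daily_used_gb)
    if n < 2:
        return None
    limit = capacity_gb * threshold
    latest = daily_used_gb[-1]
    if latest >= limit:
        return 0.0
    # Least-squares slope of usage (GB) against day index.
    x_mean = (n - 1) / 2
    y_mean = sum(daily_used_gb) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(daily_used_gb))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None
    return (limit - latest) / slope


def min_days_to_capacity(datastores):
    """Smallest forecast over (daily_used_gb, capacity_gb) pairs, or None."""
    forecasts = [days_until_threshold(used, cap) for used, cap in datastores]
    forecasts = [f for f in forecasts if f is not None]
    return min(forecasts) if forecasts else None
```

Taking the minimum means the panel reflects the datastore that is closest to running out, even when most datastores have ample headroom.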

Network Section

Network Performance

  • Average Throughput – The Average Throughput chart shows the average network throughput (KBps) across the environments selected. The KBps number shows the average network throughput. The percent change shows how much the average network throughput changed from the previous time period.

Network Efficiency

Network Reliability

  • Dropped Packets – The Dropped Packets chart shows the number of dropped network packets across the environments selected. The Total Number shows the total number of dropped packets that occurred at the host level in the time period selected. The percent change shows how much the total number of dropped packets changed from the previous time period.

Network Capacity

SIOS PERC Dashboard™
