Storage Performance for PHX Cluster 101

Incident Report for Adeptcore, Inc.

Resolved

We have continued monitoring the latency and performance of the storage array and have not identified any further ongoing issues since Saturday afternoon.

Latency normalized around 2:45 PM on Saturday and has remained normal for more than forty-four hours since. HPE identified a software bug as the cause of Saturday's latency spike and applied a patch to remediate the issue.

At this time, we consider the issue to be resolved and will be closing out this incident.

If you require assistance with any of your tenant environments, please contact us via email or phone and we will be happy to assist.

Posted Oct 14, 2024 - 11:15 CDT

Update

As stated in our previous update, the spike that just occurred might have caused some web request timeouts, but operation has returned to normal at this time. Latency numbers are currently back at normal (pre-incident) levels.

Adeptcloud staff is still going through all virtual environments and infrastructure to make sure customers are able to log in. We are still awaiting the next update from datacenter storage engineering.

Posted Oct 12, 2024 - 14:37 CDT

Update

We have just detected a large spike in latency following the last 1.5 hours of normalized latency and are continuing to monitor the issue. We believe this spike is temporary and related to the bug fix being applied by HPE and datacenter storage engineers.

Posted Oct 12, 2024 - 14:28 CDT

Update

We have received an update from datacenter storage engineers that HPE has determined the storage latency issue is caused by a software bug and is working on a potential fix at this time. No ETA has been provided, but we are seeing latency numbers normalized to about 10-15% of baseline.

We are continuing to monitor and work with the datacenter on getting this resolved.

The majority of virtual machines have returned to normal operation. Adeptcloud staff is rebooting some of the VMs that are still experiencing issues from the earlier latency and testing the login process.

Posted Oct 12, 2024 - 13:33 CDT

Update

Latency for the affected storage array has dropped significantly over the last 30 minutes, and we are seeing all virtual machines returning to operational status.

Adeptcloud staff is confirming functionality at this time and continuing to monitor the situation. While we do not have the all-clear from datacenter storage engineers yet, they have been offloading workloads from the affected array.

We may still experience intermittent latency until the exact issue is confirmed, so we are leaving the degraded performance status on our status page.

Posted Oct 12, 2024 - 13:12 CDT

Monitoring

Datacenter storage engineers have engaged with HPE regarding the latency in PHX Cluster 101 and will provide updates as they become available.

We are continuing to monitor the situation and have seen latency slowly returning to normal.

We are aware of virtual machines still experiencing intermittent accessibility and performance issues.

Posted Oct 12, 2024 - 10:28 CDT

Identified

Since early this morning, we have been receiving alerts of increased latency on one of the arrays, specifically PHX Storage Array 101, as well as reports of some virtual machines taking a long time to reboot.

We are attempting to move workloads off the affected storage array to alleviate any issues. This process might take longer than usual due to the high read latency.

In the last hour, latency has decreased by approximately 20%, and we are continuing to monitor it while working on resolving the problem.

We have engaged with the datacenter storage engineers to investigate the issue.

Posted Oct 12, 2024 - 07:22 CDT

This incident affected: Adeptcloud Infrastructure (ACP - Storage).