Monitoring - We have received confirmation that the latency issue has been resolved and are leaving this incident in monitoring status for 24 hours. We have requested a root cause analysis from HPE and are awaiting confirmation of all details related to this event. We will post another update within 24 hours with our plan of action.
Jun 06, 2023 - 16:31 CDT
Update - All datastores on Nimble Storage 2 have returned to pre-issue latency levels. We are awaiting confirmation of the root cause as well as confirmation that the issue will not recur.
Jun 06, 2023 - 16:00 CDT
Update - We are currently detecting another spike and are still working with the datacenter engineers and the vendor (HPE) on a resolution.
Jun 06, 2023 - 15:41 CDT
Update - Latency has been stable since the last spike. Datacenter engineering and the vendor (HPE) are working to resolve the issue with Nimble SAN 2. The majority of VMs should be unaffected; however, we are noticing that if a VM froze during a storage latency spike, its CPU utilization rises while it catches up on the missed I/O.
Jun 06, 2023 - 15:27 CDT
Update - Datacenter engineering has escalated the issue to the vendor's (HPE) engineering team, and both teams are actively working on a resolution.
Jun 06, 2023 - 15:03 CDT
Update - Latency spikes continue to subside, and utilization is back to normal as of right now. The spikes appear to be recurring at roughly one-hour intervals. We are continuing to work with the datacenter engineers to resolve this issue; our indicators point to a read cache problem on NIMBLE-SAN 2 causing elevated read latency.
Jun 06, 2023 - 14:57 CDT
Update - One of the five datastores on the second Nimble array is still showing higher-than-normal latency spikes. We are working with the storage engineers at our datacenter to determine the root cause and resolve the issue.
Jun 06, 2023 - 14:30 CDT
Update - Latency is once again subsiding; over the last 15 minutes it has stabilized to normal levels, and VMs may take a few moments to catch up. We are migrating workloads to a different Nimble array, but doing so carefully so as not to add to the latency problem.
Jun 06, 2023 - 12:53 CDT
Identified - We are seeing increased storage latency in a portion of our PHX datacenter. The latency has subsided, and we are migrating VMs, which is taking longer than usual due to the latency. VMs may experience higher-than-normal queue depths, which will present as slow or degraded performance in most cases. We will update you as we get more information.
Jun 06, 2023 - 12:28 CDT