Introduction:
On July 8, 2024, between 8:00 and 11:00 AM MT, a subset of US customers intermittently experienced errors when accessing the Admin console or the self-service portal.
Issue Summary:
Customers affected during the outage were unable to process user login requests.
Resolution:
The slow query responsible for the latency was redirected to another node that was not experiencing latency, and the affected services were restarted.
Root Cause:
A slow query introduced latency into the application.
Solution and Mitigation:
Engineering teams identified a slow query that was causing latency in the application. The query was routed to a node that was able to serve requests without latency. Additional tuning is being performed to ensure optimal performance going forward.
Conclusion:
We recognize the disruption to our services that customers in the US region experienced during normal working hours. Ensuring stability, full functionality, and industry-leading uptime remains our top priority.