EMEA instances experiencing "400 Bad Request" errors when accessing Admin Console or the Self-service portal
Incident Report for PrinterCloud
Postmortem

EMEA instances intermittently experiencing "400 Bad Request" errors when accessing Admin Console

Introduction:

On August 8, 2024, between 12:06 and 13:27 UTC, some EMEA customers experienced “400 Bad Request” errors when performing any action that required a database connection.

Issue Summary:

Connections were held up by a portion of code that degraded database performance under heavy load. A slow query caused requests to back up, and the accumulated requests overwhelmed the database.
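
To illustrate how a single slow query can back up requests, the following minimal sketch applies Little's Law (average connections in use = arrival rate × time each request holds a connection). The traffic figures, query durations, and pool size are hypothetical and are not taken from this incident.

```python
def connections_in_use(requests_per_second: int, query_ms: int) -> float:
    """Little's Law: average concurrency = arrival rate * time in system."""
    return requests_per_second * query_ms / 1000


def pool_is_saturated(requests_per_second: int, query_ms: int, pool_size: int) -> bool:
    """True when average demand meets or exceeds the available connections."""
    return connections_in_use(requests_per_second, query_ms) >= pool_size


# Hypothetical values, chosen only for illustration.
POOL_SIZE = 100
REQUESTS_PER_SECOND = 200

# A fast query leaves most of the pool idle.
fast_demand = connections_in_use(REQUESTS_PER_SECOND, query_ms=50)

# The same traffic with a slow query needs more connections than exist,
# so new requests queue up behind the ones already waiting.
slow_demand = connections_in_use(REQUESTS_PER_SECOND, query_ms=1000)

print(f"fast query: {fast_demand:.0f} of {POOL_SIZE} connections in use")
print(f"slow query: {slow_demand:.0f} connections needed, pool has {POOL_SIZE}")
```

Under these assumed numbers, the request rate does not change at all; only the query duration does, and that alone is enough to exhaust the pool.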

Resolution: 

We rolled back to a previous version of the code. Going forward, changes will receive closer scrutiny to ensure that the code is optimized and performs well under high volumes of requests.

Root Cause: 

A stored procedure that was thought to be safe performed poorly under high volumes of requests.

Solution and Mitigation:

We rolled back the changes to restore normal service for customers. We are looking into better ways to monitor changes and to be notified more quickly when abnormal errors start occurring. The rolled-back changes will be reviewed and modified to ensure that requests are handled smoothly. Lastly, we are implementing additional process controls that simulate production behavior under high traffic volume to verify that our queries perform well.
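
As one possible shape for faster notification of abnormal errors, the sketch below tracks the error rate over a sliding window of recent requests and signals when it crosses a threshold. The class name, window size, and threshold are hypothetical; this is an illustration of the general idea, not a description of the monitoring we actually run.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the share of failed requests over the most recent `window` requests."""

    def __init__(self, window: int = 100, threshold: float = 0.05) -> None:
        # deque(maxlen=...) discards the oldest outcome automatically.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, status_code: int) -> None:
        """Store True for an error response (4xx or 5xx), False otherwise."""
        self.outcomes.append(status_code >= 400)

    def error_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self) -> bool:
        """Alert only once the window is full, to avoid noise at startup."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.error_rate() > self.threshold


# Hypothetical usage: healthy traffic followed by a burst of 400 responses.
monitor = ErrorRateMonitor(window=10, threshold=0.2)
for status in [200] * 10 + [400] * 3:
    monitor.record(status)
print("alert" if monitor.should_alert() else "ok")
```

A window measured in requests keeps the example short; a time-based window would be an equally reasonable choice.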

Conclusion:

We recognize that this incident specifically impacted our customers in EMEA during their normal working hours, as well as our customers that operate 24/7. We are committed to finding rollout strategies that take all customers’ working hours into account. We are also committed to investigating our architecture to prevent smaller services from locking up core functionality. We thank you all for your patience, and we are eager to proactively improve our systems to continue to help all on their digital transformation journey.
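
One common way to keep a smaller service from locking up core functionality is a bulkhead: a cap on how many shared resources that service may hold at once. The sketch below is a minimal illustration using a semaphore. The class name and limit are hypothetical, and it does not describe the architecture changes under investigation.

```python
import threading
from contextlib import contextmanager


class Bulkhead:
    """Caps concurrent work for one service so it cannot starve the others."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self):
        # Fail fast instead of waiting, so callers never pile up behind a slow service.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full: request rejected")
        try:
            yield
        finally:
            self._slots.release()


# Hypothetical usage: a secondary service limited to 2 concurrent database calls.
secondary_service = Bulkhead(max_concurrent=2)
with secondary_service.slot():
    pass  # run the database call here
```

Rejecting excess requests immediately trades some failures in the smaller service for continued availability of everything else.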

Posted Aug 13, 2024 - 09:54 MDT

Resolved
This incident has been resolved.
Posted Aug 08, 2024 - 08:47 MDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Aug 08, 2024 - 07:33 MDT
Investigating
We are currently investigating this issue.
Posted Aug 08, 2024 - 06:45 MDT
This incident affected: PrinterLogic | SaaS Frankfurt (PrinterLogic | SaaS Frankfurt).