AWS CloudWatch + yCrash = Monitoring + RCA

On October 11, 2021, GCeasy experienced an outage: customers received HTTP 504 errors when uploading logs. The problem was traced to a new code deployment made on October 9. Monitoring showed elevated CPU usage and a rise in database connections. A root cause analysis (RCA) using yCrash identified an inefficient SQL query that was causing the timeouts. Removing the query restored functionality.
