» Published on
The incident was triggered by a bug in a caching mechanism within a component upstream of our platform. This component, which is critical for routing HTTP requests to your applications, was not being properly supplied with the metadata needed to direct traffic to the appropriate containers. As a result, traffic may have been routed to containers that were no longer operational, either because of a redeployment or a change in the topology of your applications, such as a scale-down operation.
Although the issue was preexisting, it manifested as a side effect of an update to our infrastructure servers.
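To illustrate the failure mode described above, here is a minimal sketch of a routing cache that is not refreshed when application topology changes. All names (`Router`, the container identifiers, the discovery table) are hypothetical and chosen for illustration; they do not reflect our actual implementation.

```python
class Router:
    """Routes requests to an app's containers, caching the container list."""

    def __init__(self, discovery):
        self.discovery = discovery  # source of truth: app -> live containers
        self.cache = {}             # stale copy: app -> cached container list

    def route(self, app):
        # Bug: the cache is consulted without any freshness check, so
        # entries survive redeployments and scale-down operations.
        if app not in self.cache:
            self.cache[app] = list(self.discovery[app])
        target = self.cache[app][0]
        # If the cached container is no longer running, the upstream
        # connection fails and the client receives a 502.
        if target not in self.discovery[app]:
            return "502 Bad Gateway"
        return target


discovery = {"my-app": ["container-1", "container-2"]}
router = Router(discovery)

# First request: cache is populated from live data, routing succeeds.
print(router.route("my-app"))  # container-1

# A redeployment replaces the containers, but the stale cache still
# points at the old ones, so requests now hit a dead container.
discovery["my-app"] = ["container-3"]
print(router.route("my-app"))  # 502 Bad Gateway
```

In this sketch the fix would be to invalidate or refresh the cache whenever the discovery data changes, which corresponds to properly supplying the routing component with up-to-date metadata.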
Our development teams have pinpointed the exact origin of the problem and a fix is currently being developed.
We have also initiated enhancements to the configuration of our monitoring probes to increase our ability to detect and respond more promptly to such incidents.
A comprehensive retrospective of this incident is also scheduled to explore additional potential improvements.
» Updated
The situation has been stable for 30 minutes. We are still investigating, but the incident is considered closed.
If you are still experiencing issues, please contact our support.
» Updated
The 502 errors should have disappeared following our operators' intervention.
Investigation is still ongoing to identify the root cause.
» Updated
We detected an unusual rate of 502 errors when reaching apps hosted on the osc-fr1 region.
Our operators are currently investigating the issue.
We'll keep you updated.