Summary

Between 16:30 and 21:40 UTC, customers in the Frankfurt region experienced a service outage. From around 16:30 UTC, new projects could not be created, and from 17:45 UTC most projects became unreachable. Service was fully restored by 21:40 UTC.

Impact

Root cause

Our Frankfurt gateway, which routes traffic to customer projects, stopped accepting configuration updates because of an invalid route definition. The invalid route was produced by a rare edge case during the rollout of the new version of our Beta Analytics capability, in which a duplicated analytics configuration was generated for a small number of projects. The gateway treated the duplicate as invalid and then rejected all routing updates, not only those for the affected projects, which caused a widespread outage in the Frankfurt environment.
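
To make the failure mode concrete, the sketch below shows how a duplicated plugin entry on a single route can be detected and set aside before a batch of routes is submitted, so that one bad definition does not block every other update. It is an illustration under stated assumptions, not our gateway's actual code: the Route type, the plugin names, and the functions duplicated_plugins and partition_routes are all hypothetical.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Route:
    """Hypothetical route definition: one project plus its gateway plugins."""
    project_id: str
    plugins: list[str] = field(default_factory=list)


def duplicated_plugins(route: Route) -> list[str]:
    """Return the plugin names that appear more than once on a route."""
    counts = Counter(route.plugins)
    return sorted(name for name, n in counts.items() if n > 1)


def partition_routes(routes: list[Route]) -> tuple[list[Route], dict[str, list[str]]]:
    """Split routes into those safe to submit and those held back.

    Routes with a duplicated plugin are held back and reported by
    project ID, instead of causing the whole batch to be rejected.
    """
    valid: list[Route] = []
    rejected: dict[str, list[str]] = {}
    for route in routes:
        dups = duplicated_plugins(route)
        if dups:
            rejected[route.project_id] = dups
        else:
            valid.append(route)
    return valid, rejected


if __name__ == "__main__":
    # Illustrative data only: project IDs and plugin names are made up.
    routes = [
        Route("project-a", ["auth", "analytics"]),
        Route("project-b", ["auth", "analytics", "analytics"]),
        Route("project-c", ["auth"]),
    ]
    valid, rejected = partition_routes(routes)
    print("submit:", [r.project_id for r in valid])
    print("held back:", rejected)
```

In this sketch the route for project-b is held back because its analytics plugin is listed twice, while the other two routes can still be submitted.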

Resolution

To restore service, we removed the analytics-related gateway plugins from Frankfurt and regenerated all routes. This allowed the gateway to accept configuration updates again. Routes were recreated progressively, and service returned between 21:10 and 21:40 UTC.

Prevention and next steps

We have already:

We are now working on: