UK Funnelback Performance Degradation

Incident Report for Squiz

Postmortem

Summary

On April 24, 2025, Squiz experienced a service degradation affecting several Funnelback DXP customers in the UK.
Squiz alerting detected an increase in traffic, which prompted heightened monitoring, and a major incident was declared at 09:01 BST.
Prior to this declaration, some customers may have experienced slower-than-usual responses.
Although monitoring was in place, an unexpected spike in traffic from a small number of sources strained system resources and degraded the service.
Rate-limiting was applied, and specific traffic deemed to be non-human was blocked.
The system began to recover, and full restoration of services was confirmed by 09:59 BST.

Customer Impact

A subset of UK-based customers using Funnelback DXP services experienced slow response times, degraded search performance, and intermittent 504 service errors.
The impact was localised to the UK region, with no disruptions reported in other regions.

Issue, Resolution, and Mitigation

Upon investigation, Squiz engineers identified unusually high levels of traffic from a small group of actors.
This traffic surge led to system congestion, resulting in delayed responses and intermittent availability for some customers.
The issue was addressed through the following actions:

  • Squiz detected early signs of degradation via system alerts.
  • Engineers investigated and declared a major incident at 09:01 BST.
  • High-traffic sources were identified and rate-limited to ease the load on the system.
  • Following the implementation of rate-limiting, service performance began to improve steadily.
  • Engineers actively monitored the environment to ensure continued stability.
  • By 09:59 BST, full service was restored and the system stabilised.
  • Bot management controls were enabled to mitigate excessive automated traffic, and remain in place to ensure continued stability.
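
The identification and rate-limiting steps above can be illustrated with a minimal sketch. This report does not describe how Squiz implemented either step, so everything here is an assumption for illustration: the token-bucket technique, the names top_sources and PerSourceRateLimiter, the share threshold, and the rate and capacity values are all hypothetical.

```python
import time
from collections import Counter


def top_sources(requests, share_threshold=0.2):
    """Return (source, count) pairs whose share of all requests meets the threshold.

    `requests` is an iterable of source identifiers (for example client IPs).
    The 20% default is an illustrative value, not one taken from the report.
    """
    counts = Counter(requests)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(src, n) for src, n in counts.most_common()
            if n / total >= share_threshold]


class PerSourceRateLimiter:
    """Token-bucket limiter keyed by source (an assumed technique)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second, per source
        self.capacity = capacity  # maximum burst size, per source
        self.clock = clock        # injectable clock, useful for testing
        self.buckets = {}         # source -> (tokens, last_update_time)

    def allow(self, source):
        """Return True if a request from `source` may proceed."""
        now = self.clock()
        tokens, updated = self.buckets.get(source, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.capacity, tokens + (now - updated) * self.rate)
        if tokens >= 1.0:
            self.buckets[source] = (tokens - 1.0, now)
            return True
        self.buckets[source] = (tokens, now)
        return False
```

In a sketch like this, sources returned by top_sources would be the candidates for limiting, and each request would pass through allow before reaching the search service. Keeping a separate bucket per source means a heavy source exhausts only its own allowance, leaving other traffic unaffected.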
Posted Apr 25, 2025 - 20:30 AEST

Resolved

Search performance has returned to normal, and the incident is now resolved. We will continue to monitor closely, but no further impact is expected.

Thank you for your patience.
Posted Apr 24, 2025 - 18:59 AEST

Monitoring

A fix has been implemented, and we are actively monitoring system performance to ensure the issue has been resolved.
We will continue to observe closely and provide a final update once we confirm that everything is operating normally.
Posted Apr 24, 2025 - 18:56 AEST

Identified

We have identified the issue and are now working on implementing a fix. Further updates will be provided shortly.
Posted Apr 24, 2025 - 18:46 AEST

Update

Our engineers continue to investigate the issue and are working to identify the source of the problem.
Posted Apr 24, 2025 - 18:31 AEST

Investigating

Our monitoring has detected a performance degradation affecting UK Funnelback customers only.

We are currently investigating the issue and will provide updates as soon as possible.
Posted Apr 24, 2025 - 18:12 AEST
This incident affected: Squiz SaaS Hosted Instances.