Summary
Several AU customers experienced website degradation between 23:47 AEST on Jan 14, 2025 and 01:40 AEST on Jan 15, 2025.
Squiz identified operational issues with one of our third-party network providers. These issues affected Matrix and Funnelback services in AU, causing search disruptions, increased latency for several customers, and some 504 errors. On Jan 15, 2025, at 19:51 AEST, during a scheduled maintenance window, service degraded again.
This was a recurrence of the same issue.
A small subset of AU customers experienced delayed search results when using the Funnelback search and web functions, as well as further 504 errors.
We experienced concurrent, intermittent traffic loss to and from NTT in both our Sydney and Melbourne data centres.
The traffic loss was severe enough to trigger automatic rerouting of ingress traffic to a different transit provider.
Because the packet loss was intermittent, the automatic rerouting cleared and then re-triggered several times.
We intervened manually to force the exclusive use of a different transit provider in Sydney. This partially mitigated the issue, but it took some time for routing to fully recover. This also affected some of our internal observability systems. Once the NTT transit was stable, we reverted our mitigations to restore fully redundant service.