During routine monitoring, Squiz identified operational issues with multiple Funnelback servers, leading to search function disruptions for several customers.
A subset of UK customers may have experienced delays in search results and encountered 500 errors when attempting to utilise the search function.
Squiz engineers were alerted to errors and timeouts originating from our Squiz Funnelback services.
Subsequent investigation revealed that the search session functionality within Funnelback was causing slow or erroneous requests, leading to a build-up of requests within the search query processing pipeline. Requests utilising the search session feature were subject to slow response times or termination. In turn, this degraded overall performance and caused searches to time out.
In response, we isolated the specific searches and their connection behaviour to our session feature and, where needed, temporarily disabled or paused this feature to allow the query processing pipeline to recover. As a precautionary measure, resource allocation to the query processing pipeline was increased.
As part of our standard process, we initiated a period of heightened monitoring, leading to resolution on April 25th at 13:00 BST.
In light of this incident, Squiz support staff conducted a thorough review of our UK Funnelback systems, which included expanding memory resources, to pre-empt future disruptions. In addition, we have enhanced our processes so that similar incidents can be resolved more quickly in the future.