Redis Outage & Security Observer Race Condition
Challenge
Redis became unresponsive during a working day and could not be restarted through the hosting backend. With Redis down, session handling failed, and customers could not add items to cart, effectively taking the shop's checkout offline.
What Was Built
After coordinating with the managed hosting provider to restore Redis, the root cause of the cart issue was traced further: a VaryContentLength observer in the custom security module was interacting unexpectedly with the Varnish integration layer. Under load, the observer was competing with other response-phase observers on the http_response_send_before event, creating a race condition where the session was closed before the observer could finish writing response headers. Moving the observer to an earlier event reduced the collision frequency but did not eliminate it. The interaction appeared specific to the Varnish stack in use. The specific observer was identified as a non-critical cache poisoning defence (risk: very low, exploit complexity: very high) and disabled. A separate Tab Nabbing protection observer in the same module was fixed, tested independently, and left active - it did not share the race condition.
Outcome
Shop restored to full operation the same day. The decision to disable the specific observer rather than continue debugging indefinitely kept downtime minimal while preserving all genuinely critical security controls.
Have a similar challenge?
Get in touch - no sales pitch, just a straightforward conversation.