Imagine your Magento store is getting flooded with traffic and there’s no time to think.
The first we knew of it was when Nexcess told us one of our sites had been turned off due to “an inordinate amount of traffic”, as they concisely phrased it. The site in question was on a shared server and was causing downtime for other sites we weren’t even aware of. Our first defence was to set up a content delivery network (CDN). We signed up for a service that gives 25 GB for free, and that bought us a few hours to find out what had happened. The sudden surge had come from a brief mention in a Huffington Post article, effectively slashdotting us.
Notice the dire warning about using a CDN? We can overcome that with correct CORS headers, which the CDN service will probably offer advice about. In brief, the least we need to do is add the following to the site’s “.htaccess” file:
<IfModule mod_headers.c>
    Header set Access-Control-Allow-Origin "*"
</IfModule>
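To be more correct, that “*” should be a single origin rather than a wildcard, to protect against cross-domain attacks. Because the header names the origin that is allowed to read the response, the value to use is the store’s own domain, whose pages load the assets from the CDN. With this in place the flood dropped by 60% to 70% and the site was restored without sacrificing any functionality.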
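As a rough sketch of the stricter header: Access-Control-Allow-Origin names the origin of the pages that may read the response, so for assets pulled through a CDN that is normally the storefront itself. The host below is a placeholder, not our real domain, and the scheme must match the one the store actually uses.

```apache
<IfModule mod_headers.c>
    # Assumption: the storefront is served from https://www.example.com.
    # Replace this with the store's real scheme and host.
    Header set Access-Control-Allow-Origin "https://www.example.com"
</IfModule>
```

A store that answers on more than one hostname would need something more elaborate than a single fixed value, which is beyond this sketch.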
Three days later the site was taken offline again. This time Nexcess had already fixed the problem for us. Somehow the site’s domain had been listed as a BitTorrent tracker, and there were thousands of requests to paths like “/announce” and “/announce.php”, which are easy to spot. Most of the clients also had “torrent” somewhere in their user agent string, another dead giveaway. The Nexcess techs simply filtered out these requests and the site was restored once more. It is now more than a month since the flood started and we still see these attempts in our logs.
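We don’t know exactly how Nexcess applied their filter, so the following is only a sketch of how the same idea could look in “.htaccess”, assuming mod_rewrite is available. It refuses any request whose path is “/announce” or “/announce.php”, or whose user agent contains “torrent”, before Magento gets a chance to handle it.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match the tracker paths seen in the logs (case-insensitive) ...
    RewriteCond %{REQUEST_URI} ^/announce(\.php)?$ [NC,OR]
    # ... or any client that identifies itself as a torrent program.
    RewriteCond %{HTTP_USER_AGENT} torrent [NC]

    # Answer 403 Forbidden straight from Apache and stop processing.
    RewriteRule ^ - [F,L]
</IfModule>
```

For this to help, the rule has to sit above Magento’s own rewrite rules in the file, otherwise the request is handed to index.php first.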
The ordeal reveals a problem with the way Magento handles missing pages. They are generated dynamically, just like every other Magento page, and this is slow. Every request to “/announce” got the same 404 response, but it was rendered in full each time. There wasn’t much time to optimize because there was another cloud on the horizon…
On the same day as what we now call “the torrent”, we also spotted many mysterious requests from clients that were each seen only once, when a normal page view should produce several requests. It may have been misrouted traffic meant for another site, but that didn’t seem likely. Fearing the worst, we searched for advice on DoS attacks against Magento and the first hit suggested “Hopefully you’re not running Magento”. Thanks Google!
If it was a DoS attack then it was very well disguised, because every request looked legitimate. With so much load on the server, pages were taking over 2 seconds to render. Again we had to reduce the load and there wasn’t time to try anything fancy. The shared nature of our leased server ruled out a Varnish cache. Upgrading to a larger server would take too long. If we had planned in advance we might have chosen an elastic setup and wouldn’t even have noticed. In haste we installed Gordon Lesti’s Full Page Cache (Lesti_FPC on GitHub). Gordon Lesti claims a 1.97x speed improvement in his testing. Without any tuning we instantly saw a 4x improvement and nothing broke.
With the flood abating we can relax and shore up our defenses. By necessity the site has a more complex menu than most, and this was adding significant load to each page. Now starts the long process of profiling and fine tuning. We raise a (virtual) glass to Mr Lesti and to Nexcess’s diligent staff.