Hi everyone, this is Shelton, co-founder and server admin for beyond.ca. Kenny asked me to do some guest blogging here on his blog, so I decided to write a behind-the-scenes article about the Digg Effect. When you read about how sites were taken down by, or barely survived, the traffic onslaught known as the Digg Effect, you rarely read anything more than “the traffic was insane, my server died!”. This write-up is quite technical; if you want to skip the technical stuff, you can jump to the pretty graphs at the bottom of the post! :)

As the server admin of Beyond Car Forums, a car enthusiasts' forum started by Kenny and me, I deal with all the server and network aspects of the site. Monday was the first true test of my scalability plan after a story from our forums ended up on the home page of Digg. It stayed there for the next 17 hours, sending a huge surge of traffic.

Not an Ideal Server Setup
Beyond.ca is fairly busy, averaging about 800-1000 concurrent users on a typical day. We use vBulletin as our forum software, with some back-end customizations to make maintaining a site of this size a lot easier. As some of you may know, vBulletin is written in PHP and uses the MySQL database, much like WordPress. When confronted with PHP/MySQL, the logical choice would be the Linux, Apache, PHP, MySQL combination. Luckily for us (or unluckily, depending on your view of Windows), our server farm is subsidized by a local software company. Our server setup and the amount of traffic we receive are used for R&D purposes to test their web-based software (specifically IIS/CGI performance). Because of this, we’re already at a disadvantage when it comes to optimal server performance, but we made the best of the situation.

About three years ago, we encountered our first big infrastructure test, which necessitated a new server plan to ensure we could sustain the additional traffic. The story in question happened during the Kobe Bryant trial, when the accuser came to our home town to visit friends and talked candidly about the case. Google indexed it, and we hit a record 1200 concurrent users. At the time, our single server solution was no match for the amount of traffic that Google brought on, and down went the site. To get things back up and running, we had to separate the web server from the database server. The new setup was slow, but the site was available, which allowed us to drive advertising revenue to make up for the bandwidth costs.

From that point on, we knew that as the forum grew, we had to be prepared for the next surge in traffic. Expanding on the earlier setup of separate web and database servers, I added more resources to each cluster. The setup we ended up with was two web servers and a single database server. From our earlier experiences and from looking through the logs, it was evident that the web server was the bottleneck. By splitting the web and database servers into different server clusters, I was able to quickly add server resources where they were needed.

The Beyond Server Setup Before Digg
Our database server is hosted on a Quad Core server, which has an average CPU utilization of 10% during normal peak hours. Our web cluster consists of 2 Dual Core servers, averaging about 40% utilization per server. Due to our arrangement with the software company, we have access to several Dual Core and Quad Core servers that are normally used to replace faulty hardware. Bandwidth-wise, we have 100mbps at our disposal, connecting pretty much directly to a major backbone. We get billed by the data transferred, which gives us virtually unlimited potential bandwidth while keeping costs much lower, since we pay only for what we use. Theoretically speaking, the Beyond clusters can handle 2.5x the normal traffic load without human intervention, and with the overhead that we allow, the forums respond instantly to any query during normal peak usage.
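For the curious, the 2.5x figure lines up with the utilization numbers above. Here is a rough sketch of the arithmetic in Python, assuming CPU load grows linearly with traffic (a simplification; the function names are just for illustration):

```python
def headroom(utilization: float) -> float:
    """Multiples of the current load a tier could absorb before
    reaching 100% CPU, assuming load scales linearly with traffic."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization


def cluster_headroom(tiers: dict) -> tuple:
    """Return the tier with the least headroom and its multiple."""
    name = min(tiers, key=lambda t: headroom(tiers[t]))
    return name, headroom(tiers[name])


# Peak-hour CPU utilization figures quoted in this post.
tiers = {"web": 0.40, "database": 0.10}

limiting_tier, multiple = cluster_headroom(tiers)
print(f"The {limiting_tier} tier limits the site at about {multiple:.1f}x normal load")
```

The tier with the least headroom sets the limit for the whole site, which is why the web cluster is the one to watch.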

Our Forums Hit the Homepage of Digg
Tuesday morning at 5am, the story was on the home page of Digg. Our average bandwidth usage is fairly steady throughout the day at 2mbps, but Digg had suddenly driven our traffic up to 4mbps at 5am. The servers alerted me via text message to the unusual traffic patterns, and I drove into the office early to see what had happened. A quick look at the referring sites made me realize that it was the beginning of the Digg effect, so I worked feverishly to prepare for the traffic onslaught that would be coming my way.

I knew that our database server was up to handling the traffic, but the web servers were not. Out came the spare servers to help handle the forum traffic. In total, by 7:45am, I had added a Dual Core web server and a Quad Core web server to the web server cluster. Before I could make the DNS changes to point to the new servers, I noticed that Beyond.ca was completely unusable. Both web servers were maxed out at 100% and nobody could view the site.

Quick Sidebar on DNS
We use round robin DNS (basically, a domain name pointing to multiple A records) to distribute traffic across our web server cluster. A load balancing device such as the LoadMaster would be a much better solution, but to keep costs down, our forums use DNS with a record expiry of 30 seconds. That does tend to load up our DNS servers, but it lets us make changes on the fly that take effect throughout the world. This is a much cheaper solution for us and it works, which is all that matters.
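To give a feel for how round robin DNS spreads visitors around, here is a toy model in Python. It is only a sketch under a few assumptions: the class name and the 192.0.2.x addresses are made up for illustration, every client is assumed to use the first address in the answer, and real DNS servers and caching resolvers rotate less evenly than this.

```python
from collections import Counter


class RoundRobinRecord:
    """Toy model of one hostname backed by several A records."""

    def __init__(self, addresses, ttl_seconds=30):
        self.addresses = list(addresses)
        self.ttl_seconds = ttl_seconds  # how long resolvers may cache an answer
        self._next = 0

    def resolve(self):
        """Return all addresses, rotated one position per query."""
        n = len(self.addresses)
        start = self._next % n
        self._next = (start + 1) % n
        return self.addresses[start:] + self.addresses[:start]

    def add(self, address):
        """Add another web server to the rotation."""
        self.addresses.append(address)


# Two web servers, as before the Digg traffic.
record = RoundRobinRecord(["192.0.2.10", "192.0.2.11"])
before = Counter(record.resolve()[0] for _ in range(1000))

# Add two spare servers and look at the split again.
record.add("192.0.2.12")
record.add("192.0.2.13")
after = Counter(record.resolve()[0] for _ in range(1000))

print("before:", dict(before))
print("after: ", dict(after))
```

In this model each server gets an equal share of lookups, and adding servers shrinks every share. The short expiry matters because a resolver that cached the old answer keeps sending visitors to the original servers until that answer expires.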

Bring On The Cavalry
Once I made the change to DNS to distribute the traffic to the new web servers, the site became responsive again, and traffic climbed up to a whopping 11mbps sustained, with peaks of 15mbps. Concurrent users hit a record high of 4085, mostly from queued-up users slowly catching up.

diggeffect4.jpg

The database server jumped up in utilization, but it was nothing much to worry about. As predicted, it handled the load fine, even though it was hitting 100% for very short bursts.

diggeffect3.jpg

Each web server was maxed out pretty hard, so even though the site was responsive for the majority of viewers, some viewers (under 1%) were rejected due to the maximum connection limits I imposed. The limits were put in place to ensure pages came up quickly and to avoid a growing queue that could potentially crash the server. CPU-wise, the web servers are averaging fairly high at around 90%, and sitting at 100% most of the time.
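The idea behind a connection limit can be sketched in a few lines of Python. This is not how our web servers implement it, just an illustration of the trade-off: once every slot is taken, a new visitor is turned away immediately instead of being added to a queue. The class name and the limit of 3 are made up for the example.

```python
import threading


class ConnectionLimiter:
    """Accept at most max_connections at once; reject the rest right away."""

    def __init__(self, max_connections):
        self._slots = threading.BoundedSemaphore(max_connections)
        self._lock = threading.Lock()
        self.accepted = 0
        self.rejected = 0

    def try_accept(self):
        """Claim a slot if one is free; never wait for one."""
        ok = self._slots.acquire(blocking=False)
        with self._lock:
            if ok:
                self.accepted += 1
            else:
                self.rejected += 1
        return ok

    def release(self):
        """Free a slot when a request finishes."""
        self._slots.release()


limiter = ConnectionLimiter(max_connections=3)

# Five visitors arrive at once: three get in, two are rejected.
results = [limiter.try_accept() for _ in range(5)]

# One request finishes, so the next visitor gets in.
limiter.release()
after_release = limiter.try_accept()

print(results, after_release)
print("accepted:", limiter.accepted, "rejected:", limiter.rejected)
```

Rejecting outright keeps response times predictable for everyone who does get a slot, at the cost of turning a small share of visitors away.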

diggeffect2.jpg

One of the side effects that I failed to plan for was our mail server. Because of the high number of new registrants hoping to respond and the high number of thread subscriptions, the mail server has a growing queue that probably won’t let up until the traffic dies down. That makes it very tough for visitors to comment, since they have to wait for their validation emails. The mail server is currently averaging 90% CPU utilization.

diggeffect1.jpg

Overall, the first Digg Effect was handled very well thanks to good planning. The story is still spreading, local news outlets are interviewing the parties involved as well as me, and traffic hasn’t died down at all yet, still maintaining 7-8mbps sustained. Over 99% of the visitors are seeing the story, and ad revenue is up for this period, which hopefully can cover the additional bandwidth costs. As a general idea, in the first 10 hours, the story moved over 600GB of traffic. After 24 hours, traffic is slowly trending back down, but is still above our normal levels. The story now has over 2,700 Diggs, but the traffic is mostly coming from all the forums and blogs that have linked to the story (at first glance, over 500 sites linked to the story!).

diggeffect5.jpg

We got a taste of the Digg Effect and learned a lot from it, and we’re looking forward to handling the next wave of traffic thrown our way!
