Categories: How to, Internet, Newspapers.

Jan Moir and the PCC: why its website crashed

October 19, 2009 10 Comments

A lot is being written about Jan Moir's nasty attack on Stephen Gately, and the supposedly organised internet campaign (quick, shred the memos) that led to the Daily Mail squirming (including removing the adverts from the page).

As a result of her poisonous bile, the Press Complaints Commission (PCC) received 21,000 complaints - more complaints in a single weekend than the regulator has received in total in the past five years.

This would explain why the PCC website ran so slowly on Friday and today. Actually, no it wouldn't. 21,000 isn't that many.

I mean, it's not running off some two-bob hosting arrangement that's likely to fall over at the first sign of too much traffic, is it? Oh, right, it is.

And it's not like it has overlooked simple security measures and is exposing its database on a public-facing server. It is doing that too, you say?

And it's not like it has had all weekend to fix the problem and it's still going on. What? It is?

Warning: the rest of this article is about the PCC's web hosting arrangements. I've tried to make it untechnical though.

How web hosting works

There are two main options:

  • You can have your own hosting solution: Costs money.
  • Or you can go down the cheap route, and share a server. If you share a server then it's difficult to do much about traffic surges, and every other site on the server gets hit as well if your site gets a traffic spike (and vice versa): Cheaper.

What was the PCC doing?

The PCC site struggled badly on Friday under the weight of traffic, and it is still struggling today, Monday. If you look up the PCC's IP address (83.138.130.133) and run a reverse DNS check on it, you can see where it's hosted: on a server called jack.codecircus.co.uk, at a host called Rackspace.
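If you'd like to repeat the reverse DNS check yourself, here is a rough Python sketch. The function name `reverse_dns` and its `resolver` parameter are made up for illustration; `socket.gethostbyaddr` is the real standard-library call doing the work. DNS records change, so the answer you get now may well differ from the one quoted above.

```python
import socket
from typing import Callable, List, Optional, Tuple

# Shape of socket.gethostbyaddr: address -> (hostname, aliases, addresses)
Resolver = Callable[[str], Tuple[str, List[str], List[str]]]


def reverse_dns(ip: str, resolver: Resolver = socket.gethostbyaddr) -> Optional[str]:
    """Return the hostname registered for an IP address, or None if there is none.

    `resolver` is a hypothetical hook so the lookup can be swapped out
    (for example in a test); by default the real DNS lookup is used.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        # No reverse (PTR) record, or the lookup itself failed.
        return None
    return hostname


# Example call (needs network access, result not guaranteed):
# reverse_dns("83.138.130.133")
```

A reverse lookup only tells you the name the server's owner has registered for that address. It says nothing about how many other sites share the machine.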

So who are Code Circus? Well, they say:

Although Code Circus did not build the Press Complaints Commission website, we were asked to take on the responsibility of managing their suite of sites. Contracted on a monthly basis, we provide ad-hoc technical support services, including HTML development, content management, web application restore and technical audit services.

Looking for the database on localhost. Not good.

On Friday, their own website was being redirected to v2.codecircus.co.uk, which was also sitting on jack.codecircus.co.uk - and both this site and the PCC site were showing an identical error:

xyBox1.3.0e - [Fatal Error]:
unable to connect to database host
Details: Can't connect to MySQL server on '127.0.0.1' (4)

This might not mean much to you, but the 127.0.0.1 bit means that the MySQL database and the website itself appear to be running on the same server.

This is generally a bad idea: if you can hack the public-facing website and access the server, you can also access the backend database (and who knows what is stored there - let's hope the PCC don't keep all the complaint details in there ...).

Someone even more technical than me might like to confirm this ...
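For anyone wondering why 127.0.0.1 in that error message points to "the same machine": it is a loopback address, which always refers to the computer doing the asking. Here is a small Python sketch of that check. The helper name `is_loopback` is made up for illustration, and treating the name "localhost" as loopback is a convention assumed here rather than something the code looks up.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """Return True if `host` refers to the local machine itself.

    Assumption: the name "localhost" is treated as loopback by convention;
    other hostnames are not resolved and simply return False.
    """
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP address literal (for example a hostname).
        return False


# The database host from the error message vs. the PCC's public address
db_on_same_box = is_loopback("127.0.0.1")
public_is_local = is_loopback("83.138.130.133")
```

All this shows is that the website was told to find its database on its own machine. It does not show whether the database can be reached from outside.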

If the PCC were sharing a server to save money, we would be able to do a reverse IP lookup on the 83.138.130.133 address and see who else is there. Oh look! If we do, we find all these sites:

2-dk.com, africanpressagency.com, cavgds.co.uk.codecircus.co.uk, cdaperform.co.uk.codecircus.co.uk, danvirgo.com, dimensional-media.com, grantbarnett.com.codecircus.co.uk, reporter.codecircus.co.uk, secure.thetoolman.co.uk, shop.cfauk.org, simonrumley.com.codecircus.co.uk, tmm.codecircus.co.uk, uksip.org.codecircus.co.uk, v2.codecircus.co.uk, www.2-dk.co.uk, www.airsafetyinyourhands.com, www.aquista.com, www.atlas.guernseyci.com, www.atlasgibraltar.com, www.atlasoffshorejobs.com, www.bicha.co.uk, www.bolero.net, www.cavgds.co.uk, www.cdaperform.co.uk, www.cerethouse.com, www.cfauk.org, www.codecircus.co.uk, www.colouring-in.co.uk, www.cut-coms.co.uk, www.emberjd.com, www.estelabravo.com, www.grantbarnett.com, www.jhw.co.uk, www.mymediasafe.co.uk, www.newwavefilms.co.uk, www.omexperts.co.uk, www.onlinemediaexperts.co.uk, www.pcc.org.uk, www.pcc.org.uk.codecircus.co.uk, www.phoebusassociates.co.uk, www.robertcohen.info, www.simonrumley.com, www.stephenaustin.co.uk, www.sun-sea-golf-spain.com, www.theagency.co.uk, www.thebrianjacketletdown.com, www.thecollectivedesign.co.uk, www.uksip.org, www.vistacarespain.co.uk, www.vocalbaobab.co.uk, www.wordswork.co.uk, www2.uksip.org

I tried a few of these on Friday, and they were all struggling as well - database errors or just nothing appearing. I'm not sure what the UK Society of Investment Professionals or the Civil Aviation Authority make of this (actually, I'm not sure what I think of the CAA's airsafetyinyourhands.com having air safety in anyone's hands if this is their hosting arrangement).

Conclusion

The PCC is supposed to deal with complaints about sensitive matters. To do that properly, it should put in place (1) scalable web hosting (that both the software supplier and hosting partner can deliver) to ensure it can cope with any surge in traffic and (2) security checks to ensure its backend is secure (which means checking not just its own site's security but that of every other site on the same server).

It appears to have done neither. Which is what I imagine it will do with the 21,000 complaints it has received.

10 Comments »

  • Let's imagine that the PCC follows your advice and moves the database backend onto a separate server. Someone hacks the web server. Stored on the web server are the connection details to the database, so they can still access the database, albeit slightly less easily.

    • David Campbell says:

      Actually - hacking a server would only expose the database credentials if those credentials were stored in a format available to a non-privileged user. Of course nobody does that anymore, do they?

      Regardless, the main argument:

      "And it's not like it has overlooked simple security measures and is exposing its database on a public-facing server"

      Well no, it isn't. The database isn't accessible whatsoever.

      telnet www.pcc.org.uk 3306
      Trying 83.138.130.133...

      A little knowledge is a dangerous thing

      Poor article

  • Terence Eden says:

    The way the PCC is set up ensures that it will only ever deal with complaints from those directly named / involved in a story. Short of a paper seriously insulting a huge group of people, it's unlikely to need that sort of scaling.

    Looking at their press releases, this year it seems they've dealt with less than 50 complaints - http://www.pcc.org.uk/news/press-2009.html

    What's interesting is how the co-hosting model can be adapted to the "Slashdot Effect". When my website was featured on Boing-Boing, I had to rapidly pay for extra bandwidth or face being cut off halfway through the month. Is there any sensible way for those who share servers (rightly or wrongly) to manage sudden and infrequent bursts of interest?

    T

  • I'm going to have to disagree with you on some of the above points - firstly, using a shared server isn't just about being cheap, it's about being practical. Not everyone can warrant the cost of a dedicated box, nor the maintenance costs involved.

    Shared servers are common in the world of hosting, and often operate perfectly adequately come rain or shine. Obviously, they're not going to stand up to a slashdotting/digging, but then again, neither would most single dedicated boxes, unless well configured.

    Secondly, having the database server on the same box as the webserver is perfectly acceptable for the vast majority of websites out there - sure, it's not scalable, and it's certainly not very resilient, but for the average site it does what is required of it.

    Having a DB server on the same box as the webserver also doesn't make it any more insecure - if you compromise the site and get hold of the DB login credentials, you'll be able to get access to the database, regardless of whether the server is local or remote.

    As you said above, "more complaints in a single weekend than the regulator has received in total in the past five years." - this site has clearly been happily running away in its own little shared environment, fit for purpose until the day it gets an unexpectedly large volume of traffic.

    That's not to say that the PCC aren't at fault for not having provisions in place for the worst case scenario - regardless of whether the site is on a single box or on a load balanced cluster, if it's not well built, and hasn't had any thought applied to scalability or optimisation, there's trouble ahead. Clearly in this case, little thought was given by either the PCC, or their hosting partners, who by now should have pulled the site onto a box which could cope with the demand.

    Just my 2c.

  • Kevin says:

    It seems like a perfectly reasonable set-up to me.

    There's stability, then there's pissing away money overprovisioning to the point where you're able to handle 1825 days of traffic in one day, just on the off-chance Charlie Brooker and Derren Brown decide to Stephen Fry you.

  • Thanks for the comments. I'm sure you're probably right (this was a bit at the edge of my technical knowledge). But I do think they could have done something better since Friday - the site is still 'running' incredibly slowly ...

    (Plus the PCC isn't really an 'average' site - it is the press's watchdog!)

  • I couldn't agree more Malcolm - if one of my sites got hit as hard as the PCC site, I'd want to shift it off the shared server as fast as possible (if not just to minimise the impact on the other hosts sharing those resources), and onto something a little more suitable.

    There are plenty of quick and easy alternatives to expensive dedicated boxes, such as RackSpace Cloud Servers/Sites/Files, which can all be set up in minutes, and don't require expensive long term commitment.

    The site is now pretty quick for me, although I'm on a fast connection - I'm now wondering if the problem was with the webserver, or whether it came down to poorly written SQL/unoptimised DB tables.

    At the end of the day, it's horses for courses. I've stuck sites on single boxes and seen them withstand a torrent of abuse. I've also launched sites on heavily load-balanced servers and seen them yo-yo under the pressure.

    You can't foresee a massive spike in traffic, but you can always make an effort to minimise the impact once it does.

  • Disclaimer: I know someone who works at the PCC, though not in tech.

    "more complaints in a single weekend than the regulator has received in total in the past five years.

    This would explain why the PCC website ran so slowly on Friday and today. Actually, no it wouldn't. 21,000 isn't that many."

    Right. I'd like to take issue with your numbers, here: "21,000 isn't that many",

    1/ Except that it is relative to the regular traffic.

    2/ And that's assuming that everyone that visited the page actually made a complaint, more likely a huge number of people followed blind bit.ly/tinyURL links and didn't do anything - but those are still hits to the site.

    3/ Each of those 21,000 complaints is at least 4 or 5 clicks to complete the complaints process. Let's assume 5% of people who hit the link made a complaint (reasonable considering the follower count of Fry, Brown, etc - more so given that they have many non-UK followers), that's actually much closer to half a million requests.

    A more reasonable estimation - no?

    I'd say the security issue is pretty much a non-issue, in the scheme of things. I would agree some more forward planning, particularly failover process should have been done. The problem is, this is such a huge increase in traffic it's difficult to account for.

    Just because the PCC is the press watchdog doesn't mean they have an infinite budget to prepare for something like this, not to mention justifying it.

    I imagine this will be a bit of a wake up call for their tech team and hopefully things will improve in the future. Things going tits up on a Friday never helps either.

  • David - those are all reasonable points. To a point.

    I'd like to contrast what the PCC's response has been with my own response when 3,000 people visited my blog in about 30 minutes and my host refused to answer my emails and phone calls when my monthly limit was exceeded.

    It took me about 12 hours to get the site moved to a new host and everything up and running again.

    Little about the PCC setup has changed from Friday until earlier today as far as I can tell. Maybe they've hurled some load balancers in there that I can't see. But they seem to have had little effect until recently. The PCC site was slow earlier today, and the other sites on the server were slow too (things do seem to have improved now).

    The PCC, unlike my blog, is the body that self-regulates the press. It has a duty and a responsibility to allow people to make complaints. 21,000 may be lots - but how many more would have complained if they could have got on the site?

    As you say - a wake up call. We can agree on that.

    • The main gist of my point was that while those of a more... technical persuasion are aware of the importance of the forward planning involved here, justifying the cost of 100% preparedness is a difficult job. To say they're "cheapskates" is quite a sensationalist way to phrase it, and then to end up back-tracking a bit - could be considered a bit Daily Mail-esque :P

      I've got plenty of thoughts about ways the PCC could better handle traffic like that in the future, I'll be sure to include some of the points here - hopefully I can get them to some receptive ears.
