Why Murdoch should WIN any lawsuit against Google (stay with me ...)
There's a story in Australia that News Corp. is preparing to sue Google and Yahoo to stop both from linking to and quoting News Corp. content. It comes as Rupert Murdoch promises to start charging for online content across his company's news sites.
The suing story has prompted the usual hilarity, with comments such as: "if murdoch sues google & yahoo over news rather than use robots.txt file, it'll be a short, embarrassing lawsuit."
But here's why Murdoch might have a case ...
Robots.txt isn't a panacea
The usual response to newspapers' complaints about Google is to say 'just use robots.txt to keep them out.' This was Google's response in its two fingers to the news industry.
However, most people don't seem to realise that it's hard to stay out of Google News and remain in the main Google index:
"Please keep in mind that the robot we use for Google News, called Googlebot, is the same robot that we use for Google Web Search. This means that any settings you modify for Google News will also apply to Google Web Search." (From Google Support)
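To make that concrete, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt contents, the example.com URLs and the product-to-crawler table are all made up for illustration; the only thing taken from the support note is that both products crawl as Googlebot. It shows what the note implies, not how Google's crawler is actually built.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site: keep Googlebot out of /news/
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /news/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Assumption taken from the quoted support note: both products
# crawl under the same user-agent token.
CRAWLER_FOR_PRODUCT = {
    "Google Web Search": "Googlebot",
    "Google News": "Googlebot",
}


def allowed_products(url):
    """Return, per product, whether its crawler may fetch the URL."""
    return {
        product: parser.can_fetch(agent, url)
        for product, agent in CRAWLER_FOR_PRODUCT.items()
    }


story = "https://example.com/news/some-story.html"
results = allowed_products(story)
for product, allowed in results.items():
    print(f"{product}: {'allowed' if allowed else 'blocked'}")
# Google Web Search: blocked
# Google News: blocked
```

A robots.txt rule is matched on two things only, the crawler's name and the URL path. There is no field for what the crawler intends to do with the page, so as long as both products share one crawler name, blocking one blocks the other.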
There's a difference between Google News and Google Search
Google Search is a way for a user to enter a term and for Google to show relevant pages. Google News these days looks like a fully fledged news aggregation service - check out its front page, and tell me how much that differs from a publisher's news home page?
Just because publishers are happy to appear in normal search results doesn't mean they want their content used for free to create a rival news source/product. But there's no way to use robots.txt - Google's supposed answer - to draw this distinction.
Google is ignoring ACAP
Publishers have attempted to help Google out with their own protocol, the Automated Content Access Protocol (ACAP) - a way to build on robots.txt and allow better control over how their content is used.
Google won't implement it, saying: "Our guiding principle is that whatever technical standards we introduce must work for the whole web (big publishers and small), not just for one subset or field".
But Google already draws a distinction between big and small publishers. I publish this blog, but I'm not allowed in Google News, even though I'm in the main Google index.
Conclusion
I'm not saying that any publisher will actually want to stay out of Google. But robots.txt isn't the answer to the problem of how publishers get paid for or control access to their content.

Hm. This is an interesting defence of a business model that's been superseded, but it remains kind of dumb. Should I be sued for telling someone about a News Corp story? But what if I was to read out that story in a public place or even to a family gathering? Does that make me a bad person? Shouldn't they be buying their own copies of his papers? What if I give someone a copy of the Times on the train after I've finished with it?
I'm afraid I don't buy the "have your cake and eat it" approach to robots.txt. I can fully understand why Murdoch and the other media barons want to wrest back control - but it's, at best, a rearguard action against the horde. If he doesn't want people using copyright material, he has an option and should use it. His loss, I suspect...
(Besides, and maybe I'm being horribly uninformed here, doesn't a subs-only site effectively mean he can still have keywords on Google search and even News, but the links hit a pay wall? That's what seems to happen with my Google News Alerts referencing Financial News, for example...)
I'm not really defending him - I just think the 'oh use robots.txt' response is a bit misleading.
But building on your analogy, if you hired a venue and charged people to listen to you reading out copyrighted content, I guess they probably would sue you ...
Malcolm,
Are you implying that it is not possible to unsubscribe from being indexed in Google News?
You can't use robots.txt to stay out of Google News without also affecting your position in the main index. Obviously you could take yourself out of both ...
Yes, but I meant something else - whether you can get out of the Google News index some other way, for example by formally asking the Google News team.
Peter - there was that Belgian court case, where Google did say: "And if a newspaper does not want to be part of Google News we remove their content from our index –- all they have to do is ask." So maybe you can.
Malcolm, I work with the Google News team in the U.S. Sorry for adding this comment so late after the initial posting, but I want to make clear that publishers absolutely can be removed from Google News while remaining in web search. The vast majority of publishers choose to make their content discoverable in Google News because they recognize the value that it provides: We send more than 1 billion clicks a month to news publishers worldwide. But publishers are in control over where and whether their content appears. If a news site is included in Google News and no longer wishes to be, all the webmaster needs to do is tell us that. Here's a link from our Google News publisher help center with simple instructions: http://www.google.com/support/news_pub/bin/answer.py?hl=en&answer=94003
Hi, Chris. Thanks for commenting. Could you clarify something? The original Google post that I referenced above suggested using robots.txt to stay out of Google News. That will also keep you out of the main Google index, however, as I pointed out.
Is the link you've just suggested one that lets you manually ask to be taken out of Google News but remain in the main web index?
Malcolm, The blog post in July provided a reminder for how publishers can remove themselves from Google search results in general, if they choose to do so. Using robots.txt would apply to both Google News and Google web search.
But if publishers want to be removed from Google News only, there is a simple way: Just tell us, using the link in my comment above. If they're already in web search they will remain there, and our search algorithm won't treat them any differently.
Chris - thanks a lot. I'll add a link to your comment over at http://onlinejournalismblog.com/2009/08/06/should-murdoch-win-any-lawsuit-against-google/ where I cross-posted this.
PS I'd like my blog to be in Google News, just to be clear!