Oh, how positively quaint. DownloadSquad, a fairly prominent blog on the Weblogs Network (featuring posts on free software and downloads), goes on a long rant because (my, how shocking) some blogs are stealing content from DownloadSquad.com and posting it on their own blogs. The author, Gordin Finlayson, even goes so far as to coin a term by calling these dastardly thieves “Blog Pirates”.
Yoinks.
I’m not sure how that kind of article ended up on DownloadSquad, but any blogger who has achieved any kind of success runs into these kinds of shenanigans at some point or another. It even applies to less successful bloggers, actually, given how automated the technology is. This isn’t new. It was written about as early as 2005 by both the Guardian and CNet, with both publications describing the rise of fake blogs populated with stolen content. “Blog Pirates”, indeed.
People stealing content isn’t new. And in fact, there are a lot of things you can do about it other than writing a post complaining that your content has been stolen. [CHEAP PLUG] My friend Jonathan Bailey has written about these kinds of things extensively at his own site, plagiarismtoday.com, and has also been focusing on blogging-related issues over at the BlogHerald. Check out some great posts such as 20 Best Free Anti-Plagiarism Tools, How to Follow Up on a Cease and Desist Letter, and a Content Theft Tale (what we did to follow up on someone scraping our content at the BlogHerald).
A more interesting discussion around sites that rip off others’ content concerns the rise of social news sites that don’t just feature a snippet of news, but the entire post verbatim. Almost all of them have some link back to your site, but that really isn’t the point, is it? Look at a common offender, TechAddress.com, which reposts your entire post once someone has “submitted” it for voting. Thinking broadly, this applies to any site that is able to scrape or read RSS feeds and republish them publicly. Shared Google Reader pages are at another end of the spectrum as well.
What makes these latter examples a bit different from frank splogs is that splogs are created to generate traffic and then monetize it through AdSense, or to funnel it to other affiliate or splog sites for PageRank. TechAddress and Google Reader don’t directly benefit in the same way.
But your content is being republished in its entirety, which can be distressing. I think one way to get around this is to cut it off at its source, which in many cases is your RSS feed. Making sure that all of your readers are aware that your feed isn’t for republishing without your express consent is one step. Another, more drastic step is publishing only partial feeds, which is a bit of a contentious issue: some readers hate it, while others contend that full feeds actually improve your subscribership. On the other hand, the only really easy way to stop scraping of your RSS feed is to change what’s in your feed.
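For the curious, here is a rough sketch of what “partial feeds” means in practice. It is only an illustration, not how any particular blogging platform does it: the function names (make_excerpt, to_partial_feed) and the 40-word cutoff are made up for this example, and it assumes a plain RSS 2.0 feed whose item descriptions are plain text. Most blogging software offers a summary-only feed setting that does this for you.

```python
import xml.etree.ElementTree as ET


def make_excerpt(text, max_words=40):
    """Return the first max_words words of text, marking any truncation."""
    words = text.split()
    if len(words) <= max_words:
        return text.strip()
    return " ".join(words[:max_words]) + " [...]"


def to_partial_feed(rss_xml, max_words=40):
    """Shorten every <item><description> in an RSS 2.0 document to an excerpt.

    Assumption: descriptions are plain text. HTML markup inside a
    description could be cut mid-tag, and namespaced full-content
    elements (if the feed has any) are left untouched by this sketch.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        description = item.find("description")
        if description is not None and description.text:
            description.text = make_excerpt(description.text, max_words)
    return ET.tostring(root, encoding="unicode")
```

A scraper reading the shortened feed gets only the opening words of each post, so anyone who wants the rest has to visit your site.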
Anyway, this is all a bit of a digression. Isn’t it time we had a bigger discussion of content ‘theft’ in its broadest sense, rather than just splogs, which have existed for at least a couple of years already? Goodness knows that the technology, and the tricks some are using to get around things, continue to evolve.
Shouldn’t the conversation?


July 31st, 2007 at 11:32 pm | Permalink
Those ad-driven fake blogs damage in another way, too… they clutter up the infosphere, making it harder to find original voices in the noise.
I think ad networks should stop sharing advertiser money with these people, and search engines should stop indexing obvious frauds. The world’s information is becoming more disorganized.
August 1st, 2007 at 2:56 am | Permalink
I was wondering what a splog was. I’m confused.
August 1st, 2007 at 7:58 am | Permalink
Joey,
Splog = spam + blog
Which is funny because Blog = Web + log
So I guess that Splog = spam + web + log
Regardless, a splog is basically a spam weblog, usually set up through automated means, with no useful content in it.
Hope that helps!
August 1st, 2007 at 9:14 am | Permalink
Jonathan: I wouldn’t say a splog has nothing of value on it. Just nothing original. Some of them have very good content, taken from others and republished.
I agree that the way to hit splogs where they hurt is to cut off advertising revenue to them. There’s a cancer in our midst, and the ad networks have the capacity to cut it out. They just don’t have the motivation.
August 1st, 2007 at 9:24 am | Permalink
Eric: True, I guess that is kind of a miscategorization. But similarly, so is the idea of nothing original. Some splogs use article generation software to create or “spin” new articles either out of thin air or from old works. It’s something new; it’s just junk and unreadable.
That’s becoming a more popular means of creating spam blogs these days as more Webmasters become aware of these issues.
I also agree that hitting them in the ad revenue is the best means, but that entails both letting the spam blog stay active while it is researched and requires the cooperation of ad networks, something in short supply in some cases.
We need to clean up the ad market some first.
August 1st, 2007 at 9:51 am | Permalink
It isn’t talked about much, but I would argue that many of the most well-known technology blogs frequently “borrow” content and ideas from other blogs without properly crediting their source. I’ve seen it in action many times.
The splogs are parasites that will only die when ad sellers penalize that type of activity.
Brent
August 1st, 2007 at 10:08 am | Permalink
Full-content RSS feeds seem to be one of the ways that splogs fill their ill-intentioned pages. It sucks as a publisher to be forced to switch to excerpt feeds based on this threat, though.
Another downside to the splog plague is that Google has (rightfully) been ruthless in trying to expunge them from its search rankings. Unfortunately, legitimate websites have been hurt by this as well at times, and in some cases it has caused extreme financial hardship.
August 1st, 2007 at 12:23 pm | Permalink
Spot on Headline.
What I found really amazing is that they were carrying on about cut and paste blogs..WTF? There wouldn’t be many of those compared to automated splogs, and what exactly would the point be to cut and paste when you can just automatically do it? Two tiers of sploggers???
August 1st, 2007 at 12:53 pm | Permalink
[…] Today they have written a little about content theft, it isn't a great article, and doesn't really go into many of the problems, or link through to any authority sources. […]
August 1st, 2007 at 10:23 pm | Permalink
I think what I hate most about my blog’s content being picked up by splogs is the part of the RSS feed they pick up. Instead of picking up the first few sentences and linking back to my site — which might actually HELP me attract more readers — they seem to take a random sentence or two out of the middle or end of the feed. Since my feed ends with a copyright notice, THAT’S what they’re often picking up.
This proves two things:
(1) The copyright notice is clearly useless to stop content from being picked up by splogs.
(2) The splogs that do this are completely worthless to readers that happen upon them.
Splogging is an extremely sore subject with me — as Jonathan B will attest! But any kind of copyright infringement really bugs me. I make my living as a writer and when people steal my words, they’re stealing my livelihood. And that simply sucks.
August 2nd, 2007 at 1:37 am | Permalink
Maria,
I second that suckage. Getting your content stolen really does suck, and perpetrators will only respond to action, unfortunately, and not any kind of appeal to reason, humanity, or fairness.
August 2nd, 2007 at 8:14 pm | Permalink
[…] DownloadSquad Discovers Splogs. Welcome to 2005, Guys. Some more commentary about splogs. On Deep Jive Interests. (tags: splog blogging copyright) […]