Following on from his very helpful post last week, Plagiarised Text or Images – What To Do, guest author Paul Ward explains how site scrapers and RSS bandits can hurt us, even when they only duplicate a few lines of our content.
Along with the amateur who steals a page at a time, there are the more professional crooks who use programs called site scrapers. Give these programs a page URL and they follow links, grabbing every file they can: pages, images, scripts, style files. The better scrapers can be set up to automatically change things like affiliate IDs. They can mimic browsers so that web servers can’t detect and block them.
Add on a bit more software and the scraper can transfer all the files to a new host. Kick one off and go away for an hour or two – when you return you have a complete copied site, links, addresses and affiliate codes all changed to suit you.
The poor relation of a scraper is the RSS bandit. RSS feeds are a good thing in many ways: they aid search engines (Google likes them) and they can help you propagate your site to others (many of us have blog feeds and accept other people’s feeds).
An RSS feed typically contains a set of information for each page on the feed: in lens terms that’s the URL, title, intro and any images associated with the intro. Grab a feed, add a bit of code and you have a self-updating site.
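To see how much a feed gives away, it can help to look at your own. The sketch below is only an illustration: it assumes a plain RSS 2.0 feed, and the sample feed, its example.com URLs and the feed_items helper are all made up for the purpose. It uses Python's standard xml.etree.ElementTree module to list the URL, title and intro that each item exposes.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, invented for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Lenses</title>
    <link>https://example.com/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Growing Tomatoes</title>
      <link>https://example.com/growing-tomatoes</link>
      <description>A short intro about tomatoes.</description>
    </item>
    <item>
      <title>Knitting Basics</title>
      <link>https://example.com/knitting-basics</link>
      <description>A short intro about knitting.</description>
    </item>
  </channel>
</rss>"""


def feed_items(feed_xml):
    """Return the URL, title and intro of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "url": item.findtext("link", default=""),
            "title": item.findtext("title", default=""),
            "intro": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in feed_items(SAMPLE_FEED):
        print(entry["url"], "|", entry["title"], "|", entry["intro"])
```

Everything printed here is available to anyone who requests the feed, which is why a full intro in the feed matters more than it might seem.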
Why Do They Hurt Us?
They steal traffic by beating the originals in search results. To a search engine, they typically look like a frequently updated site. By pulling in content from many places they impress more than each individual source. By stealing from Squidoo, where many people understand the basics of SEO, they further impress search engines.
In addition, the crooks may carry spam or porn ads. Nobody wants good original content associated with acne cures or worse, especially if their name is transferred with the content.
Don’t We Get Links From RSS Bandits?
Many such thieves will leave a link back to the original. This fools some people into thinking that there’s no real harm. There is: apart from the possible dubious ad associations, a full intro may contain enough text to convince Google that the original is the duplicate, so the original suffers in search results pages (SERPs). There’s also a danger that your affiliate ID may be copied unchanged, so your ID ends up on a site that probably breaks the terms of your affiliate agreement. If you don’t do anything about it you run the risk of losing your accounts with AdSense, Amazon and other reputable companies.
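If you suspect a page has lifted your intro, a rough check is easy to script. The sketch below is one possible approach, not how any search engine actually decides what counts as a duplicate: it splits both texts into overlapping five-word runs ("shingles") and reports what fraction of your runs turn up in the suspect text. It also checks whether your affiliate ID appears in the page. The function names, the sample text and the "myid-20" ID are hypothetical.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word runs of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also occur in the suspect text."""
    orig = shingles(original, size)
    if not orig:
        # Original is shorter than one shingle, so nothing can be measured.
        return 0.0
    return len(orig & shingles(suspect, size)) / len(orig)


def contains_affiliate_id(page_text, affiliate_id):
    """True if the affiliate ID string appears anywhere in the page text."""
    return affiliate_id in page_text


if __name__ == "__main__":
    mine = "Tomatoes need sun, water and patience to grow well in a small garden."
    theirs = ("Welcome! Tomatoes need sun, water and patience to grow well "
              "in a small garden. Buy acne cream. tag=myid-20")
    print(copied_fraction(mine, theirs))
    print(contains_affiliate_id(theirs, "myid-20"))
```

A fraction close to 1 means most of your intro appears word for word. A low figure proves nothing either way, since a thief may have reworded the text.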
Why Do The Crooks Do It?
Several reasons: the obvious one is to make a quick buck from changed affiliate IDs and other advertising. They may also just be seeding the ground to sell the site or domain: get the money from a gullible idiot and disappear.
The amounts involved don’t have to be great. The costs are low and the profit can be a tidy sum in some countries, if not in US or UK eyes.
Should I Remove My RSS Feeds?
No: there are other ways to steal content anyway and you benefit from the feeds. Better to get the copies removed.
What Else Should I Do?
This is plagiarism and, where copyrighted work is copied without permission, it is generally illegal. There are quite straightforward measures to take, described in Plagiarised Text Or Images? What To Do.
There are forums and Facebook groups where you can get help and support. You can use sites like Web Of Trust to record bad sites. Spread the word, kill off the crooks.