We keep hearing rumors and concerns about Google’s handling of duplicate content. Since no one outside Google knows the facts, here is what we have observed.
Duplicate content, ON or OFF your website, appears to be bad ONLY on Google.
Article marketers will claim it is only bad on your own site; that is not a fact, and more likely a misconception. In our observation, a syndicated article or news release starts out with several hundred indexed listings, only to wind down to a remaining 3-5 indexed listings after a few weeks, the same way press releases will rank for a temporary period and then disappear.
Google appears to filter results automatically, and it does not consult archive.org, nor does it date the content. Instead, it credits the content to whichever site it considers more authoritative, or in some cases to an interlinked network that carries the content, so trust factors can be outweighed by network size or an authority indicator rather than by the actual age or originator of the content.
Keep in mind these are observations and NOT definitive facts. The only company that can give factual data is the Search Engine.
Some savvy marketers believe that when Google entered the size wars with Yahoo, it counted supplemental pages toward the number of sites/pages indexed to appear larger, and so it indexed multiple versions of articles and blog feeds. A few marketers we spoke to believe that Google, possibly realizing this was the wrong approach, rolled out two updates (Jagger and Big Daddy) to resolve that data issue. These theories may carry as much weight as any conspiracy theory about JFK, but since this is a blog, we need to add the content. “WE MAKE NO CLAIM THAT THIS IS TRUE OR FALSE”. We do claim this is an original article and any copy should not outrank this. 🙂
The problem with duplicate content filters is that they appear to be trigger happy, sometimes inaccurate, and tested publicly, since no search engine has a known beta team. So blogs that use feeds, article websites, and companies that syndicate any content are placing their sites at risk in Google.
A recent interview by Chris Pirillo with Danny Sullivan of Search Engine Watch expresses this concern. Hear it here…
It is believed, now more than ever, that with a bit of work any competitor can hurt anyone’s rankings by using scraper sites to trip the duplicate content filter; combine that with several 302 redirects and it becomes a real issue. These are the weaknesses of Google: 302 redirects and duplicate content.
In the audio link above, you can hear the overall concern of webmasters who have felt the duplicate content filters, and of those who fear them. No one feels comfortable about it, and in our opinion and from experience, webmasters should seriously fear these potential flaws.
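Since 302 redirects come up so often in this discussion, it helps to know what status code a URL actually answers with. Here is a minimal sketch in Python (the technique and helper names are our own illustration, not anything from the interview) that reports the raw status without following the redirect:

```python
import urllib.error
import urllib.request


class NoFollow(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the raw 3xx status surfaces."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def redirect_status(url):
    """Return the HTTP status a URL answers with: 200, 301, 302, ..."""
    opener = urllib.request.build_opener(NoFollow)
    try:
        with opener.open(url) as response:
            return response.status  # no redirect was issued
    except urllib.error.HTTPError as err:
        return err.code  # 301 (permanent), 302 (temporary), etc.
```

A 301 tells the engines a move is permanent; a 302 claims it is temporary, which is exactly the behavior the abuse described above exploits.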
What steps can you take?
- Avoid syndication (yes, we are syndicating this blog – oops), but the days of syndication may soon be gone if the filters are removing all the content.
- Never syndicate an article and place it on your website (none of our clients are allowed to – better safe than sorry – there is no return from sorry).
- Use Copyscape to automatically check the Google cache for copies. If you find duplicate content, submit a DMCA complaint to all search engines immediately. The longer you wait, the more damage it can do to your site, and that damage may not be reversible.
- Send your content to the Library of Congress and register your copyright (get a registration number!). Placing a copyright notice on your website is not good enough, and the meta copyright tag is 100% worthless; it is a dead tag that is never honored.
- Use absolute links (http://…) rather than relative links. It makes it slightly harder for most scrapers to steal your content, though it won’t stop the outsourcing companies from stealing it.
- Never use copywriters outsourced to countries where English is not the native language; 90% of them cannot write it, and they may be stealing content, damaging both your site and the site they stole it from. There are known companies in other countries that use the term “Data Mining” to mean stealing highly ranked content for rankings, a common practice.
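Copyscape does the searching for you, but the basic idea behind a duplicate check is simple to sketch. Here is a rough Python illustration comparing two pieces of text by word “shingles”; the shingle size and any threshold you pick are our own illustrative assumptions, not anything a search engine has published:

```python
def shingles(text, size=5):
    """Break text into overlapping runs of `size` lowercase words."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=5):
    """Jaccard similarity of shingle sets: 1.0 = identical, 0.0 = no overlap."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A score near 1.0 means near-identical text; you would then look at the page yourself and, if it really is a copy, file the DMCA complaint.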
As for Yahoo and MSN, you currently do not need to worry about 301 or 302 redirects, nor do you need to worry about the duplicate filters wiping your sites out of those Search Engines. They appear to be more advanced in their handling of these issues.