Edited for 2023:
We keep hearing rumors and concerns about Google’s handling of duplicate content. Since no one outside Google knows the facts, here is what we have observed.
Duplicate content ON or OFF your website is a problem, mainly on Google.
Article marketers will claim it is only bad on your own site; this is not a fact and is more likely a misconception. Evidence shows that a syndicated article or news release will start out with several hundred indexed listings, only to wind down to 3-5 remaining indexed listings after a few weeks, the same way press releases will rank for a temporary period and then disappear. In 2023, there is no definitive penalty for duplicated or syndicated content. However, if you syndicate content that also appears on your site, and your site is not very authoritative, you may not be credited for the content. In other words, it may rank better on an authoritative syndication site than on your own site. This is one reason building authority is important.
Google seems to filter results automatically, and it does not use archive.org, nor does it date the content it finds on one site vs. another. Instead, it will rank the content higher on whichever site it considers more authoritative.
Keep in mind these are observations and NOT definitive facts. The only company that can give factual data is the Search Engine, and these algorithms are kept secret, with only general information given to marketers.
In the early days of search, Google entered the “size wars” with Yahoo. They counted supplemental index pages toward the number of sites/pages indexed, and so they indexed multiple versions of articles and blogs to appear bigger. Possibly realizing this was the wrong approach, Google released two updates in the early 2000s called Jagger and Big Daddy, whose main purpose, we believe, was to resolve this quality issue.
The problem with duplicate content filters is that they appear to be trigger-happy and sometimes inaccurate. Blogs that use feeds, article websites, and companies that syndicate any content risk losing credit in Google for their own content when a duplicate outranks it.
A 2006 interview by Chris Pirillo with Danny Sullivan of Search Engine Watch expresses this concern. Hear it here…
There are negative SEO theories that, with a bit of work, any competitor can hurt anyone’s rankings by using scraper sites to trip the duplicate content filter, combined with several strategic 302 redirects. Because 302 redirects and duplicate content can cause ranking issues, it’s important to keep your site as authoritative as possible.
In the audio link above, you can hear an overall concern among webmasters who have felt the duplicate content filters impact rankings. No one feels comfortable about it, and in our opinion and from experience, webmasters should take these potential flaws seriously. As much as Google continues to state there is little a competitor can do, we see competitors using shady tactics and even reporting to Google. Google Business Profiles (GBP) are the most susceptible to this. In 2022 and 2023, our competitors had legitimate reviews removed from our GBP. Some even went as far as changing our listings and removing what they could on third-party websites.
What steps can you take?
- Build authority. Power your site up with modern link-building strategies and outreach.
- Monitor syndication. Make sure you’re getting credit for any content that is picked up, with a link back to you.
- Avoid syndicating content that you also place on your website.
- Use a content monitoring service like Copyscape to automatically check the Google cache for copies. If you find duplicate content, submit a DMCA complaint to all search engines immediately. The longer you wait, the more damage it can do to your site, and this damage may not be reversible – historically, Google is not always forgiving.
- Register your copyright. Send your content to the Library of Congress and register your copyright (get a registration number!). Placing copyright notices on your website is not good enough, and meta copyrights are 100% worthless – it’s a dead tag that is never honored.
- Use absolute links (http://…) and not relative links in your content. This makes it slightly harder for scrapers to steal your content.
- Use known content writers. Never use copywriters outsourced to countries where English is not the native language; 90% of them cannot write it either, and they may be stealing content, damaging both your site and the site it was stolen from. There are known companies in other countries that use the term “data mining” to mean stealing highly ranked content for rankings, a common practice. In 2023, they are using AI to write content, so using known writers is the smartest option.
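The monitoring step above can be approximated in code. A common near-duplicate-detection technique is comparing word "shingles" (overlapping word n-grams) between two pages with Jaccard similarity; this is an illustrative sketch of the general idea, not a model of how Google's or Copyscape's actual filters work:

```python
def shingles(text, k=5):
    """Break text into overlapping k-word 'shingles' (word n-grams)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard_similarity(text_a, text_b, k=5):
    """Jaccard overlap of shingle sets: 1.0 = identical, 0.0 = no overlap."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

original = "Duplicate content on or off your website is a problem mainly on Google."
suspect = "Duplicate content on or off your website is a problem mainly on Google."
print(jaccard_similarity(original, suspect))  # identical text -> 1.0
```

A score near 1.0 suggests a near-verbatim copy worth investigating, while a low score usually just reflects shared topic vocabulary; the shingle size `k` and the alert threshold are tuning choices.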
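The absolute-link tip can also be applied retroactively to existing pages. Here is a minimal sketch using only Python's standard library that rewrites relative `href`/`src` attributes against a base URL; the `example.com` base is a placeholder, and a production pipeline would use a real HTML parser rather than a regex:

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://www.example.com/articles/"  # placeholder base URL

def absolutize_links(html, base=BASE_URL):
    """Rewrite relative href/src attribute values to absolute URLs."""
    def _fix(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        # urljoin leaves already-absolute URLs untouched
        return f"{attr}={quote}{urljoin(base, url)}{quote}"
    return re.sub(r'\b(href|src)=(["\'])([^"\']+)\2', _fix, html)

print(absolutize_links('<a href="/about.html">About</a>'))
# -> <a href="https://www.example.com/about.html">About</a>
```

Scrapers that copy markup verbatim will then carry your full URLs with them, which both leaks less ranking credit and makes stolen copies easier to trace.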
In 2023, the duplicate content filter is less of a concern. However, if your site is not ranking well for terms it should, you may want to check whether your content has been stolen and whether the duplication is excessive.