Heading into the new year, SEOs will be looking to try out new tactics and implement best practices for webpages to deliver a better experience to visitors.
One factor that can have a detrimental impact on the user experience is duplicate content, which is a thorny issue that Google has to contend with regularly when indexing and ranking pages for search.
What is duplicate content?
Google explicitly advises site owners to “avoid creating duplicate content” in the ‘Advanced SEO’ section of its guidelines on the Google Search Central hub.
Google defines duplicate content as “substantive blocks of content” that can be present on one site or across different domains and are a complete match or close match to content located elsewhere.
While it may appear that duplicate content is malicious in nature, and is merely copied from one page to another, this is not always the case.
Google highlights several examples where duplicate content can be an unforeseen issue, such as store items that are shown or linked via multiple distinct URLs and webpages that have printer-only versions.
How can it be used maliciously?
While there are instances where duplicate content is not deceptive in origin, it can also be used deliberately for web traffic gains and to manipulate search rankings.
When it is used in this way, Google says that it often leads to a poor user experience, as searchers are presented with the same content repeated across a set of search results. This goes against Google’s own efforts to provide results that show distinct and unique information.
Can duplicate content be penalised by Google?
Google does not recommend using duplicate content, but the practice does not carry a direct penalty in itself.
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin,” Google says in its latest Webmaster Guidelines.
It appears, then, that duplicate content is common across the web and that in the majority of cases, it is not used to gain an advantage.
However, the search giant stresses that the rankings of sites with this content “may suffer” if it finds an attempt to deceive users or manipulate rankings, because it will adjust how those sites are indexed and ranked accordingly.
While duplicate content may not always be malicious, there is rarely a time when it is actively helping your cause, so you should always be aiming to avoid it.
How can duplicate content impact search?
Duplicate content can lead to the wrong page being surfaced in search results, indexing problems, and a drop-off in key metrics such as traffic or rankings, all of which can actively harm your SEO efforts.
What can I do to avoid it?
Google favours unique content that provides value to the reader, so crafting engaging copy, either with the help of an agency or an internal creative team, should be a top priority and your primary guard against duplicate content.
Another simple best practice is never to copy content from other pages verbatim, as this will quickly raise a duplicate content red flag.
However, the body of content – the written part in a blog, article or product page – is unlikely to cause any problems if you are careful and meticulous with how you create and publish content.
Bigger problems can manifest when, for example, you template content, restructure your site or use UTM tags to support search analytics. These practices can lead to duplication even if you did not intend it: each UTM-tagged link, for instance, is a distinct URL that serves the same page.
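To illustrate how UTM tags multiply URLs for a single page, here is a minimal Python sketch that strips them before URLs are compared. The function name `strip_utm_params` and the example.com addresses are made up for illustration, and the rule of dropping every parameter whose name starts with `utm_` is an assumption rather than anything Google prescribes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_utm_params(url: str) -> str:
    """Return the URL without query parameters whose names start with 'utm_'."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.lower().startswith("utm_")
    ]
    # Rebuild the URL with only the non-UTM parameters left in the query string.
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs that all serve the same page.
tagged = [
    "https://example.com/shoes?utm_source=newsletter&utm_medium=email",
    "https://example.com/shoes?utm_campaign=spring",
    "https://example.com/shoes",
]
unique_pages = {strip_utm_params(u) for u in tagged}
print(unique_pages)  # {'https://example.com/shoes'}
```

A check like this is useful in an audit script or log analysis, where three tagged addresses should count as one page rather than three.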
301s and consistent internal linking
Google has outlined its own recommendations to avoid duplicate content. It advises webmasters to make use of 301 redirects after restructuring a site to “smartly” point Googlebot (Google’s web crawler) and users to the right content.
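As a rough sketch of the idea, a restructure can be captured as a map from retired paths to their replacements. The paths, the `REDIRECTS` table and the `resolve_redirect` helper below are all hypothetical, and how the 301 response is actually sent depends on your web server or framework, which is not covered here.

```python
from http import HTTPStatus

# Hypothetical mapping of retired paths to their replacements after a restructure.
REDIRECTS = {
    "/blog/old-post": "/articles/old-post",
    "/articles/old-post": "/insights/old-post",
    "/products.html": "/shop/",
}


def resolve_redirect(path, redirects=REDIRECTS):
    """Return (301, final_target) for a retired path, or None if the path is current.

    Chains in the map are followed so a visitor reaches the final URL in one hop.
    """
    seen = set()
    target = path
    while target in redirects:
        if target in seen:
            raise ValueError(f"redirect loop involving {target!r}")
        seen.add(target)
        target = redirects[target]
    if target == path:
        return None
    return (HTTPStatus.MOVED_PERMANENTLY.value, target)


print(resolve_redirect("/blog/old-post"))  # (301, '/insights/old-post')
print(resolve_redirect("/shop/"))          # None
```

Resolving chains up front is a design choice in this sketch: it means a page that has moved twice still answers with a single permanent redirect.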
It also recommends being “consistent” with internal linking. This means always using exactly the same URL when linking to a given page, such as your ‘About’ or ‘Home’ page. Internal linking is also a legitimate and worthwhile strategy in its own right, and one that can support E-A-T (short for expertise, authoritativeness, trustworthiness).
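One way to spot inconsistent internal links is to group the link targets found on your site by a simplified key and flag any page that is linked in more than one form. The sketch below is only an illustration: `link_key` and `find_inconsistent_links` are hypothetical helpers, and the rules they apply (ignoring the scheme, a leading "www.", a trailing slash and a trailing index.html) are assumptions that may not match how your site is set up.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def link_key(href: str) -> str:
    """Collapse common variants of one page into a single comparison key."""
    parts = urlsplit(href)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    if path != "/":
        path = path.rstrip("/")
    return host + path


def find_inconsistent_links(hrefs):
    """Return {key: sorted variants} for pages linked in more than one form."""
    variants = defaultdict(set)
    for href in hrefs:
        variants[link_key(href)].add(href)
    return {key: sorted(found) for key, found in variants.items() if len(found) > 1}


# Hypothetical link targets collected from a crawl of your own pages.
hrefs = [
    "https://www.example.com/about",
    "https://example.com/about/",
    "https://www.example.com/about/index.html",
    "https://www.example.com/contact",
]
print(find_inconsistent_links(hrefs))
```

Here the ‘About’ page is reported with three variants, while the contact page, which is always linked the same way, is not reported at all.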
Canonical tags and careful syndicating
Another way to combat duplication is to use canonical tags on pages where appropriate. The rel=canonical link element tells Google which URL you consider the primary version of a page, which is very useful if the same content appears on web and print pages, or on mobile and desktop versions.
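For reference, a canonical tag is a `<link rel="canonical" href="...">` element in the page's head. The Python sketch below checks which canonical URL a page declares, using only the standard library; the `CanonicalFinder` class, the `find_canonicals` helper and the sample print page are invented for this example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, so check each one.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical printer-only page that points back to the main product page.
print_page = """<html><head>
<title>Blue widget (print version)</title>
<link rel="canonical" href="https://example.com/widgets/blue">
</head><body>...</body></html>"""

print(find_canonicals(print_page))  # ['https://example.com/widgets/blue']
```

An empty list means the page declares no canonical, and more than one entry suggests conflicting tags, both of which are worth flagging in an audit.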
Google’s John Mueller notes: “It’s quite common for a website to have multiple unique URLs that lead to the same content … ideally we wouldn’t even run across any of these alternate versions which is why we recommend picking one URL format and using it consistently across your website.”
Duplicate content can also be an issue if content syndication is part of your marketing strategy. Syndication is when content is republished verbatim on different websites, a process that can help to amplify the reach of your original work.
Google urges marketers to “syndicate carefully” by making sure that each site that republishes your content includes a link back to the original article. Google also says that it “always” surfaces the version that it thinks is best suited to a user’s search query, even if this is not your preferred version.
General best practices
Google also recommends using top-level domains for country-specific content, minimising boilerplate repetition on pages and avoiding placeholder pages that lack any “real content”. It is also a good idea to keep an eye out for any duplicate URLs that may pop up on your website due to the way your site is structured.
As you can see, creating unique value-added content is just one part of the puzzle when attempting to combat duplicate content. This is why it is crucial to conduct regular audits, address any issues when they arise, and implement best practices moving forward to mitigate the potential negative impact to your site and your SEO efforts.
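As a starting point for such an audit, the sketch below flags pairs of pages on your own site whose text is identical or nearly so. It compares three-word sequences (shingles) using Jaccard similarity, which is a common textbook technique and not a description of how Google detects duplicates. The 0.8 threshold, the function names and the sample pages are all assumptions chosen for illustration.

```python
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two sets have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(pages, threshold=0.8):
    """Return URL pairs whose body text similarity meets the threshold."""
    sets = {url: shingles(body) for url, body in pages.items()}
    return [
        (first, second)
        for first, second in combinations(sorted(sets), 2)
        if jaccard(sets[first], sets[second]) >= threshold
    ]


# Hypothetical page bodies keyed by URL path.
pages = {
    "/widgets/blue": "Our blue widget is durable, light and easy to clean.",
    "/widgets/blue?print=1": "Our blue widget is durable, light and easy to clean.",
    "/about": "We are a small team that has been making widgets since 2010.",
}
print(near_duplicates(pages))  # [('/widgets/blue', '/widgets/blue?print=1')]
```

Comparing every pair of pages works for a small site. A larger one would need a more scalable approach, which is beyond this sketch.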