How to Fix WordPress Duplicate Content Issues

WordPress generates duplicate content from tag archives, paginated pages, and URL variations. Here is how to identify each source and fix it.

Dobromir Dechev, WordPress agency owner

Quick answer

WordPress duplicate content is created by tag archives, category pagination, author pages, the ?replytocom parameter, and URL variations (www vs non-www, HTTP vs HTTPS) — fix each source by setting canonical URLs with an SEO plugin like Rank Math and noindexing low-value archive pages.

Duplicate content does not cause penalties in the way many SEO guides imply, but it does dilute PageRank across multiple URLs, confuses Google about which version to rank, and wastes crawl budget on pages that add no value.

WordPress generates duplicate content from several sources - some are native WordPress behaviour, some are created by plugins. Here is how to identify and resolve each one.


Identify your duplicate content

Before fixing anything, crawl the site and find all the URLs being indexed. Use one of:

  • Screaming Frog (desktop app, free up to 500 URLs)
  • Google Search Console > Indexing > Pages report, formerly Coverage (shows which URLs are indexed and why others are excluded)
  • Sitebulb

Look for:

  • Multiple URLs resolving to identical or very similar content
  • Large numbers of low-content or thin archive pages
  • URLs with the same content but different query parameters

Issue 1 - WWW vs non-WWW

Both https://yourdomain.com and https://www.yourdomain.com serving the same content is one of the most basic duplicate content problems.

Fix: Ensure only one version is accessible and that the other 301-redirects to it. Pin the preferred hostname in wp-config.php:

define( 'WP_HOME',    'https://yourdomain.com' );   // non-www
define( 'WP_SITEURL', 'https://yourdomain.com' );

Set up a 301 redirect for the non-canonical version in Nginx:

# Redirect www to non-www
server {
    listen 443 ssl;
    server_name www.yourdomain.com;
    return 301 https://yourdomain.com$request_uri;
}

Or to prefer www:

server {
    listen 443 ssl;
    server_name yourdomain.com;
    return 301 https://www.yourdomain.com$request_uri;
}

Issue 2 - HTTP vs HTTPS

If both http://yourdomain.com and https://yourdomain.com are accessible, they create duplicate content.

Fix: In Nginx, redirect all HTTP to HTTPS:

server {
    listen 80;
    server_name yourdomain.com www.yourdomain.com;
    return 301 https://yourdomain.com$request_uri;
}

Issue 3 - Trailing slash inconsistency

/blog/post-title/ and /blog/post-title (without trailing slash) should not both be accessible.

WordPress uses trailing slashes by default. Verify your caching plugin and server configuration redirect the non-trailing-slash version:

In Nginx:

# WordPress handles this via rewrite rules, but add explicit redirect for safety
rewrite ^([^.]*[^/])$ $1/ permanent;

Most WordPress setups handle this correctly via the permalink rewrite rules. Check with Screaming Frog if both versions appear in your crawl.


Issue 4 - Tag archives duplicating category content

WordPress tag archives often contain the same posts as category archives, sometimes with identical content and only different URLs:

  • /category/guides/ - posts in the Guides category
  • /tag/guides/ - posts tagged "guides"

If tag archives are used casually (posts tagged with the same terms as their categories), they produce thin near-duplicate content.

Fix option A - Noindex tag archives:

// Add to functions.php
add_action( 'wp_head', function() {
    if ( is_tag() ) {
        echo '<meta name="robots" content="noindex, follow">';
    }
});

Or in Yoast SEO: Settings > Search Appearance > Taxonomies > Tags > Show in search results: Off.

In Rank Math: Titles & Meta > Tags > Robots Meta: No Index.
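
An alternative to echoing the meta tag by hand is the wp_robots filter, available since WordPress 5.7, which merges your directives with core's own robots output. A sketch:

```php
// Add to functions.php. Requires WordPress 5.7+ (the wp_robots filter).
add_filter( 'wp_robots', function( $robots ) {
    if ( is_tag() ) {
        $robots['noindex'] = true; // keep tag archives out of the index
        $robots['follow']  = true; // but still let crawlers follow their links
    }
    return $robots;
});
```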

Fix option B - Use canonical tags:

In your SEO plugin, ensure each tag archive has a self-referencing canonical tag. If a tag archive is paginated, each paginated page should canonicalize to itself, not to page 1; Google tends to ignore canonicals that point away from distinct content.

Fix option C - Block via robots.txt:

Disallow: /tag/

This prevents Google from crawling tag archives entirely. Be aware that robots.txt blocks crawling, not indexing: a blocked URL can still appear in search results if other pages link to it, and Google cannot see a noindex directive on a page it is not allowed to fetch. Use this only if tags provide no navigation value.


Issue 5 - Paginated archives

/category/guides/ and /category/guides/page/2/ share the same title, meta description, and template, differing only in which posts they list, so they can look like near-duplicates to a crawler.

Fix: Ensure each paginated page has a self-referencing canonical tag pointing at its own /page/N/ URL, not at page 1. Most SEO plugins (Yoast, Rank Math) handle this automatically. Some plugins also output rel=prev/next links in the head:

<link rel="prev" href="https://yourdomain.com/category/guides/" />
<link rel="next" href="https://yourdomain.com/category/guides/page/3/" />

These links are harmless and other search engines may still read them, but Google confirmed in 2019 that it no longer uses rel=prev/next as an indexing signal, so do not rely on them alone; the self-referencing canonicals and distinct page titles do the work.

Alternatively, if category archives are short, increase the posts per page to show all posts on one URL, eliminating pagination entirely.
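
That one-URL approach does not require touching the theme. A sketch using the pre_get_posts action (this assumes you want every category archive unpaginated; -1 is fine for small archives, but avoid it on categories with hundreds of posts):

```php
// Add to functions.php. Removes pagination from category archives by
// loading all posts on one page. Avoid on very large categories.
add_action( 'pre_get_posts', function( $query ) {
    if ( ! is_admin() && $query->is_main_query() && $query->is_category() ) {
        $query->set( 'posts_per_page', -1 ); // -1 = show all posts
    }
});
```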


Issue 6 - Print-friendly page versions

Some older themes and plugins create print-friendly versions of pages (/post-title/?print=1 or /print/post-title/). These are exact duplicates.

Fix: If the print version is generated by a plugin, check its settings for a "noindex print pages" option. Otherwise add:

add_action( 'wp_head', function() {
    if ( isset( $_GET['print'] ) ) {
        echo '<meta name="robots" content="noindex, nofollow">';
    }
});

Or block in robots.txt:

Disallow: /*?print=

Issue 7 - Author archives for single-author sites

If your site has one author, the author archive (/author/yourname/) is a near-duplicate of the main blog archive.

Fix for single-author sites:

// Redirect author archives to the homepage (add to functions.php)
add_action( 'template_redirect', function() {
    if ( is_author() ) {
        wp_safe_redirect( home_url(), 301 );
        exit;
    }
});

Or noindex author archives via the SEO plugin.
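
If you would rather keep the author archive reachable but out of the index without a plugin, a sketch using the wp_robots filter (WordPress 5.7+):

```php
// Add to functions.php. Requires WordPress 5.7+.
add_filter( 'wp_robots', function( $robots ) {
    if ( is_author() ) {
        $robots['noindex'] = true; // drop author archives from the index
        $robots['follow']  = true; // still follow links to the posts
    }
    return $robots;
});
```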


Issue 8 - Feed URLs

WordPress generates RSS and Atom feeds for posts, categories, tags, and authors. These are technically duplicates of the archive pages in a different format.

Feeds are legitimate and should not be noindexed (they are how readers subscribe). However, if Google is spending crawl budget on obscure feed variants (/feed/atom/, /comments/feed/), you can restrict crawling of feed variants in robots.txt:

Disallow: /feed/atom/
Disallow: /comments/feed/
Disallow: /category/*/feed/
Disallow: /tag/*/feed/

Keep the main /feed/ accessible.
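
You can also stop WordPress from advertising the extra feed variants in the page head, which is how crawlers usually discover them. Core registers those link tags via the feed_links_extra callback at priority 3, so removing that action is a one-liner (the feeds themselves stay reachable). A sketch:

```php
// Add to functions.php. Removes <link> tags for category, tag, author,
// and comment feeds from the head; the main /feed/ link stays.
remove_action( 'wp_head', 'feed_links_extra', 3 ); // 3 = core's priority
```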


Verify fixes with canonical tags

After implementing fixes, verify that canonical tags are correct on all important pages. In your SEO plugin, the canonical should be the "true" URL - the one you want Google to index and rank.

Check with:

curl -s https://yourdomain.com/some-post/ | grep 'rel="canonical"'

The output should show the exact URL you want indexed, with the correct protocol, www preference, and trailing slash.


Frequently Asked Questions

Does WordPress automatically create duplicate content?
Yes. WordPress natively generates multiple URLs that display identical or near-identical content: the main post URL, tag archives, category archives, author archives, date archives, and paginated versions of each. Append ?replytocom=1 or ?s= to any URL and you get more variants. None of these triggers a penalty by itself, but they dilute PageRank and can confuse Google about which URL to rank.
How do I fix duplicate content in WordPress without an SEO plugin?
You can add a canonical link tag manually in your theme's header.php using wp_get_canonical_url(), and block archive pages with a Disallow in robots.txt. However, a dedicated SEO plugin like Rank Math or Yoast handles this far more reliably — they add correct canonical tags automatically, let you noindex thin archives with a checkbox, and handle edge cases like paginated posts that manual code often misses.
Should I noindex WordPress tag pages to fix duplicate content?
Usually yes, unless your tag pages have substantial unique content and meaningful search traffic. Most WordPress tag archives are thin — a few excerpts from posts that are fully indexed elsewhere. Setting them to noindex (robots meta noindex) removes them from the index, consolidates PageRank onto the canonical posts, and reduces crawl budget waste. Do this in Rank Math under Titles & Meta > Tags, or in Yoast under Search Appearance > Taxonomies.
Can www and non-www versions of a WordPress site cause duplicate content?
Yes. If both www.yourdomain.com and yourdomain.com are accessible and serving the same content, Google sees two versions of every page. Fix this by setting a consistent domain in WordPress under Settings > General (pick www or non-www and stick with it), adding a 301 redirect at the server level (Nginx config or Apache .htaccess) from the unused version to the preferred one, and verifying the preferred version in Google Search Console.
