Matt Cutts Publishing Duplicate Content on His WordPress Blog
By Greg | August 6, 2007
(Or: SEO Experts Still Getting It Wrong on WordPress Duplicate Content)
Following my posts last week about the latest duplicate content vulnerability in the WordPress blogging platform, it didn’t take long for someone to point out that Matt Cutts is now officially publishing duplicate content on his blog, and so are plenty of experts in the SEO community (including many who have written extensively about curing WordPress of duplicate content issues).
If you’ve read my posts from last week on the latest duplicate content bug to hit WordPress and Movable Type blogs (“Yet Another Duplicate Content Vulnerability Hits WordPress, Movable Type Blogs (Part 1)”), you’ll know that pretty much everyone using these two popular blogging platforms is at risk of unwittingly publishing duplicate content.
Since Matt Cutts over at Google is one of the best-known critics of publishers who provide the same content on multiple URLs, and since the phrase ‘duplicate content’ pops up pretty frequently on Matt’s blog, you have to admit it’s kind of funny to find that you can still visit one of Matt’s recent posts at any of these three URLs:
- http://www.mattcutts.com/blog/webmaster-console-features/
- http://www.mattcutts.com/blog/webmaster-console-features/2001/
- http://www.mattcutts.com/blog/webmaster-console-features/314159265359/
Of course, the problem isn’t limited just to three copies of the same post: there are potentially infinitely many copies. Look out, Google! Better penalize Matt! And Matt, you really should have some machine-readable indicator to flag which of your content is duplicated — because, I mean, you can’t expect little old Google to figure out this kind of thing on its own, can you?
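To make the pattern concrete, here is a minimal Python sketch of the URL variants listed above, plus one possible way a publisher might map them back to a single address. The function names `variant_urls` and `strip_numeric_suffix` are hypothetical, and the normalization rule (drop one trailing all-digit path segment) is my own assumption rather than anything WordPress or Movable Type actually does. It would only be safe for post permalinks that end in a slug, not for date archives such as /2007/08/.

```python
from urllib.parse import urlsplit, urlunsplit


def variant_urls(permalink, suffixes=("2001", "314159265359")):
    """Return the permalink followed by copies with an extra numeric segment.

    The default suffixes are simply the two numbers used in the examples
    in this post; any other run of digits could be substituted.
    """
    parts = urlsplit(permalink)
    # Make sure the base path ends with a slash before appending a segment.
    base_path = parts.path if parts.path.endswith("/") else parts.path + "/"
    urls = [urlunsplit(parts._replace(path=base_path))]
    for suffix in suffixes:
        urls.append(urlunsplit(parts._replace(path=f"{base_path}{suffix}/")))
    return urls


def strip_numeric_suffix(url):
    """Drop one trailing all-digit path segment, if there is one.

    Assumption: the URL is a post permalink ending in a non-numeric slug,
    so a trailing all-digit segment is never part of the real address.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) > 1 and segments[-1].isdigit():
        segments = segments[:-1]
    return urlunsplit(parts._replace(path="/" + "/".join(segments) + "/"))


if __name__ == "__main__":
    post = "http://www.mattcutts.com/blog/webmaster-console-features/"
    for url in variant_urls(post):
        print(url, "->", strip_numeric_suffix(url))
```

A site that wanted only one address per post could, under the same assumption, answer any request whose normalized form differs from the requested URL with a 301 redirect to the normalized form.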
And some of the best-known commentators on how to ‘fix’ the previous WordPress vulnerabilities (which, as I pointed out in the earlier article, were not WordPress vulnerabilities at all, but mainly reflected poor design choices on the part of theme authors and poor permalink choices on the part of blog publishers) are also afflicted. Here are a few popular blog posts about fixing WordPress duplicate content ‘issues’, all of which are affected by the current bug as of this writing:
- “WordPress, Duplicate Content, and Wrong SEO Plugins”:
- “Surprise. Your Wordpress Site Isn’t Search Engine Friendly”:
- “How to Make a WordPress Blog Duplicate Content Safe” (this one only duplicates the comments section of each post):
- “Wordpress Duplicate Content Issues & Solutions”:
- “Critical SEO Tip for Wordpress”:
- “Fighting Duplicate Content On Wordpress”:
And even the blackhat SEO experts are still running vulnerable blogs — ironically enough, including Greywolf, who last year posted an article called “How to Create Duplicate Content on Someone Else’s Wordpress Blog”:
- http://www.wolf-howl.com/seo/duplicate-content-wordpress-blog/
- http://www.wolf-howl.com/seo/duplicate-content-wordpress-blog/2001/
- http://www.wolf-howl.com/seo/duplicate-content-wordpress-blog/314159265359/
Oops, sorry Greywolf — looks like that would be how to create duplicate content on someone else’s blog…
Even all-around good guy Aaron Wall, whom I mentioned in the article last week, hasn’t mentioned this problem in his recent article “Customizing Blog Page Titles & Fixing Common Blogger Template SEO Errors”:
- http://www.seobook.com/archives/002380.shtml
- http://www.seobook.com/archives/002380.shtml/2001/
- http://www.seobook.com/archives/002380.shtml/314159265359/
And the popular blogger on all things blackish and hattish, Quadszilla, is still showing the vulnerability on his recent post on duplicate content problems “Do we call it Googlewashing?”:
- http://seoblackhat.com/2005/10/04/do-we-call-it-googlewashing/
- http://seoblackhat.com/2005/10/04/do-we-call-it-googlewashing/2001/
- http://seoblackhat.com/2005/10/04/do-we-call-it-googlewashing/314159265359/
So, that’s where we are right now…so much duplicate content, so little time…

August 7th, 2007 at 12:41 pm
This kind of behavior just affects the whole internet. Like,
http://whereelsetoputit.com/blog/
http://whereelsetoputit.com/blog/?hello=1
http://whereelsetoputit.com/blog/?hello=2&world=314159
etc…
I would be really surprised if the smart people at Google would consider this a duplicate content issue.
August 7th, 2007 at 2:08 pm
Howdy Ozh, and thanks for stopping by!
Yes, there are definitely lots of other ways to reach the same content through different URLs. However, there are at least two important (IMHO!) things to keep in mind:
As an example of the latter, I can tell with a quick check at Google right now that planetozh.com has about 7,230 pages in the Google supplemental index. (Given their recent announcement to drop the supplemental label, and to drop support for the most common types of search queries designed to return supplemental pages, there’s no telling how much longer this information might remain available.) If you look through them, you’ll see that many of your 7 thousand pages of supplemental content are in a /bookmark/ or /tags/ directory, while others are in date-based archives containing text that is also available via individual article permalinks.
In other words, while I’m not too worried about random people appending random parameters to URLs, I am worried about the external world seeing a whole bunch of structurally different URLs, apparently referencing different directories, all containing the same content. That is what is happening with your blog right now, and unfortunately the “smart people at Google” emphatically do consider this a duplicate content issue.
(By the way, you’ll see lots of supplemental pages for this site, too, as a result of a change in my permalink structure a while back; fortunately, all the older pages which Google thinks are the definitive versions are 301-redirected to the new versions.)
All the best,
Greg
October 8th, 2007 at 8:59 pm
My blog has 1) archives and 2) categories, in which one post might be filed under multiple categories. I got the newest All in One SEO plugin, and it allows you to exclude archives, categories, or others. However, my archives are the pages with the best PR. If I exclude these, then I lose a lot. Further, I post on my blog every day, so I think the pages will look different under categories and archives, correct? What is the best thing to apply nofollow to, if anything? Thanks, Mark