<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Matt Cutts Publishing Duplicate Content on His WordPress Blog</title>
	<atom:link href="http://whereelsetoputit.com/blog/2007/matt-cutts-publishing-duplicate-content/feed/" rel="self" type="application/rss+xml" />
	<link>http://whereelsetoputit.com/blog/2007/matt-cutts-publishing-duplicate-content/</link>
	<description>Greg Mulhauser&#039;s random musings on life that don&#039;t seem to fit in anywhere else.</description>
	<lastBuildDate>Thu, 04 Mar 2010 09:26:06 +0000</lastBuildDate>
	
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: mark</title>
		<link>http://whereelsetoputit.com/blog/2007/matt-cutts-publishing-duplicate-content/#comment-356</link>
		<dc:creator>mark</dc:creator>
		<pubDate>Mon, 08 Oct 2007 20:59:38 +0000</pubDate>
		<guid isPermaLink="false">http://whereelsetoputit.com/blog/matt-cutts-publishing-duplicate-content/#comment-356</guid>
		<description>My blog has 1) archives and 2) categories, in which one post might be under multiple categories.  I got the newest All in One SEO plugin, and it allows you to exclude archives, categories or others.  However, my archives are the pages with the best PR.  If I exclude these then I lose a lot.  Further, I post on my blog every day, so I think the pages will look different under categories and archives, correct?  What is the best thing to apply nofollow to, if anything?  Thanks, Mark</description>
		<content:encoded><![CDATA[<p>My blog has 1) archives and 2) categories, in which one post might be under multiple categories.  I got the newest All in One SEO plugin, and it allows you to exclude archives, categories or others.  However, my archives are the pages with the best PR.  If I exclude these then I lose a lot.  Further, I post on my blog every day, so I think the pages will look different under categories and archives, correct?  What is the best thing to apply nofollow to, if anything?  Thanks, Mark</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg</title>
		<link>http://whereelsetoputit.com/blog/2007/matt-cutts-publishing-duplicate-content/#comment-23</link>
		<dc:creator>Greg</dc:creator>
		<pubDate>Tue, 07 Aug 2007 14:08:45 +0000</pubDate>
		<guid isPermaLink="false">http://whereelsetoputit.com/blog/matt-cutts-publishing-duplicate-content/#comment-23</guid>
		<description>Howdy Ozh, and thanks for stopping by!

Yes, there are definitely lots of other ways to create duplicate routes to the same content. However, there are at least two important (IMHO!) things to keep in mind:

&lt;ol&gt;
	&lt;li&gt;Appending parameters to the end of a URL is categorically different from appending what look like &lt;strong&gt;directories&lt;/strong&gt; to the end of a URL. It is a no-brainer to identify parameters appended to a URL (anything that starts with &quot;?&quot;), and many search engines have done this for a long time -- e.g., automatically ignoring PHP session IDs.&lt;/li&gt;
&lt;li&gt;When we&#039;re talking about &lt;strong&gt;directories&lt;/strong&gt;, it&#039;s pretty hard to refute the empirical evidence that Google &lt;strong&gt;does&lt;/strong&gt; put this type of &#039;duplicate&#039; content in their supplemental index.&lt;/li&gt;
&lt;/ol&gt;

As an example of the latter, I can tell with a quick check at Google right now that planetozh.com has about 7,230 pages in the Google supplemental index. (Given their recent announcement to drop the supplemental label, and to drop support for the most common types of search queries designed to return supplemental pages, there&#039;s no telling how much longer this information might remain available.) If you look through them, you&#039;ll see that many of your 7 thousand pages of supplemental content are in a /bookmark/ or /tags/ directory, while others are in date-based archives containing text that is also available via individual article permalinks.

In other words, while I&#039;m not too worried about random people appending random parameters to URLs, I &lt;em&gt;am&lt;/em&gt; worried about the external world seeing a whole bunch of structurally different URLs, apparently referencing different directories, all containing the same content. That is what is happening with your blog right now, and unfortunately the &quot;smart people at Google&quot; emphatically &lt;strong&gt;do&lt;/strong&gt; consider this a duplicate content issue.

(By the way, you&#039;ll see lots of supplemental pages for this site, too, as a result of a change in my permalink structure a while back; fortunately, all the older pages which Google thinks are the definitive versions are 301-ed to the new versions.)

All the best,
Greg</description>
		<content:encoded><![CDATA[<p>Howdy Ozh, and thanks for stopping by!</p>
<p>Yes, there are definitely lots of other ways to create duplicate routes to the same content. However, there are at least two important (IMHO!) things to keep in mind:</p>
<ol>
<li>Appending parameters to the end of a URL is categorically different from appending what look like <strong>directories</strong> to the end of a URL. It is a no-brainer to identify parameters appended to a URL (anything that starts with &#8220;?&#8221;), and many search engines have done this for a long time &#8212; e.g., automatically ignoring PHP session IDs.</li>
<li>When we&#8217;re talking about <strong>directories</strong>, it&#8217;s pretty hard to refute the empirical evidence that Google <strong>does</strong> put this type of &#8216;duplicate&#8217; content in their supplemental index.</li>
</ol>
<p>As an example of the latter, I can tell with a quick check at Google right now that planetozh.com has about 7,230 pages in the Google supplemental index. (Given their recent announcement to drop the supplemental label, and to drop support for the most common types of search queries designed to return supplemental pages, there&#8217;s no telling how much longer this information might remain available.) If you look through them, you&#8217;ll see that many of your 7 thousand pages of supplemental content are in a /bookmark/ or /tags/ directory, while others are in date-based archives containing text that is also available via individual article permalinks.</p>
<p>In other words, while I&#8217;m not too worried about random people appending random parameters to URLs, I <em>am</em> worried about the external world seeing a whole bunch of structurally different URLs, apparently referencing different directories, all containing the same content. That is what is happening with your blog right now, and unfortunately the &#8220;smart people at Google&#8221; emphatically <strong>do</strong> consider this a duplicate content issue.</p>
<p>(By the way, you&#8217;ll see lots of supplemental pages for this site, too, as a result of a change in my permalink structure a while back; fortunately, all the older pages which Google thinks are the definitive versions are 301-ed to the new versions.)</p>
<p>All the best,<br />
Greg</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ozh</title>
		<link>http://whereelsetoputit.com/blog/2007/matt-cutts-publishing-duplicate-content/#comment-22</link>
		<dc:creator>Ozh</dc:creator>
		<pubDate>Tue, 07 Aug 2007 12:41:16 +0000</pubDate>
		<guid isPermaLink="false">http://whereelsetoputit.com/blog/matt-cutts-publishing-duplicate-content/#comment-22</guid>
		<description>This kind of behavior just affects the whole internet. Like,
http://whereelsetoputit.com/blog/
http://whereelsetoputit.com/blog/?hello=1
http://whereelsetoputit.com/blog/?hello=2&amp;world=314159
etc...

I would be &lt;b&gt;really surprised&lt;/b&gt; if the smart people at Google would consider this a duplicate content issue.</description>
		<content:encoded><![CDATA[<p>This kind of behavior just affects the whole internet. Like,<br />
<a href="http://whereelsetoputit.com/blog/" rel="nofollow">http://whereelsetoputit.com/blog/</a><br />
<a href="http://whereelsetoputit.com/blog/?hello=1" rel="nofollow">http://whereelsetoputit.com/blog/?hello=1</a><br />
<a href="http://whereelsetoputit.com/blog/?hello=2&amp;world=314159" rel="nofollow">http://whereelsetoputit.com/blog/?hello=2&amp;world=314159</a><br />
etc&#8230;</p>
<p>I would be <b>really surprised</b> if the smart people at Google would consider this a duplicate content issue.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
