How to Prevent Duplicate Content Penalties

Websites with a lot of duplicate content get penalized by search engines. After all, their goal is to find unique content, not content that is the same throughout the internet, or even the same throughout your website. Many webmasters will unknowingly have a lot of duplicate content on their website, often because of simple technical mistakes. Here is how to avoid duplicate content.

Where Does Duplicate Content Come From?

Perhaps the most common source of duplicate content is archive-type pages. For example, if you run a blog, the main content lives in your posts. But you also have category pages and archive pages, and each category page displays a 200-word snippet from every article as a teaser for viewers.

Unfortunately, that means a single category page can carry as much as 1,500 words of duplicate content, copied straight from your articles. The same is true of your archive pages, and any given blog may have dozens of category and archive pages. Less common sources of duplicate content include unscrupulous outsourcers who copy and paste other people’s work, and webmasters who republish other people’s articles with the original author’s permission (legal, but not good from a search engine optimization [SEO] perspective). So how do you prevent duplicate content?

Learn to Use Your Meta Robots or Robots.txt

The meta robots tag lets you specify, on a per-page basis, whether you want that page indexed. If a page contains duplicate content, just tell the search engines not to index it. Robots.txt resides in your site’s top-level directory and tells search engines which parts of the site they can and cannot crawl. In short, the meta tag works on a single page, while robots.txt directs crawlers across the whole site. Learn to use both if you often need to block search engines from certain pages. This is especially important if you are building your site by hand in HTML or PHP rather than using WordPress.
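As a minimal sketch of the two mechanisms, here is what each looks like in practice. The directory names below (`/category/`, `/archive/`) are placeholders; substitute the paths your own site actually uses.

```html
<!-- Per-page control: place this inside the <head> of any page
     you want kept out of the search index. "follow" still lets
     crawlers follow the links on the page. -->
<meta name="robots" content="noindex, follow">
```

```
# Site-wide control: robots.txt at the site root,
# e.g. https://example.com/robots.txt
# The paths below are examples only.
User-agent: *
Disallow: /category/
Disallow: /archive/
```

Note that robots.txt blocks crawling rather than indexing, so for pages that are already indexed the meta tag is the more reliable of the two.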

Using WordPress Plug-Ins

If you are using WordPress, many duplicate content issues can be remedied through plug-ins. Most of the SEO plug-ins will handle the major duplicate content issues for you. Make sure you get one that allows you to “noindex” your category and archive pages, as well as automatically add canonical link tags to your pages.
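For context, a canonical link tag is a one-line hint in a page’s `<head>` that tells search engines which URL is the authoritative version of that content, so snippet and archive copies are attributed back to the original post. The URL below is a hypothetical example of what such a plug-in would emit:

```html
<!-- Points duplicate or teaser versions of a post back to the
     original; example.com/my-post/ is a placeholder URL. -->
<link rel="canonical" href="https://example.com/my-post/">
```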

Scan All Content with Copyscape

If you regularly outsource content, get in the habit of scanning your content with Copyscape. Copyscape will scan the internet for copies of your content and report back if it finds any matches, along with how close each match is. Though you will seldom run into an issue with outsourcers copying content outright, if it does happen it could ruin your entire website; therefore, it doesn’t hurt to be careful.

Duplicate content is preventable. It doesn’t take a lot of effort to noindex the right pages and scan your content with Copyscape. Get in the habit of performing these two procedures and you will protect your rankings in the long run.

Author: jm

Joan Mullally has been doing business online for more than 20 years and is a pioneer in the fields of online publishing, marketing, and ecommerce. She is the author of more than 200 guides and courses designed to help beginner and intermediate marketers make the most of the opportunities the Internet offers for running a successful business. A student and later teacher trainee of Frank McCourt’s, she has always appreciated the power of the word, and has used her knowledge for successful SEO and PPC campaigns, and powerful marketing copy. One computer science class at NYU was enough to spark her fascination with all things digital. In her spare time, she works with adult literacy, animal fostering and rescue, and teaching computer skills to women.