Use Amazon CloudFront Origin Pull - Custom Origin - and Avoid Duplicate Content

Date: 2012-06-22 11:35:00 00:00


A fast site helps your SEO. Google and Yahoo both recommend improving your site speed as much as possible.

One way to improve the speed of a site is to use a CDN. Traditionally, CDNs served only static assets, not the whole site, which is usually dynamic.

Now, with Amazon CloudFront, you can use Origin Pull, or Custom Origin as it is known at Amazon, to deliver dynamically generated content via the CDN.

The setup is really easy and well documented; just google for "how to use cloudfront custom origin".

So, say your site is sitting at http://origin.yoursite.com and you want your visitors to reach you at http://www.yoursite.com.

You need to configure your server (Nginx in my case) to respond to origin.yoursite.com, and configure Amazon CloudFront to use that name as the origin. Then create a CNAME for www.yoursite.com pointing to the domain name of your CloudFront distribution.
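
On the nginx side, the origin host might look roughly like this. It is only a minimal sketch: the document root is a placeholder I made up, and your real server block will carry whatever directives your application needs.

    server {
        listen 80;
        # The name CloudFront will use as its custom origin
        server_name origin.yoursite.com;

        # Hypothetical document root, adjust to your setup
        root /var/www/yoursite;
    }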

That is it, you are almost done. One problem remains: the same content can now be accessed via three different URLs:

  • http://www.yoursite.com
  • http://origin.yoursite.com
  • http://some.url.cloudfront.net

Now let's get rid of the second one. In the nginx configuration file, add this block to the server section:

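    # CloudFront identifies itself with the User-Agent "Amazon CloudFront";
    # every other client is redirected to the canonical host.
    if ($http_user_agent !~ "Amazon CloudFront") {
            rewrite ^/(.*) http://www.yoursite.com/$1 permanent;
    }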

Now, if someone tries to access your content via http://origin.yoursite.com and the request does not come from CloudFront, they will be redirected to the real (canonical) URL of your content. Easy, right?
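
If you would rather avoid the regular expression capture, the same redirect can probably be written with return and $request_uri, which is the form usually suggested for plain redirects in nginx. Here is a sketch of the whole server section, where the root path is again a hypothetical placeholder:

    server {
        listen 80;
        server_name origin.yoursite.com;
        root /var/www/yoursite;   # hypothetical path

        # Anything that is not CloudFront gets sent to the canonical host
        if ($http_user_agent !~ "Amazon CloudFront") {
            return 301 http://www.yoursite.com$request_uri;
        }
    }

$request_uri holds the original path and query string, so nothing needs to be captured.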

You should be able to do the same thing using Apache and .htaccess.