Use Varnish to prevent hot linking or image leeching

Written by
Date: 2011-07-24 10:36:30 00:00


Introduction

Hot Linking

Wikipedia describes it as follows:

The technology behind the World Wide Web, the Hypertext Transfer Protocol (HTTP), does not make any distinction of types of links—all links are functionally equal. Resources may be located on any server at any location. When a web site is visited, the browser first downloads the textual content in the form of an HTML document. The downloaded HTML document may call for other HTML files, images, scripts and/or stylesheet files to be processed. These files may contain tags which supply the URLs which allow images to display on the page. The HTML code generally does not specify a server, meaning that the web browser should use the same server as the parent code (<img src="picture.jpg" />). It also permits absolute URLs that refer to images hosted on other servers (<img src="http://www.example.com/picture.jpg" />).

This is fine if you own the server where the objects are stored. If you do not, you are using someone else's bandwidth.

And if somebody is using your bandwidth, you will not like it, and something needs to be done about it.

We'll now see how to stop others from hot linking your content.

Block access to your media files from unauthorized sites

We'll use Varnish to block hot linking or image leeching.

This way, we block the bandwidth thief before the request ever reaches our web server.

Here is the code you can add to your Varnish VCL file:

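sub hot_link {
        # Reject the request when all three conditions hold:
        #   1. it targets one of our virtual hosts,
        #   2. it asks for a protected file extension,
        #   3. it carries a Referer that is not one of our own sites.
        if (
                (req.http.host == "www.site1.com"
                ||
                req.http.host == "www.site2.com"
                ||
                req.http.host == "www.site3.com")

                &&
                req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$"
                &&
                (req.http.referer
                && (req.http.referer !~ "^http://www\.site1\.com/" && req.http.referer !~ "^http://www\.site2\.com/" && req.http.referer !~ "^http://www\.site3\.com/"))
                )
                {
                        error 403 "No hot linking please";
                }
}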

What is this code going to do?

  • First, we list the hosts served as virtual hosts behind Varnish. I define three here; add or remove hosts as needed.
  • Next, we define the file extensions that should not be served unless one of our own sites requests them.
  • Then we define the referrers that are authorized to request our content and load it in-page. Requests that carry no Referer header at all are let through, because the rule only fires when req.http.referer is set.
  • Finally, we define what happens to requests that do not comply: in this case they receive a 403 "Forbidden" error with the message "No hot linking please".
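The three conditions described above can also be written more compactly. The following is a minimal sketch in the same VCL dialect as the routine above, not a drop-in replacement: the character class site[123] only works because the placeholder host names differ by one digit, so real host names would need an alternation such as (example|other). The optional query string on the URL and the https? prefix on the referrer are assumptions added here and are not part of the original rule.

```vcl
sub hot_link {
    # 1. Request is for one of our virtual hosts (placeholder names).
    # 2. URL ends in a protected extension, optionally followed by a
    #    query string (assumption: the original rule ignores query strings).
    # 3. A Referer is present and does not start with one of our sites,
    #    over http or https (assumption: the original checks http only).
    if (req.http.host ~ "^www\.site[123]\.com$"
        && req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)(\?.*)?$"
        && req.http.referer
        && req.http.referer !~ "^https?://www\.site[123]\.com/") {
        error 403 "No hot linking please";
    }
}
```

Folding the hosts into one pattern keeps the host list and the referrer list next to each other, which makes it harder to update one and forget the other.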

You can change this last step by sending a different error or, for example, by redirecting those requests to your home page.

Once we have this subroutine, we can call it from vcl_recv.

Example of vcl_recv section

sub vcl_recv {
    call hot_link;

    # ...
    # The rest of your rules come here
    # ...
}
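The error statement used in hot_link belongs to the VCL of Varnish 2.x and 3.x. Varnish 4.0 replaced it with return (synth(...)), so on a newer installation the rule would look roughly like the sketch below. The backend address and port are assumptions for illustration only, and the host pattern uses the same placeholder site names as above.

```vcl
vcl 4.0;

# Assumed backend, shown only so the file is complete; use your own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub hot_link {
    if (req.http.host ~ "^www\.site[123]\.com$"
        && req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$"
        && req.http.referer
        && req.http.referer !~ "^http://www\.site[123]\.com/") {
        # Varnish 4.0+ equivalent of: error 403 "No hot linking please";
        return (synth(403, "No hot linking please"));
    }
}

sub vcl_recv {
    call hot_link;
}
```

The logic is unchanged; only the way the 403 response is generated differs between the two VCL versions.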

Redirecting to your home page

If you want to redirect to a special page instead of returning a 403, use error 302 in hot_link and then create a vcl_error section like this:

sub vcl_error {
    # Requests rejected with "error 302" get a Location header
    # pointing at the landing page.
    if (obj.status == 302) {
        set obj.http.Location = "http://site.com/hot-linking.html";
        #set obj.status = 302;
        #return(deliver);
    }
}
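One possible refinement, sketched here under assumptions rather than taken from the setup above: raise a private status code in hot_link and translate it to 302 inside vcl_error, so that any other rule in your VCL that raises error 302 keeps its own behavior. The value 750 is an arbitrary, hypothetical choice, and the hot_link shown is shortened to a single host and a few extensions for readability. The syntax is the Varnish 3.x style used in the rest of this article.

```vcl
sub hot_link {
    if (req.http.host == "www.site1.com"
        && req.url ~ "\.(jpg|jpeg|png|gif)$"
        && req.http.referer
        && req.http.referer !~ "^http://www\.site1\.com/") {
        # 750 is a private marker (assumption), never sent to the client.
        error 750 "Found";
    }
}

sub vcl_error {
    if (obj.status == 750) {
        # Turn the private marker into a real redirect.
        set obj.http.Location = "http://site.com/hot-linking.html";
        set obj.status = 302;
        return (deliver);
    }
}
```

Returning deliver here skips the default error page body, since the client only needs the Location header.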

Conclusion

Bandwidth is not cheap, so protect it. There will always be people who do not play fair, and you need to deal with them. This is just one way of doing so; there are others, many of them more complex and with more options, but this is a good place to start.