Bandwidth theft, hotlinking, file leeching, bandwidth leeching,
external linking, remote linking, deep linking and direct linking
are all words and phrases used to describe a single problem faced
by many Webmasters. They describe the practice of building Web pages
that contain unauthorized content links to files hosted by another
site.
Notice that I said content links, not navigation links that
lead to another site. Content links are file references that the
browser fetches to render the page, such as images, style sheets, scripts
or even complete Web pages that are rendered within a frame. In
other words, they are embedded content or embedded objects within
an HTML page.
The result of hotlinking is that the offending site is able to
present its pages without paying for the bandwidth needed to serve
up the stolen content. The victim's site ends up paying the
bandwidth expense for serving up the files without gaining any page
views. Many Webmasters would not mind if an image were copied and
hosted by another site, especially if permission was sought in advance.
The objection in the case of hotlinking is paying bandwidth bills
for someone else's benefit.
There are two levels at which you can apply controls to prevent
hotlinking. One option is to control it at the Web server level.
In Apache, this is typically implemented using mod_rewrite, while
in IIS it would be implemented using an ISAPI filter. The other
possibility is to use a scripting facility such as Apache + PHP
or IIS + ASP to control access to the resources to be protected
from hotlinking. Whatever bandwidth protection tool or technique
is picked to combat hotlinks, the task remains the same. First,
decide whether the request is a permitted, legitimate link or a hotlink
originating from another site; second, send the file or drop
the hotlinked request. Studying the solutions will show that the
mechanisms used are the HTTP referrer header, browser cookies,
dynamic session identifiers and dynamic link manipulation.
The HTTP referrer (sent under the header name Referer, a misspelling
preserved in the HTTP specification) is a request header sent by the
browser that tells the server or script which page contained the link
that produced the current request. There are certain notable exceptions
that must be accommodated. The referrer value will be blank if the
request was a URL type-in, if an intervening proxy server deleted it,
if the request is for an http:// resource linked from an https:// page,
or if Internet privacy software or browser privacy settings strip it.
Privacy software or browser settings can also replace the referrer
with a nonsensical string.
The biggest security hole in depending on the HTTP referrer header
is that a blank referrer must almost always be permitted in the
server settings. This is necessary to accommodate legitimate users
who are reaching the site through normal means but presenting a
blank referrer string. In this scenario it is trivial to create
a Web page that will always present a blank referrer. One method
is to use JavaScript to write the image links at the client browser.
A second method is to use a meta refresh. Depending on the browser,
either method can cause a blank referrer to be sent to the server.
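
The referrer test and its blank-referrer weakness can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions, not the rule set of any particular server: the host names
in ALLOWED_HOSTS and the function names is_permitted and respond are
hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical hosts that are allowed to embed our files.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_permitted(referrer, allow_blank=True):
    """Step one: decide whether a request with this referrer is legitimate."""
    if not referrer:
        # Type-ins, proxies and privacy tools all produce a blank referrer,
        # so it is normally allowed. This is also the loophole.
        return allow_blank
    try:
        host = urlsplit(referrer).hostname
    except ValueError:
        return False  # malformed value
    # A nonsensical string has no recognizable host and is rejected.
    return host is not None and host in ALLOWED_HOSTS


def respond(referrer):
    """Step two: send the file (200) or drop the request (403)."""
    return 200 if is_permitted(referrer) else 403
```

Calling is_permitted("") returns True with the default setting, which
is exactly the hole described above. Passing allow_blank=False closes
it, at the cost of turning away legitimate visitors whose referrer has
been stripped.
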
A browser cookie is just another HTTP request header that returns
information which the server has previously requested the client
to store and return with every HTTP request. When client cookies
are available, they can be a very reliable tracking device.
However, as concerns for privacy grow on the Internet, increasing
numbers of users are sending blank or altered HTTP referrer headers
and turning off browser cookies. Of course, this reduces the
effectiveness of depending on these features as identifiers for
bandwidth protection purposes.
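
A cookie-based check might look like the following Python sketch. It
assumes the site hands out a token cookie when it serves one of its own
pages and then requires that token on requests for protected files.
The cookie name visit_token, the in-memory token store and both
function names are hypothetical and chosen only for illustration.

```python
import secrets
from http.cookies import CookieError, SimpleCookie

COOKIE_NAME = "visit_token"  # hypothetical cookie name
issued_tokens = set()        # in-memory store, for illustration only


def issue_cookie():
    """Return a Set-Cookie header value to send with one of our own pages."""
    token = secrets.token_hex(16)
    issued_tokens.add(token)
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = token
    cookie[COOKIE_NAME]["path"] = "/"
    return cookie[COOKIE_NAME].OutputString()


def has_valid_cookie(cookie_header):
    """Return True if the request's Cookie header carries a token we issued."""
    cookie = SimpleCookie()
    try:
        cookie.load(cookie_header or "")
    except CookieError:
        return False
    morsel = cookie.get(COOKIE_NAME)
    return morsel is not None and morsel.value in issued_tokens
```

A visitor who has cookies turned off never returns the token, so
has_valid_cookie reports False for that visitor even though the
request is legitimate. That is the reliability problem described above.
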
Dynamic session identifiers and dynamic link manipulation refer
to the technique of modifying parts of URLs for each unique client.
The limiting factor is that the pages containing
such links cannot be static HTML. Each page will need to be
generated per request by a scripting engine such as PHP, ASP,
ASP.NET, ColdFusion or Java. The server works harder, the user cannot
cache the page and search engines may have a hard time crawling
these pages if query strings are involved.
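
One possible way to realize dynamic link manipulation is to append a
session identifier, an expiry time and a keyed signature to each file
URL when the page is generated. The Python sketch below uses an HMAC
for the signature. That choice, along with SECRET_KEY, LINK_LIFETIME,
the query parameter names and the function names, is an assumption
made for illustration and is not taken from any specific product.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
LINK_LIFETIME = 300        # assumed lifetime of a generated link, in seconds


def _sign(path, session_id, expires):
    """Compute a keyed signature over the parts of the link."""
    message = f"{path}|{session_id}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_link(path, session_id, now=None):
    """Build a per-client URL for `path` while generating the page."""
    expires = int(time.time() if now is None else now) + LINK_LIFETIME
    signature = _sign(path, session_id, expires)
    return f"{path}?sid={session_id}&exp={expires}&sig={signature}"


def verify_link(path, session_id, expires, signature, now=None):
    """Check the parts of a requested URL before sending the file."""
    try:
        expires = int(expires)
    except (TypeError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False  # stale link: a URL copied to another site stops working
    expected = _sign(path, session_id, expires)
    return hmac.compare_digest(expected.encode(), str(signature).encode())
```

A URL copied onto another site keeps working only until its expiry
time passes, and changing the path or session identifier invalidates
the signature. The query string that makes this possible is also what
defeats caching and can hinder search engine crawling, as noted above.
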
There is now an ISAPI filter-based product for IIS that overcomes
the difficulties normally inherent in dynamic URL manipulation.
The company is aptly called coldlink.com, and they have an in-depth
bandwidth protection demonstration site where you can see their
product prevent hotlinking on an IIS server. It protects any kind
of file content and works with both static and dynamic HTML pages
without depending on cookies or HTTP referrer headers.
The following search terms, entered on Google, AllTheWeb, Yahoo or
MSN, will yield useful, up-to-date instructions and source code for
implementing bandwidth protection:
· hotlink .htaccess
· hotlink mod_rewrite
· hotlink php symlink
· hotlink webmasterworld
· leechblocker
As an aside, some Webmasters have also turned to non-technical
means by enforcing their legal rights. This applies particularly
when the offending action falls within the scope of the Digital
Millennium Copyright Act (DMCA) and is specifically addressed by it.
Success in following this avenue can be mixed. The first step is a
strongly worded cease and desist letter. In some jurisdictions, a formal
cease and desist letter is a necessary step before further legal action.
A particularly nice touch is to have your lawyer send the initial
email and include your ISP.
by Bob
This article was brought to you in conjunction with YnotMasters
Network