Column: Entity Tags: Friend or Foe?

By Tim Quax on 07 June 2010

Over the last few weeks I dove into the world of website speed optimization. My knowledge of the subject was no longer sufficient: one of the websites I manage the servers for kept getting slower as the number of visitors per second increased. That is how I stumbled upon ETags, though they have their catches.

The entity tag, or ETag, is part of the HTTP protocol. It is one of several mechanisms HTTP provides for cache validation. ETags offer a flexible validation model and allow a client to make conditional requests. Overall, they make your standard caching more efficient, and efficient caching means less bandwidth used and faster pages.

When you request a resource, the web server returns it along with an "ETag" response header. The client can then cache the resource together with its ETag. When the client wants to retrieve the same resource again, it sends its saved copy of the ETag along with the request in an "If-None-Match" header.
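To make the client side of this exchange concrete, here is a minimal sketch in Python. It is not how any particular browser works; the in-memory `cache` dictionary and the helper names `store_response` and `build_request_headers` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CachedResource:
    etag: str    # the ETag value exactly as the server sent it
    body: bytes  # the cached copy of the resource


# Hypothetical in-memory cache, keyed by URL.
cache: Dict[str, CachedResource] = {}


def store_response(url: str, headers: Dict[str, str], body: bytes) -> None:
    """Remember the resource together with its ETag, if the server sent one."""
    etag = headers.get("ETag")
    if etag is not None:
        cache[url] = CachedResource(etag, body)


def build_request_headers(url: str) -> Dict[str, str]:
    """Add If-None-Match when a cached copy with an ETag exists."""
    headers: Dict[str, str] = {}
    cached = cache.get(url)
    if cached is not None:
        headers["If-None-Match"] = cached.etag
    return headers
```

The first request for a URL goes out without an "If-None-Match" header; every later request carries the stored ETag, which is what turns it into a conditional request.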

The server can now compare the ETag it received with the ETag of the current version of the resource. If the values match (i.e. the resource has not changed), the server sends an "HTTP 304 Not Modified" response, without a body, back to the client. The 304 status tells the browser to use its own cached version.
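The server-side comparison could look roughly like the sketch below. It assumes the ETag is simply a quoted SHA-1 hash of the response body, which is one possible scheme and not what Apache or IIS do by default, and it ignores details such as weak validators and multiple values in "If-None-Match". The names `make_etag` and `handle_get` are hypothetical.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    # Assumption: the ETag is a quoted hash of the body.
    # Real servers pick their own scheme.
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def handle_get(request_headers: Dict[str, str],
               body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, payload) for a GET request."""
    etag = make_etag(body)
    client_etag = request_headers.get("If-None-Match")
    if client_etag == etag:
        # Resource unchanged: no body, the client reuses its cached copy.
        return 304, {"ETag": etag}, b""
    # First request, or the resource changed: send the full body.
    return 200, {"ETag": etag}, body
```

The bandwidth saving comes from the 304 branch: the response carries headers only, no matter how large the resource is.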


Disadvantages

Sounds really cool, but here's the drawback. By default, ETags are typically constructed from attributes that make them unique to a specific server. Therein lies the problem: ETags won't match when a browser fetches the resource from one server and later validates it against another. Since clustered environments are quite common on today's World Wide Web, this is a real hassle.

The ETags that Apache and IIS generate by default for the exact same resource won't match from one server to another. If your website is hosted on a single server, everything's peachy. With multiple servers, however, validation fails and the full resource is sent again even though it hasn't changed: your visitors get slower pages, you consume more bandwidth, your servers carry a higher load, and proxies cache less efficiently.

And even if your resources carry an "Expires" header set far in the future, a conditional GET request is still made every time the user refreshes the page.


Conclusion

I think it's best to stay away from ETags if you're not going to take advantage of what they offer. However, if you can take the time to configure them exactly as your situation requires, they can be a powerful addition to your existing speed optimizations.

This is a column from Tim Quax's blog. For more of his posts, check out his site.


