Is what they should have called a 3 page whitepaper. Instead they wrote a 100 page book and sold it for $30 ($20 on Amazon). I am talking about High Performance Web Sites. I don’t like to rant about books (I believe you can never read too many), but in this case paying that much money for a 2 hour read stretched even my patience. I would still have been happy if it were 100 pages packed full of awesome content. But, you guessed it, if you cut out the filler you could fit all the useful info into about 3 pages (which would have made those 3 pages a really awesome resource).
Still, I can’t be 100% critical. The book did teach me a few things I didn’t know before, and if you’re predominantly a back-end developer you will probably pick out a few useful tidbits as well. Even so, after you finish it you kind of wish you had stopped reading after the table of contents, which would have covered 80% of the useful info in the book (but of course you wouldn’t know this until you’d read through the whole thing). Luckily, since I’ve already been through it, I can save other people the time and the money by writing a summary, which is what this book should have been to start with.
The Summary
If you examine the HTTP requests made while a web page loads in a browser, you will see that at least 80% of the response time is spent loading the components on the page (scripts, images, CSS etc.) and only about 20% is spent downloading the actual HTML document (a figure that includes all the back-end processing). It therefore behooves us to spend some time on front-end optimization if we want to significantly speed up our website loading times. There are 14 main points to look at when we’re trying to do this:
1. Try to make fewer HTTP requests
- try using image maps instead of having separate images
- you may also try using CSS sprites instead of separate images
- it is also sometimes possible to inline the images in your HTML page (base64 encoded)
- if you have multiple JavaScript or CSS files, get your build process to combine them into one master file each (one for CSS, one for JavaScript)
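To make the inlining idea from point 1 concrete, here is a minimal sketch of turning image bytes into a base64 `data:` URI that can sit directly in an `<img>` tag. The function names and the sample bytes are made up for illustration, and older browsers may not support `data:` URIs at all, so treat this as a sketch rather than a recipe.

```python
import base64
import html


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Return a data: URI that embeds the given bytes as base64."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_img_tag(data: bytes, alt: str, mime_type: str = "image/png") -> str:
    """Build an <img> tag whose src is the image itself, so no extra request is made."""
    safe_alt = html.escape(alt, quote=True)
    return f'<img src="{to_data_uri(data, mime_type)}" alt="{safe_alt}">'


# Placeholder bytes for illustration; in practice you would read a real image file.
sample_bytes = b"\x89PNG\r\n\x1a\n"
print(inline_img_tag(sample_bytes, "icon"))
```

The trade-off is that base64 makes the image roughly a third larger and the image can no longer be cached separately from the page, so this tends to suit small images only.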
2. Use a content delivery network (CDN)
- a content delivery network is a collection of web servers distributed across multiple locations
- this allows browsers to download from servers that are geographically closer, which can speed up download times
- there are several CDNs that major websites use, e.g. Akamai, Mirror Image, Limelight etc.
3. Add a far future Expires header to all your resources
- more specifically, a far future Expires header allows the browser to cache resources for a long time
- you can use Apache mod_expires to take care of this for you
- you don’t get the savings the first time users visit (obviously), only on subsequent visits
- you should add far future expires headers for images, scripts and CSS
- you should introduce revision numbers for your scripts and CSS to allow you to modify these resources and not worry about having to expire what is already cached by the browser
- you can hook creating revision numbers for your scripts and CSS into your build process
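As a rough sketch of how point 3 could be hooked into a build process, the snippet below derives a versioned file name and builds a far future Expires value. Note one swap: it uses a short content hash in place of a hand-maintained revision number, which is my assumption and not something the book prescribes. The function names and the 10 year horizon are likewise illustrative.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension, e.g. app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def far_future_expires(now: datetime, years: int = 10) -> str:
    """Format an HTTP date roughly `years` from `now` (which must be UTC-aware)."""
    return format_datetime(now + timedelta(days=365 * years), usegmt=True)


script = b"function hello() { return 'hi'; }"
print(fingerprinted_name("js/app.js", script))
print("Expires:", far_future_expires(datetime.now(timezone.utc)))
```

Because the name changes whenever the content changes, the old cached copy never needs to be expired; the HTML simply starts pointing at the new name.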
4. Gzip components
- you should gzip your HTML pages, scripts and CSS when they are sent to the browser
- you can use Apache mod_gzip (for 1.3) or mod_deflate (for 2.x) to handle all this for you
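To get a feel for what gzip buys you on text components, here is a small sketch using Python's standard `gzip` module. The generated stylesheet is made-up sample data; a real server would do the compression for you through the modules mentioned above.

```python
import gzip


def gzip_sizes(payload: bytes) -> tuple[int, int]:
    """Return (original size, gzipped size) in bytes."""
    return len(payload), len(gzip.compress(payload))


# Made-up, repetitive CSS standing in for a real stylesheet.
sample_css = "".join(
    f".item-{i} {{ margin: 0; padding: 0; color: #333; }}\n" for i in range(200)
).encode("utf-8")

original, compressed = gzip_sizes(sample_css)
print(f"original: {original} bytes, gzipped: {compressed} bytes")
```

Text such as HTML, scripts and CSS compresses well because it is repetitive; images are usually compressed already, which is why they are not on the list.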
5. Put stylesheets at the top (in the document HEAD using the LINK tag)
- we want the website to render progressively in the browser (i.e. to show content as it becomes available), but many browsers will block rendering until all stylesheets have loaded, so loading stylesheets as soon as possible is preferable
- having CSS at the top may actually make the page load a little slower (since it can load stylesheets it doesn’t need), but it will feel faster to the users due to progressive rendering
6. Put scripts at the bottom
- the HTTP/1.1 spec suggests that a browser make no more than two parallel requests to the same hostname, so splitting components across multiple hostnames can improve performance
- scripts block parallel downloads, so having scripts at the top will block all other components from downloading until the scripts have finished loading
- having scripts at the bottom allows all other components to load and take advantage of parallel requests
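The point about splitting components across hostnames can be sketched as follows. The idea is to pick a hostname deterministically from the asset path, so the same asset always comes from the same host and stays cacheable. The hostnames and function names here are hypothetical.

```python
import zlib


def shard_host(path: str, hosts: list[str]) -> str:
    """Pick a host for an asset; the same path always maps to the same host."""
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return hosts[index]


def asset_url(path: str, hosts: list[str]) -> str:
    """Build the full URL for an asset path such as /img/logo.png."""
    return f"http://{shard_host(path, hosts)}{path}"


# Hypothetical hostnames that would all point at the same static content.
HOSTS = ["static1.example.com", "static2.example.com"]

for asset in ["/img/logo.png", "/img/banner.png", "/css/site.css"]:
    print(asset_url(asset, HOSTS))
```

Keep the number of hostnames small, since each extra hostname costs a DNS lookup (see point 9).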
7. Avoid CSS expressions
- CSS expressions (an Internet Explorer feature) are evaluated very frequently, so they can degrade page performance even after the page has loaded
- instead use one-time expressions or better yet use event handlers
8. Make JavaScript and CSS external
- if users visit infrequently, you’re better off inlining your CSS and JavaScript into your HTML, as the page is unlikely to be in the browser cache anyway and this minimizes requests
- if users visit frequently, you’re better off having separate files for your CSS and JavaScript as this allows the browser to cache these components and only need to fetch the HTML page which is smaller due to the fact that CSS and JavaScript are externalized
9. Reduce DNS lookups
- if the browser or OS has a DNS record in its cache, no DNS lookup is necessary, which saves time
- you can use Keep-Alive to avoid DNS lookups: if there is an existing connection, no DNS lookup is needed
- if there are fewer hostnames, fewer DNS lookups are needed, but more hostnames allow more parallel requests
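One way to see how many DNS lookups a page could trigger is to count the distinct hostnames among its component URLs. This is a minimal sketch; the URL list is invented, and in practice you would pull it from your page or from a network trace.

```python
from collections import Counter
from urllib.parse import urlsplit


def hostname_counts(urls: list[str]) -> Counter:
    """Count how many components are served from each hostname."""
    counts: Counter = Counter()
    for url in urls:
        host = urlsplit(url).hostname
        if host:  # skip relative URLs, which reuse the page's own hostname
            counts[host] += 1
    return counts


# Invented component URLs for illustration.
component_urls = [
    "http://www.example.com/index.html",
    "http://static1.example.com/css/site.css",
    "http://static1.example.com/js/app.js",
    "http://static2.example.com/img/logo.png",
    "/img/relative.png",
]

counts = hostname_counts(component_urls)
print(f"{len(counts)} hostnames (worst case, that many DNS lookups):")
for host, n in counts.items():
    print(f"  {host}: {n} component(s)")
```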
10. Minify your JavaScript
- this means removing unnecessary stuff from your scripts, such as spaces, comments etc., which makes the scripts much smaller
- you can also obfuscate, but the extra savings compared to minification are not worth it, especially if gzip is used
- you can use JSMin to minify your JavaScript
- you can also minify inline scripts
- minifying CSS is possible but usually not worth it
11. Avoid redirects
- a redirect means an extra request, and it prevents all other components from loading until it completes, which hurts performance
- don’t use redirects to fix trivialities such as a missing trailing slash; this can be done through Apache configuration
- you don’t need to use redirects for tracking internal traffic, you can instead parse Referer logs
12. Remove duplicate scripts
- duplicate scripts often creep in, which makes web pages larger and can require more requests, both of which hurt performance
- implement processes to make sure scripts are included only once
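One possible "process" for point 12 is to route every script include through a small helper that remembers what it has already emitted. The class below is a hypothetical sketch of that idea for a server-side template layer, not something taken from the book.

```python
import html


class ScriptRegistry:
    """Emit a <script> tag for each source at most once per page."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def include(self, src: str) -> str:
        """Return the script tag the first time, and an empty string after that."""
        if src in self._seen:
            return ""
        self._seen.add(src)
        return f'<script src="{html.escape(src, quote=True)}"></script>'


# Create one registry per page render, then call include() wherever a script is needed.
registry = ScriptRegistry()
print(registry.include("/js/menu.js"))
print(repr(registry.include("/js/menu.js")))  # second include produces nothing
```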
13. Configure or remove ETags
- ETags are used by servers and browsers to validate cached components
- once a cached component has expired, the browser can use its last modified date to check whether it needs to be fetched again; ETags are an alternative to the last modified date
- the problem is that ETags are typically constructed to be specific to one server, which is an issue when a site is distributed across several servers
- there are Apache modules that can customize ETags to not be server specific
- if the last modified date is good enough, it is best to remove ETags
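To illustrate what a non server specific ETag might look like, here is a sketch that derives the tag purely from the response body, so every server holding the same file produces the same value. This is a different technique from the Apache modules mentioned above, and the function names are my own.

```python
import hashlib
from typing import Optional


def content_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the body, independent of the server."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def is_not_modified(if_none_match: Optional[str], etag: str) -> bool:
    """Return True if the browser's If-None-Match header matches our ETag (send 304)."""
    if not if_none_match:
        return False
    candidates = [c.strip() for c in if_none_match.split(",")]
    if "*" in candidates:
        return True
    # Ignore the weak-validator prefix when comparing.
    stripped = [c[2:] if c.startswith("W/") else c for c in candidates]
    return etag in stripped


body = b"body { color: #333; }"
etag = content_etag(body)
print("ETag:", etag)
print("304 Not Modified?", is_not_modified(etag, etag))
```

Hashing the body costs CPU on every response unless the result is cached, so for plain static files removing ETags and relying on the last modified date, as the summary suggests, may well be simpler.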
14. Make AJAX cacheable
- the same rules (as above) apply to AJAX requests as to all other requests; it is especially important to gzip components, reduce DNS lookups, minify JavaScript, avoid redirects and configure ETags
- try to make the response to AJAX requests cacheable
- add a far future Expires header for your cacheable AJAX requests
That’s it, that was the whole book without having to spend $20-30. There are so many things that could have been expanded on in this book: examples, step-by-step instructions, configuration snippets etc. With a bit more effort it could have been made into a valuable resource, worthy of its price tag. Alternatively, it could have been priced in a fashion commensurate with the level and amount of content. The way it stands though, it was just a little annoying. Enjoy the summary.
Image by Alex C Jones