Recently I watched the team go through a launch, and then do some quick investigation and iteration. This exercise really brought to mind Jeff’s comments at an internal team event that we are all masters of our craft. I liked the sound of that phrase, and I see it in a lot of what we do… including this one example.
The Azure documentation page is a critical landing experience for our site, but it has often been a performance lowlight for us, with a page load time (in the US, on a good connection) of around 4 seconds. That’s not great, but the page is huge and has hundreds of product images on it, so there is a limit to what can be done if we want it all on the same page like this.
Last week, we rolled out this great new page as described on the docs Twitter feed (thanks Den!), which is an improvement in a ton of ways.
Unfortunately, it was slower than the page it replaced (going from 4s to 4.5s page load, or about 12.5% slower). This type of impact is often hard to see until something is live: our staging system isn’t cached the same way as production, we test on a fast network, and so on. We do automatically run perf tests in production after a deployment though, and that’s what spotted this regression.
That’s not great, as one of our principles for feature development is ‘first, do no harm’. We shouldn’t make our fundamental experience worse when we make a change, even if it makes other things better.
In the afternoon, after the page shipped, the main dev on this new page (Dillan) had a chat with a few other folks on the team and made a couple of changes: those hundreds of images are now lazy loaded, and each has explicit dimensions to reduce reflow/recalc time as it loads in. A new version was up for testing in less than an hour. We also identified a possible improvement in cache configuration that the SRE team could make, and Antony from that team had that staged as well. A bunch of folks piled in to test and verify these changes, and they are now live.
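As a rough illustration of those two image changes (this is not the actual code from the docs site), here is a minimal Python sketch that renders an image tag with the standard HTML `loading="lazy"` attribute plus explicit `width` and `height`. The function name, the image path, and the 48x48 size are all hypothetical.

```python
from html import escape


def lazy_img_tag(src: str, alt: str, width: int, height: int) -> str:
    """Render an <img> tag that defers loading and reserves its layout box.

    loading="lazy" lets the browser skip off-screen images until they are
    close to the viewport; width/height let it reserve space up front, so
    the page does not reflow as each image arrives.
    """
    return (
        f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}" '
        f'width="{width}" height="{height}" loading="lazy">'
    )


# Hypothetical product icon; the real page's paths and sizes are not known here.
print(lazy_img_tag("/media/icons/storage.svg", "Azure Storage", 48, 48))
```

With hundreds of images on one page, most of them start off-screen, so deferring them takes a lot of work out of the initial load.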
The Azure docs page now loads 25% faster than it did before this new version launched, down to a 3s page load (test details from WebPageTest.org). Return visits improved even more thanks to the cache changes, going from 2.6s to 1.9s, a 27% reduction in load time, or about 37% faster when expressed as a speedup (test details).
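The percentages in this post depend on which number is used as the baseline, so here is a small sketch of the arithmetic using only the load times quoted above. The helper names are mine, purely for illustration.

```python
def percent_change_in_time(before_s: float, after_s: float) -> float:
    """Change in load time relative to the original time (negative = faster)."""
    return (after_s - before_s) / before_s * 100


def percent_speedup(before_s: float, after_s: float) -> float:
    """Speedup relative to the new time (how many more loads fit per second)."""
    return (before_s / after_s - 1) * 100


print(f"{percent_change_in_time(4.0, 4.5):+.1f}%")  # initial launch:  +12.5%
print(f"{percent_change_in_time(4.0, 3.0):+.1f}%")  # after the fixes: -25.0%
print(f"{percent_change_in_time(2.6, 1.9):+.1f}%")  # return visits:   -26.9%
print(f"{percent_speedup(2.6, 1.9):.1f}%")          # same, as speedup: 36.8%
```

The same 2.6s to 1.9s improvement reads as roughly 27% or 37% depending on the convention, which is why it helps to say which one is meant.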
Appreciate all the great work across both the Dev and SRE teams on this (and of course, the original work to build the new page that involved even more teams including PM and Design!), and happy to be working on a team that embraces this type of focus on quality.
Thoughts on this post? Feel free to reach out on Twitter!
2020-03-02 17:30 +0000