Performance on the web: What’s being said?
It is not an uncommon belief that software is in a state of decline, and it's not too difficult to see why. The difference in raw performance between Intel's i7-2600K, which sold for ~US$300 in 2011, and AMD's Ryzen 5600X (which would sell for US$299 today if not for silicon shortages) is astonishing, and yet my computer doesn't feel substantially faster than it used to. On my high-spec Windows 10 desktop I find myself waiting for silly things (like opening a context menu) multiple times throughout the day -- something I just don't remember happening on my Linux Mint install a decade ago.
To be fair, Windows 10 has a lot more functionality built into it than that old Linux Mint version did, but I don't think there's enough added functionality to justify the added slowness, especially given that the hardware I'm using today is orders of magnitude better than what I had back then. I'm far from the first person to note that software feels slower than it used to -- as far back as 1991, Dr. Martin Reiser wrote the following in the preface to The Oberon System User Guide and Programmer's Manual:
The hope is that the progress in hardware will cure all software ills. However, a critical observer may observe that software manages to outgrow hardware in size and sluggishness.
The extraordinary leaps we've achieved in hardware allow software to be less efficient and 'get away' with it, as users are less likely to notice. We've also managed to become more productive (at least in terms of feature development) through widespread adoption of high-level programming languages and the proliferation of package ecosystems, although at a cost to robustness and performance.
Internet applications are the worst of them all: the time it takes to download files from a server can be infuriatingly high, and once everything's running, it's subject to however the browser decides to execute it and schedule work. For certain workloads the lack of control over how the app executes is a deal-breaker, and for others it's just annoying and results in unpredictable performance hiccups. This is made worse when developers don't put in the effort to mitigate the limitations of the web platform, whether due to a lack of empathy for their end users or an overly-aggressive product roadmap.
Facebook is, I think, a really good case study in poor web performance -- it takes a couple of seconds for me to see my news feed's loading skeletons, and then another ~6s for Facebook to actually load the content. Two seconds wouldn't be a bad amount of loading time if the point of Facebook was to display loading skeletons, but when I open Facebook the goal I'm actually trying to accomplish is to read the ~300kB of text recently uploaded by people I vaguely know. While Facebook has a lot more functionality than just the news feed, the feed is literally their #1 feature, and needing to wait 7-8s to access it from an empty cache shouldn't be acceptable -- especially given how much effort they apparently put into making it load quickly.
So what if everything's slow?
We need to take software performance seriously because there are real consequences when we don't. Every gigabyte of data downloaded from the internet produces about 3 kg of CO2¹, which directly contributes to the degradation of our planet. By reducing the size of our applications we can make substantial reductions to global greenhouse gas emissions, and we can do even better than that if we also optimize our use of computing resources.
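To make that figure concrete, here's a rough back-of-the-envelope calculation using the ~3 kg/GB number above; the traffic and size figures are invented purely for illustration.

```js
// Rough CO2 saving from trimming a bundle, using the ~3 kg CO2/GB figure
// cited above. The numbers below are made up for illustration.
const KG_CO2_PER_GB = 3;
const bytesSavedPerPageLoad = 500 * 1024; // hypothetical: trim 500 kB of assets
const pageLoadsPerMonth = 1_000_000;      // hypothetical traffic

const gbSavedPerMonth = (bytesSavedPerPageLoad * pageLoadsPerMonth) / 1024 ** 3;
console.log(`${(gbSavedPerMonth * KG_CO2_PER_GB).toFixed(0)} kg of CO2 avoided per month`);
// => roughly 1430 kg of CO2 avoided per month
```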
PHP 7 was released at the end of 2015, and Rasmus Lerdorf wasted no time in proclaiming that if 50% of all applications running PHP 5 were to upgrade to the newest version, the world would see a 3.75B kg/year reduction in carbon emissions thanks to the performance improvements. Is that a lot of carbon? In the grand scheme of things, not really -- but we shouldn't let perfect be the enemy of incremental improvement here. For a sense of scale, the Ahuora project is looking to improve New Zealand's process heat emissions, which added up to 8.3B kg of CO2 in 2019; the hypothetical savings from that PHP upgrade come to almost half of that figure.
By building slow software we're also locking out billions of people. The average smartphone in developing regions of the world cost about US$200 in 2019, and that figure is only going down over time². People in these parts of the world come online when they can afford to, and that budget doesn't stretch to a new iPhone 12. Combine poor network infrastructure with those low-spec phones and a disturbing proportion of the Internet is completely unusable for them.
Your mileage may vary, but when I benchmarked Reddit it took 4.7s to display anything on the screen over a 1.6Mbps connection. 3G networks in emerging markets offer bandwidth of about 0.4Mbps, which means that people in those parts of the world are looking at almost 20 seconds of load time before they can even see Reddit's loading UI. After that, they still need to wait for all of the actual content to load! At the time of writing, Reddit is the 19th most popular website in the world, and it is simply not usable by a large portion of the world's population.
Reddit is only an example here -- there are a huge number of websites which are similarly slow. If we genuinely believe that the products we're developing are for the enrichment of society and the betterment of human wellbeing, then we should ensure those products are accessible to everybody. For a global business like Reddit it also seems astonishing that they would so happily close themselves off from a billion-plus potential sets of eyeballs they could be showing ads to. Even if you aren't operating at planet scale, here in the first world the difference between your web page loading in 3 seconds instead of 1 second is a 32% increase in your bounce rate, so there's still money being left on the table.
A case study: Kwotimation
At Kwotimation we have a strong sense of social responsibility. We're building a high-tech solution for an industry which, frankly, no one really cares about, and later this year we want to get our users making money off of the products they use during work -- something that hasn't happened before. The performance of our software is important to us because it ties into our core cultural values while also serving an important business purpose.
The performance for the initial release of our software was pretty mediocre. Not great, but better than the average web app out there in 2021. Our core web vitals -- the main KPIs used today to measure web app performance, in tandem with Lighthouse audits -- have always been in the green or barely in the yellow. With dedicated effort, however, we can do a lot better than that.
Assets & Dependencies
Probably the most important variable influencing page speed is the size of the resources on the app's 'critical path.' Critical path resources consist of whatever is required to render a minimally interactive page to the end user. Most web apps out there can make improvements in this space, as it's incredibly easy for the critical path to creep larger over time as features are developed and code is refactored. The best mitigations are developer discipline and automating as much of the process as possible -- for instance, a Webpack loader like image-webpack-loader makes it possible to optimize images at build time. The create-react-app template reports bundle sizes (and the delta from the last build) at build time, but unless you're digging into your CI log for every merge request it's not particularly visible. Setting up a Dangerfile and/or configuring Webpack performance budgets are great ways of ensuring that bundle size doesn't get ignored.
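As a sketch of what that automation can look like, Webpack's built-in performance budgets will warn -- or fail the build outright -- when assets creep past a threshold. The numbers here are illustrative, not our actual budget:

```js
// webpack.config.js (excerpt) -- fail the build when assets outgrow a budget.
module.exports = {
  // ...the rest of your configuration
  performance: {
    hints: 'error',                // use 'warning' while you tune the thresholds
    maxEntrypointSize: 170 * 1024, // max bytes of assets loaded for an entrypoint
    maxAssetSize: 250 * 1024,      // max bytes for any individual emitted asset
  },
};
```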
Dependency audits are also important. We saved XX kB of Javascript simply by swapping out heavy dependencies for lightweight alternatives. In particular, Mixpanel's browser client is massive. Simply replacing the official Mixpanel package with mixpanel-lite drops bundle size by 23.6 kB gzipped, which is a big win. Mixpanel is aware of their bundle size issue but unfortunately doesn't consider it a priority. On this site, I used to use the color module for color manipulation prior to swapping to tinycolor2, which had all of the functionality I was using. The difference between these two modules is only ~2.4 kB gzipped, but that tiny difference was all it took to push this website from a score of 99 to 100 in my Vercel performance analytics dashboard.
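For illustration, the two libraries cover similar ground, so call sites translate fairly directly. A hypothetical example (not the actual code from either site):

```js
import tinycolor from 'tinycolor2';

// Derive a couple of shades from a brand colour -- the sort of thing we were
// previously doing with the heavier 'color' package.
const brand = tinycolor('#336699');
const hover = brand.clone().lighten(10).toHexString(); // lighter shade for hover states
const text = brand.isDark() ? '#ffffff' : '#000000';   // readable text colour on top of the brand colour
```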
For Kwotimation, there are further reductions in this area on the horizon. We use Sentry for APM, and the next major version of their browser SDK is aiming to "dramatically reduce bundle size" which we're excited for given that it currently weighs in at 21.2 kB gzipped.
Sometimes it's also possible to perform calculations at compile time in order to save compute and/or bundle size. A great example of this is the ms module, which allows you to convert human-readable strings like "1 day" into their millisecond equivalent (in this case, 86,400,000), which makes for a much nicer development experience when you need to use time-based APIs. ms is a pretty small library -- less than a kilobyte gzipped -- but it is possible to completely remove ms from your bundle and avoid the need to perform any calculations on your user's computer by instead using ms.macro. The macro version runs at build time, so all that remains in your final bundle is the number calculated by ms. Build-time macros are awesome because they allow for a great developer experience without needing to impact your end users, although they aren't super widespread yet.
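A minimal before/after sketch (the macro relies on babel-plugin-macros being set up in your build, and the constant name is just for illustration):

```js
// Before: ms ships to the browser and parses the string at runtime.
import ms from 'ms';
const SESSION_TTL = ms('1 day'); // 86400000, computed on the user's machine

// After: the call is evaluated during the build, so only the literal number
// remains in the bundle and the library itself is never downloaded.
import msMacro from 'ms.macro';
const SESSION_TTL_STATIC = msMacro('1 day'); // compiled down to the literal 86400000
```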
Font subsetting is the process of taking a font file and breaking it up into smaller chunks. You might take a font and break it up into an "English" chunk and a "Vietnamese" chunk, as an example. If no Vietnamese characters are used on your web page then the Vietnamese file won't get downloaded by the user's browser and therefore the page loads faster. Space savings on fonts are particularly valuable as pending fonts will block your website from rendering (unless you configure them not to, in which case you then need to deal with layout shift).
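The usual way to express this is a CSS @font-face rule with a unicode-range descriptor, but the same lazy behaviour can be sketched with the browser's FontFace API. The file name and the (abridged) Vietnamese range below are placeholders, not real Kwotimation assets:

```js
// Register a subset font without forcing a download. Because the FontFace is
// added in an unloaded state, the browser only fetches the file if text on the
// page actually falls inside the declared unicode range.
const vietnameseSubset = new FontFace(
  'Cerebri Sans Pro',
  "url(/fonts/cerebri-sans-vietnamese.woff2) format('woff2')",
  { unicodeRange: 'U+0102-0103, U+0110-0111, U+1EA0-1EF9' }
);

document.fonts.add(vietnameseSubset);
```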
If you're just pulling fonts from Google Fonts then subsetting is just a matter of opting in to the feature and telling Google which subsets you need. If you're using a commercial font, however, things can be tricky as your license needs to grant you permission to either modify the font or to subset it. A lot of commercial font licenses won't give you that out of the box. We use the Cerebri Sans Pro font family by Hanken Design Co. and license it through MyFonts.com.
MyFonts initially wouldn't allow us to subset our Cerebri Sans font files despite Hanken Design Co. being fine with it. We had to wait a few weeks for MyFonts to discuss the situation with Hanken before we were finally able to proceed with subsetting -- Hanken are awesome for putting in that effort for us, even after we'd offered to just purchase the fonts directly through them in order to sidestep MyFonts. We wound up saving 44 kB on our font files by subsetting them, and if we were to be more aggressive we might be able to push that up closer to 60 kB. Make sure to pay attention to what your font license allows you to do, because 44 kB is massive.
Network Requests
Networks are generally pretty quick on desktop computers if you're targeting an urban audience, but they get really bad really quickly once you leave the city or start using mobile devices. In the worst case it can take 2s on a 3G connection just to initiate a connection to a remote server -- and that's before you can even start downloading content. DNS resolution and TCP+SSL handshakes take a long time to complete when latency is high or packets drop.
The only way to improve this situation is to reduce the number of network requests you make. Pushing your content to the edge only gets you so far when mobile networks are as unreliable as they are today. Luckily, reducing network requests is usually pretty straightforward. In Kwotimation's case we had a chain of two network requests which had to be executed before we could show anything to the user: first we had to get an access token from the server in order to interact with the API³, and then we had to query the API for the tiler's website design information (such as their business color, logo, etc.). Even on a desktop with a fast fiber connection these requests took ~200ms, which is a noticeable delay. On mobile the situation was even worse.
The easy fix to this is to cut out the second network request by bundling the design information alongside the access token. This cuts our delay in half, but it's still not great needing to wait ~100ms in the best-case scenario for the website to show up. We could try firing the network request earlier by moving it out of our main Javascript bundle and instead including it as a head script, but that approach entails all sorts of nastiness. Instead, we've implemented a lambda that runs every time a tiler updates their website and inlines their design info into the site's HTML.
In addition to completely removing these network requests, this approach gives us the added benefit of being able to inject Open Graph meta tags into the HTML, so we get nice website previews on platforms like Facebook. In future we'd like to do full SSR where we render the entire website into static HTML, but there are some tooling issues we need to resolve first.
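A minimal sketch of the kind of inlining step described above, assuming the design info is available as JSON at publish time. Every name here (fetchSiteDesign, the placeholder template, window.__SITE_DESIGN__) is hypothetical rather than Kwotimation's actual implementation:

```js
// Runs whenever a tiler saves their website: fetch their design info once,
// bake it into the HTML, and upload the result to static hosting.
const TEMPLATE = `<!doctype html>
<html>
  <head><!--DESIGN--></head>
  <body><div id="root"></div><script src="/bundle.js"></script></body>
</html>`;

async function renderTilerHtml(tilerId, fetchSiteDesign) {
  const design = await fetchSiteDesign(tilerId); // { businessName, color, logoUrl, ... }
  const inlined = [
    // The app reads this global on boot instead of making a network request.
    // Escaping '<' keeps a stray '</script>' in the JSON from breaking the page.
    `<script>window.__SITE_DESIGN__ = ${JSON.stringify(design).replace(/</g, '\\u003c')};</script>`,
    // Bonus: social previews work because the tags live in the static HTML.
    `<meta property="og:title" content="${design.businessName}" />`,
    `<meta property="og:image" content="${design.logoUrl}" />`,
  ].join('\n    ');
  return TEMPLATE.replace('<!--DESIGN-->', inlined);
}
```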
Conclusion & Future Work
That's pretty much it. We haven't done anything particularly fancy in order to get our site loading faster. To be honest, most slow websites out there today are probably only a week away from seeing big performance improvements if someone were to put in a bit of work. And if you do get fancy you can achieve really fast pages.
Right now our tiler websites have a Lighthouse score between 90 and 96 on desktop and between 80 and 95 on mobile. There's room for improvement here, but real-world measurements also show our 75th-percentile first contentful paint landing at 832 ms, so we're happy with where we are for now. In the near future, some of the strategies we have in mind for further improving our page speed are:
- Using HTTP/2 push. Unfortunately, Amazon CloudFront doesn't support this technology. We'll need to migrate to another CDN like Cloudflare at some point in the future.
- Prerendering tiler websites. Rendering our websites down to actual static HTML would be a big win for page load speed, as it means the user can see the website before any of the Javascript has even begun downloading. Server-side rendering can worsen the user experience if you haven't done proper performance engineering, however, as anything which requires Javascript on the page (such as a button which opens a modal) won't be functional until that Javascript has downloaded, parsed, and executed. If that takes a long time your users can wind up getting frustrated, but we're well-positioned to employ SSR after the work we've done to make things speedy.
- Shipping a modern JS bundle. Web apps are typically developed in a 'modern' version of Javascript and then 'transpiled' into a version of Javascript which older browsers can understand. The transpiled code can be anywhere up to 25% larger than the original modern code. This is surprisingly difficult to do correctly -- at the end of 2020 Next.js removed their experimental "modern mode" due to technical issues.
- Dropping Internet Explorer support. Our support for IE is pretty barebones at the moment, and while you can use our software in IE11 it's not ideal. So far we've only seen about three instances of folks using IE on Tiling.co.nz, and those people didn't even go on to request a quote. Given how little use the browser gets, and that Microsoft themselves started phasing out IE11 support in their own web apps at the end of last year, we're likely to follow suit. Polyfills for IE11 are huge, and we'll save ~42 kB by dropping them -- almost a quarter of our critical path Javascript at the moment.
- Reworking our file setup. We offer a variety of domains to our tilers -- they could have their website live at sophia.kwoting.com or sophia.tilingnow.com, for instance -- and we store all of the underlying code for these websites on cdn.kwotimation.com. We've made the assumption that people looking for a tiler will shop around and get a number of different quotes before making a decision on who to use. If this assumption is correct, we get to cache all of the content on our CDN even when the user visits tilers whose sites live on different root domains. If the assumption is wrong, however, it means that we're making users go through DNS resolution for cdn.kwotimation.com for no reason. We're looking at our data to determine the best strategy going forward.
- Implementing two-stage font loading. We use two fonts on our tiler websites: a bold font for headings, and a lighter font for body text. After subsetting, the two fonts weigh in at ~18 kB each. We can therefore save 18 kB on our critical path by only loading the body text font initially and deferring the load of our bolder font (see the sketch after this list). This will result in a bit of layout shift but is likely still worth doing considering the possible improvement we'd get on our time to first render.
- Moving off of MyFonts. We're planning to purchase our Cerebri Sans fonts directly from Hanken Design Co., as MyFonts requires us to link to a file on their website which is used to track our monthly page view quota. We currently load this file asynchronously in order to not block rendering, but it's still there in our network request waterfall and we want it gone.
- Removing a few straggler dependencies. I love react-spring because it makes animations so easy, but it adds almost 11 kB to our page and our use of it could be replaced relatively easily with some plain CSS animations. So eventually this has to go.
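Here's the two-stage font loading sketch referenced above, using the browser's FontFace API; the file paths are placeholders rather than our real asset URLs:

```js
// Stage 1: load just the lighter body font up front so text can render
// without waiting on the heavier heading font.
const bodyFont = new FontFace(
  'Cerebri Sans Pro',
  "url(/fonts/cerebri-sans-regular-subset.woff2) format('woff2')",
  { weight: '400' }
);
bodyFont.load().then((face) => document.fonts.add(face));

// Stage 2: once the page has finished loading, fetch the bold heading font.
// Headings briefly render in the body weight, so expect a little layout shift.
window.addEventListener('load', () => {
  const headingFont = new FontFace(
    'Cerebri Sans Pro',
    "url(/fonts/cerebri-sans-bold-subset.woff2) format('woff2')",
    { weight: '700' }
  );
  headingFont.load().then((face) => document.fonts.add(face));
});
```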
The big ones on the list are, again, cleaning up dependencies and dropping legacy code. Removing support for IE11, replacing react-spring, and a couple of other changes are enough to bring us down to the 170 kB critical path budget set by Alex Russell in his fantastic 2017 blog post.
Even if you only have a small team, performance matters. Bounces due to frustration with load speed have a direct impact on your bottom line, and having a slow website sends a powerful message about how seriously you take climate change and global equity. It's fine to not be perfect, but it shouldn't be acceptable to build software that's functionally inaccessible to a massive chunk of the world's population. Today's consumers are more conscientious than ever, and I hope that slow software can be the next thing they start holding businesses to account for.
1. This figure represents "embodied carbon" and therefore factors in the fixed equipment costs, such as actually laying down the fiber optic cable between your computer and the server you're downloading from. If we only consider the CO2 involved in transfer, it's probably somewhere around 29 g.
2. 2020 was an outlier year in which the average selling price went up. The economic impact of COVID-19 has working-class individuals cutting back on discretionary purchases, and most of these devices are purchased in person -- which isn't possible during a lockdown. I expect the average selling price to start declining again once the world turns the corner on the pandemic.
3. Strictly speaking there's no reason why we need a token instead of just allowing anonymous access to the API -- the endpoints for our tilers' websites might as well be public. This is a legacy decision made by previous developers, and while we want to fix it, it's a surprising amount of work. There isn't much benefit to doing that work either, given the solution we developed for our blocking network requests.