Variability Isn't the Problem
I was helping someone recently who had an issue where their Largest Contentful Paint bounced between two very different results. Most of the time, it was ~6 seconds, but every once in a while (maybe 2 out of 9 tests or so), it would be 11 seconds or more.
Their initial reaction was to get frustrated at the measurement process and tool in question—why was it messing up so frequently?
But when we looked closer, we found that they were doing some server-side pre-rendering and then reloading a bunch of the content using client-side JavaScript.
Turns out, they had a race condition.
If a certain chunk of CSS was applied before a particular JavaScript resource arrived, then for a brief moment their hero element was larger, causing their Largest Contentful Paint to fire much earlier. When things loaded in the intended order, the hero element was a bit smaller and the Largest Contentful Paint didn’t fire until a later image loaded.
The tests weren’t flawed—they were exposing a very real issue with their website.
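One way to get ahead of this sort of thing is to have your tests record which element actually triggered the Largest Contentful Paint on each run. Here’s a rough sketch using the standard PerformanceObserver API—the logging is just illustrative; in practice you’d want to ship this data back alongside your test results rather than dump it to the console:

```js
// Log every Largest Contentful Paint candidate the browser reports,
// along with the element responsible. Comparing this output across runs
// makes it obvious when LCP is firing for a different element than expected.
const observer = new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    console.log(
      'LCP candidate:',
      entry.element,                             // the element that produced this candidate
      `size: ${entry.size}`,                     // reported area of the element, in pixels
      `time: ${Math.round(entry.startTime)}ms`   // when this candidate was rendered
    );
  }
});

// `buffered: true` includes candidates that fired before the observer was created.
observer.observe({ type: 'largest-contentful-paint', buffered: true });
```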
Variability in performance data is a frequent industry complaint, and as a result, we see a lot of tools going out of their way to iron out that variability—to ensure “consistent” results. And sometimes, yeah, the tool is at fault.
But just as often, going out of our way to smooth over variability in the data is counterproductive. Variability is natural, and glossing over it hides very real issues.
Variability itself isn’t the issue. Variability without a clear way to explore it and understand why it occurs? That’s the real problem.