Today I spoke at Halfstack Online, an online version of the Halfstack London conference I have spoken at for the past 5 years. The title of my talk was Programmatically Performant, a talk all about how, as developers, we should spend more time capturing web performance metrics from our sites so we can make data-informed decisions about how to improve them. Having given the talk, I realised it might be useful to write a blog post on this topic as well, and that is how this post was born.

The different kinds of data

Web performance data about your site can be split into two kinds.

The first type of data we will collect is synthetic data. Synthetic data, as the name suggests, is data captured in a lab-like setting; usually this means running your tests from a server in an environment with a consistent internet connection. Quite often, as developers, we will run these on platforms like AWS or Google Cloud Platform.

The second type of data we will look at collecting is Real User Metrics, often abbreviated as RUM. Real User Metrics are captured in the user's browser and reported back to your server using JavaScript.

For both kinds of data there are 5 key metrics we should be looking at:

  • Time to first byte (TTFB) - The time that it takes for a browser to receive the first byte of content
  • First Input Delay (FID) - The time from a user first interacting with a page to the time when the browser first responds
  • First Contentful Paint (FCP) - The time until the first point a user can see something on their screen
  • Largest Contentful Paint (LCP) - The time until when the page’s main content has likely loaded
  • Cumulative Layout Shift (CLS) - A measure of how much the layout shifts unexpectedly; this metric will highlight whether you have a problem with how your website loads assets.

Collecting Synthetic Data

To collect synthetic data we need an environment that will give us repeatable results; usually this means a server we control.

One way we can do this is to run Chrome in a Docker container. The way I do this is to use the Browserless Docker image available from https://github.com/browserless/chrome.
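If you want to try this locally and have Docker installed, a minimal way to start the image is shown below. The host port 32768 is an arbitrary choice on my part; it just needs to match the port used in the fetch examples that follow.

```shell
# Start the Browserless Chrome container in the background.
# It listens on port 3000 inside the container; here we publish it
# on host port 32768 to match the code examples in this post.
docker run -d -p 32768:3000 browserless/chrome
```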

One of the cool things about the Browserless Docker image is that it exposes an API endpoint that allows you to run Lighthouse audits against a website. To do this we simply call the /stats endpoint on the Docker image with a body containing the URL. The example below shows this, with the response being logged to the console.

const fetch = require('node-fetch');

// Connecting to Lighthouse on Chrome docker image provided by http://browserless.io
const results = await fetch('http://localhost:32768/stats', {
  method: 'post',
  body: JSON.stringify({
    url: 'https://www.jonathanfielding.com',
  }),
  headers: { 'Content-Type': 'application/json' },
}).then((r) => r.json());

console.log(results)

Having got our Lighthouse audit back from Browserless we can now start to pick out the metrics we want to keep track of.

// Note: Largest Contentful Paint and Cumulative Layout Shift are not
// exposed in this Lighthouse report, so we track FCP, TTFB and FID
console.log('FCP', results.audits['first-contentful-paint'].numericValue);
console.log('TTFB', results.audits['time-to-first-byte'].numericValue);
console.log('FID', results.audits['max-potential-fid'].numericValue);

While running in a synthetic environment will provide consistency, there can still be some natural variance between tests. To reduce the impact this has on our metrics we can run multiple tests and average the results. In the example below I have refactored the previous code examples to make it easy to run multiple tests.

const fetch = require('node-fetch');

async function runLighthouse(url) {
  return fetch('http://localhost:32768/stats', {
    method: 'post',
    body: JSON.stringify({
      url,
    }),
    headers: { 'Content-Type': 'application/json' },
  }).then((r) => r.json());
}

function getNumericValueAverage(results, key) {
  return results.reduce((acc, result) => {
    return acc + result.audits[key].numericValue;
  }, 0) / results.length;
}

const results = await Promise.all([
  runLighthouse('https://www.jonathanfielding.com'),
  runLighthouse('https://www.jonathanfielding.com'),
  runLighthouse('https://www.jonathanfielding.com'),
]);

const ttfb = getNumericValueAverage(results, 'time-to-first-byte');
const fcp = getNumericValueAverage(results, 'first-contentful-paint');
const fid = getNumericValueAverage(results, 'max-potential-fid');

console.log('Time to first byte', ttfb);
console.log('First contentful paint', fcp);
console.log('First input delay', fid);

Hopefully these short examples help you understand how to capture these metrics. Having collected this data, we now want to store it.

For this I am going to use BigQuery, a data warehouse product from Google. It allows us to store and query our performance data very cheaply, with a generous free tier to help you get started.

Building on top of our last example, we want to store the data in the ttfb, fcp and fid variables in BigQuery, along with a timestamp so we know when the test was run. We use the Google BigQuery npm module to connect to BigQuery, create the rows we want to insert, and then insert them.

const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const rows = [{
  timestamp: new Date(),
  ttfb: getNumericValueAverage(results, 'time-to-first-byte'),
  fcp: getNumericValueAverage(results, 'first-contentful-paint'),
  fid: getNumericValueAverage(results, 'max-potential-fid'),
}];

await bigquery
  .dataset('blog')
  .table('synthetic')
  .insert(rows);

console.log('Inserted perf metrics into BigQuery');

Analysing our synthetic test data

Having stored our synthetic test data in BigQuery we can now start to analyse it.

To start with, let's get all the data from our tests.

const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const options = {
  query: 'SELECT * FROM `performance-experiments.blog.synthetic`',
  location: 'US',
};

// Run the query as a job
const [job] = await bigquery.createQueryJob(options);

// Wait for the query to finish
const [rows] = await job.getQueryResults();
rows.forEach(row => console.log(row));

As you can see, there are multiple tests on some days, so we will want to group these to see trends over time more clearly.

const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const options = {
  query: `SELECT 
    timestamp_trunc(timestamp, DAY) as date,
    AVG(ttfb) as ttfb, AVG(fcp) as fcp, AVG(fid) as fid 
    FROM \`performance-experiments.blog.synthetic\`
    group by date`,
  location: 'US',
};

// Run the query as a job
const [job] = await bigquery.createQueryJob(options);

// Wait for the query to finish
const [rows] = await job.getQueryResults();
rows.forEach(row => console.log(row));

Having grouped the data we can then start to visualise it. In the example below I am using a library called ervy, a charting library for the CLI.

const { bar } = require('ervy');
const barData = rows.map((row) => {
  return {
    key: row.date.value.split('T')[0].replace('2020-',''),
    value: Math.round(row.ttfb),
    style: '*'
  };
}).reverse();

console.log(bar(barData));

These visualisations allow us to understand the performance of the site day to day, and will allow us to see where changes we have made improve or negatively affect performance.

Synthetic data like this will allow us to see trends in our website's performance, but it will not tell us what our users are experiencing; that's why we also need to look at Real User Metrics.

Capturing Real User Metrics (RUM)

Real User Metrics are performance metrics captured in our users' browsers.

The Chrome team has released a JavaScript library called web-vitals that makes this easy.

You can use web-vitals in two ways. The first is by importing it into your project as an npm module; you would then need to build it with a bundler like Rollup or webpack.

// Start by importing the metric you want to get
import {getFCP} from 'web-vitals';

// Measure and log the current First Contentful Paint value,
// any time it's ready to be reported.
getFCP(console.log);

If you want to get started quickly, you can instead use unpkg. In the example below we log the web-vitals metrics to a div with the id log. We load web-vitals, along with a polyfill for First Input Delay, from unpkg and then use them in our code.

<div id="log"></div>
<script src="https://unpkg.com/first-input-delay/src/first-input-delay.js"></script>
<script src="https://unpkg.com/web-vitals/dist/web-vitals.es5.umd.min.js"></script>
<script>
  const $log = document.querySelector('#log');
  const log = (newLog) => {
    $log.innerHTML += `<pre>${JSON.stringify(newLog, null, 4)}</pre>`;
  };

  webVitals.getFCP(log);
  webVitals.getTTFB(log);
</script>

Once you have figured out what data to capture from your users, you need to store it somewhere you can query and analyse it.

There are many tools you could pipe this data into: you could put it into your Google Analytics data, push it to your data layer with Google Tag Manager, or use a platform like Segment, which will let you pipe the data anywhere you need it.

So that's what I am going to focus on: using Segment to pipe the data from our users' browsers into Google BigQuery (I promise this isn't a BigQuery-sponsored post 😂). To do this we first need the Segment tracking code on the page, and then we can add the code below. This will simply create an event in Segment for every metric we capture.

function log(event) {
  analytics.track(event.name, {
    value: event.value,
  });
}

webVitals.getFID(log);
webVitals.getTTFB(log);
webVitals.getCLS(log);
webVitals.getLCP(log);
webVitals.getFCP(log);

Analysing the real user metrics

Having collected the data and stored it in BigQuery, we can begin to analyse it.

The first way I am going to analyse the data is to look at the median FCP over the past 7 days. The reason I normally look at the median first is that it isn't impacted by outliers in the same way as the average.


const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const query = `SELECT 
  timestamp_trunc(timestamp, DAY) as date,
  APPROX_QUANTILES(value, 100)[OFFSET(50)] AS median
  FROM \`performance-experiments.blog.fcp\`
  group by date limit 7`;

const options = {
  query,
  location: 'US',
};

// Run the query as a job
const [job] = await bigquery.createQueryJob(options);

// Wait for the query to finish
const [rows] = await job.getQueryResults();
rows.forEach(row => console.log(row));

Having queried our data, we can then start to plot this data on a graph to see how the median changes over time.

const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const query = `SELECT 
  timestamp_trunc(timestamp, DAY) as date,
  APPROX_QUANTILES(value, 100)[OFFSET(50)] AS median
  FROM \`performance-experiments.blog.fcp\`
  group by date order by date limit 7`;

const options = {
  query,
  location: 'US',
};

// Run the query as a job
const [job] = await bigquery.createQueryJob(options);

// Wait for the query to finish
const [rows] = await job.getQueryResults();

const { bar } = require('ervy');

const barData = rows.map((row) => {
  return {
    key: row.date.value.split('T')[0].replace('2020-',''),
    value: Math.round(row.median),
    style: '*'
  };
});

console.log(bar(barData));

Along with the median, to understand what a bigger percentage of our users experience, we should look at percentiles.

Percentiles are important because they allow us to understand the curve of our performance metrics.

const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const query = `SELECT 
  timestamp_trunc(timestamp, DAY) as date,
  APPROX_QUANTILES(value, 100)[OFFSET(50)] AS median,
  APPROX_QUANTILES(value, 100)[OFFSET(75)] AS p75,
  APPROX_QUANTILES(value, 100)[OFFSET(95)] AS p95
  FROM \`performance-experiments.blog.fcp\`
  group by date limit 7`;

const options = {
  query,
  location: 'US',
};

// Run the query as a job
const [job] = await bigquery.createQueryJob(options);

// Wait for the query to finish
const [rows] = await job.getQueryResults();
rows.forEach(row => console.log(row));

We can also then visualise that data alongside our median graph and compare them.

const { bar } = require('ervy');

console.log('');
console.log('*** Median ***');
console.log('');

console.log(bar(rows.map((row) => {
  return {
    key: row.date.value.split('T')[0].replace('2020-',''),
    value: Math.round(row.median),
    style: '*'
  };
})));

console.log('');
console.log('*** 75th Percentile ***');
console.log('');

console.log(bar(rows.map((row) => {
  return {
    key: row.date.value.split('T')[0].replace('2020-',''),
    value: Math.round(row.p75),
    style: '*'
  };
})));

console.log('');
console.log('*** 95th Percentile ***');
console.log('');

console.log(bar(rows.map((row) => {
  return {
    key: row.date.value.split('T')[0].replace('2020-',''),
    value: Math.round(row.p95),
    style: '*'
  };
})));

This was just a small sample of the things you can do with the data once you have it.

Besides simply looking at the data in this way, you can also start to combine it with your other analytics data. For example, if you are selling cars (electric, of course), every time you sell a car you might capture a conversion event. You can then correlate these conversions with the performance of your site to see if there is a strong relationship between sales and performance.

In Summary

By using a combination of Synthetic Data and Real User Metrics we can start to understand the performance of our site.

The Synthetic Data will help you keep track of any changes in performance by using a controlled environment.

The Real User Metrics, meanwhile, will help you keep track of what your users are actually experiencing.

By keeping track of both kinds of data you will have a full picture of your site's performance, and you will know when you need to take time out to fix performance problems.

Are you looking for your next role?

I work as a Lead Engineer at RVU, where we are currently looking for full stack software engineers based in our London office.

Find out more