What does it mean to “render” a webpage? This may sound like a simple question, but when you dive into the technical details you begin to realize how much work a browser does in an incredibly short amount of time. Knowing more about this process will allow you to make better decisions when it comes to optimizing the performance of your site.
While external factors often come into play, much of the responsibility for laying the foundation of a smooth experience rests on the shoulders of the developer. Here at Something Digital, we’ve found that many people don’t have a full understanding of the process a webpage takes to go from files on a server to a complete page in your browser – so it may be beneficial to gain a better understanding of what’s going on behind the scenes.
From a technical standpoint, the process for loading a webpage can be broken into four stages: navigation, parsing, rendering, and interaction. Let’s break each one of those steps down in detail.
First, the browser needs to retrieve all the necessary files from a remote server to create the initial page. This is limited by factors such as the end user’s internet speed and network latency. It is for this reason that data centers are often spread all over the world, and CDNs are often used to deliver content from a number of potential locations, depending on whichever is geographically closest.
“Time to first byte” (or TTFB) is a common measurement of the time it takes the browser to receive the first byte of a response, and should ideally occur in a half second or less.
Once the browser has all the necessary files, they can begin to be interpreted. This is when the files are read and the structure of the website begins to take shape through what’s known as lexical parsing and syntax analysis.
The Lexer breaks up code into “tokens” that can be easily processed, stripping out white space and other unnecessary characters. Each token is then passed to the syntax analyzer to apply language-specific syntax rules and added to the parse tree. In the event that syntax errors are found, this is where a runtime exception will be thrown.
Rendering is the multi-step process in which the content of the page begins to become visible to the user. This can be a relatively expensive task for the browser to perform, depending on the complexity of the styles and animations being rendered.
First the DOM and CSSOM are combined to create the “Render Tree” by traversing the DOM nodes and finding the appropriate CSSOM rules that apply to them. This only includes nodes that will occupy space in the layout, so if an element has display: none, it will be omitted from this tree.
Next the layout stage computes the exact size and positions of each node within the layout by creating a box model, and reserves that space on the page. This is also commonly referred to as “reflow”.
Content is then displayed to the screen through a process called “painting”. More complicated styles such as drop shadows are more memory intensive to compute and render, and may take longer to be painted than a solid color. The first meaningful paint represents the moment when the user is able to see meaningful content on your webpage for the first time.
Since parts of the page may have been drawn to different layers, compositing is the process in which the GPU is used to “flatten” these layers, ensuring the order of elements on the page remains correct.
Actions such as adding / removing elements from the DOM, changing inline styles, or resizing the window will cause additional reflow, since the browser needs to perform the above process again to calculate the new positions of elements. This is a user-blocking operation and should be avoided as much as possible as it can lead to an unpleasant user experience.
As images begin to be displayed, the smaller more compressed images will typically appear first. Newer image formats (such as WebP) also promise to reduce load times by more efficiently encoding the images, although not all formats are supported by modern browsers.
Finally, the interaction step is when the user can begin browsing and using the page. A page is considered “fully interactive” when all previous steps have completed and users can begin to scroll, type, and interact with elements on the page.
First CPU Idle represents the point at which the page is minimally interactive – meaning it has loaded enough information for it to be able to handle a user’s input. Most, but not all of the UI is interactive and the page responds to user input in a reasonable amount of time.
As animations begin to run, 60 frames per second is usually the target for a smooth frame rate – and anything less than that starts to become noticeable as “jank”. With lower frame rates, scrolling can appear choppy or become unresponsive altogether, whereas higher frame rates are usually indicative of a site’s overall responsiveness.
The process a browser takes to construct a webpage is far more complex than one might think. With each step comes more opportunities for developers to make decisions that can provide a smoother user experience and have fundamental impacts on increasing conversion rates.
Something Digital’s developers are passionate about getting your online store to be as fast as it can be. If that sounds like something you’d be interested in, reach out.
Written by: Alex Lyon, Front End Programmer