From Awesome Bar to Parser (1/n)

I recently had the chance to fix a timing bug deep in the HTML parser of Firefox. As a result of the investigation, I got to learn lots about the way that the browser efficiently loads and parses HTML as it is downloaded from the network. One of the things that has always bothered me is that I never had a handle on the process, from start to finish, of how Firefox turns a user’s text input in the Awesome Bar into a network connection, then into a stream of HTML and, ultimately, into a rendered page on the screen. In this post I will explore the first part of one of those three areas by starting to answer the following question: How does Firefox translate a URL the user enters in the Awesome Bar into network activity?

At a very, very high level, the parts of Firefox visible to the end user are built in two separate components. First, there is an engine known as Gecko that renders HTML/CSS into a visual representation and executes JavaScript. Second, there is the part with which the user interacts that controls Gecko. This so-called chrome around Gecko consists of, for instance, the history UI, the developer tools, the preferences UI, and, most important for the purposes of this post, the Awesome Bar. See the image below.

The areas of a Firefox window under the control of Gecko are shown in green; the areas of a Firefox window under the control of the chrome are shown in blue.

Known internally as the URL Bar, the Awesome Bar is implemented with a model/view/controller design pattern. The controller is known as the UrlbarController and implemented in browser/components/urlbar/UrlbarController.jsm and handles the input that the user types into the Awesome Bar. So, when the user types in, say, cnn.com and then presses Enter, the UrlbarController hears those key presses and is responsible for carrying out the action that the user expects. In this case, that action is to tell Gecko to load the HTML of cnn.com and render it on the screen.

We start down the rabbit hole here, then. The switch statement in the handleKeyNavigation function of the UrlbarController is executed each time the user presses a key while the cursor is in the Awesome Bar. When the user presses Enter, the code on line 309 runs and calls the handleCommand method of the controller’s input field. The handleCommand method checks for any special actions that the user may have expected to happen based on their input. If no special actions are to be taken (e.g., using a search engine to query the internet), the handleCommand method assumes that the user typed a URL and meant to have Gecko load that website. handleCommand takes that URL from the Awesome Bar and passes control to its _loadURL method. Because the URL comes directly from user input, it is considered a trusted link. _loadURL uses openTrustedLinkIn of the window object to continue loading.
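For readers who like to see the shape of that dispatch, here is a tiny self-contained model of it. To be clear, this is a sketch and not the code in UrlbarController.jsm: FakeUrlbarInput, FakeUrlbarController, fakeWindow, looksLikeSearch and the urlbarCalls log are all hypothetical stand-ins, and the "is this a search?" rule is a made-up assumption. Only the names handleKeyNavigation, handleCommand, _loadURL and openTrustedLinkIn come from the walk-through above.

```javascript
// Hypothetical model of the Enter-key path; not Firefox source code.
const urlbarCalls = [];

// Stand-in for the window object that owns openTrustedLinkIn.
const fakeWindow = {
  openTrustedLinkIn(url, where) {
    urlbarCalls.push(`openTrustedLinkIn(${url}, ${where})`);
  },
};

class FakeUrlbarInput {
  constructor(value) {
    this.value = value;
  }

  // Assumption for illustration only: input with whitespace or without a
  // dot is treated as a search rather than a URL.
  looksLikeSearch() {
    return /\s/.test(this.value) || !this.value.includes(".");
  }

  // Check for a special action first; otherwise treat the input as a URL.
  handleCommand() {
    if (this.looksLikeSearch()) {
      urlbarCalls.push(`search(${this.value})`);
      return;
    }
    this._loadURL(this.value);
  }

  // The URL came straight from the user, so it takes the trusted path.
  _loadURL(url) {
    fakeWindow.openTrustedLinkIn(url, "current");
  }
}

class FakeUrlbarController {
  constructor(input) {
    this.input = input;
  }

  // Runs for every key press; only Enter triggers a load in this model.
  handleKeyNavigation(event) {
    switch (event.key) {
      case "Enter":
        this.input.handleCommand();
        break;
      default:
        break;
    }
  }
}

const fakeController = new FakeUrlbarController(new FakeUrlbarInput("cnn.com"));
fakeController.handleKeyNavigation({ key: "c" });
fakeController.handleKeyNavigation({ key: "Enter" });
console.log(urlbarCalls);
```

In this model, ordinary key presses fall through the switch and only Enter reaches handleCommand, which mirrors the flow described above.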

The openTrustedLinkIn method is defined in browser/base/content/utilityOverlay.js. This file contains a set of global functions that are needed throughout the implementation of the browser’s chrome. After guaranteeing the validity of its parameters, openTrustedLinkIn passes control to openUILinkIn, defined in the same file. Execution has a cup of coffee in openUILinkIn before continuing to openLinkIn. openLinkIn’s primary responsibility is to translate its where parameter into a target browser, the browser that will render the contents of the URL entered by the user. In this case, where is the string "current", which is a canonical reference to the browser in the foreground tab. After translating the where parameter to a target browser, openLinkIn calls the target browser’s loadURI function.
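The chain of helpers can be modeled in a few lines. Again, this is a hedged sketch and not the contents of utilityOverlay.js: foregroundBrowser and resolveTargetBrowser are hypothetical names, the parameter check is a guess at what "guaranteeing the validity" might look like, and only the "current" value of where is modeled.

```javascript
// Hypothetical model of the helper chain; not the code in utilityOverlay.js.

// Stand-in for the browser in the foreground tab.
const foregroundBrowser = {
  loaded: [],
  loadURI(url, params) {
    this.loaded.push(url);
  },
};

// Assumption: only "current" is modeled; other values are rejected here.
function resolveTargetBrowser(where) {
  if (where === "current") {
    return foregroundBrowser;
  }
  throw new Error(`unsupported where value in this sketch: ${where}`);
}

// Translate `where` into a target browser, then ask it to load the URL.
function openLinkIn(url, where, params) {
  const targetBrowser = resolveTargetBrowser(where);
  targetBrowser.loadURI(url, params);
}

// In this model the middle function simply passes control along.
function openUILinkIn(url, where, params) {
  openLinkIn(url, where, params);
}

// Validate the parameters (an assumed check), then continue down the chain.
function openTrustedLinkIn(url, where, params = {}) {
  if (typeof url !== "string" || url === "") {
    throw new TypeError("url must be a non-empty string");
  }
  openUILinkIn(url, where, params);
}

openTrustedLinkIn("https://cnn.com/", "current");
console.log(foregroundBrowser.loaded);
```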

The target browser’s loadURI method is bound to the global _loadURI function defined in browser/base/content/browser.js and this is the point where we can start to see the light at the end of the tunnel (sorry for mixing metaphors!). _loadURI invokes loadURI on browser’s webNavigation object. The webNavigation object is …

It turns out that finishing that sentence is harder than it seems. The webNavigation object is a property of browser. That means that JavaScript gives the implementer of browser the opportunity to write a getter function for it. And, remember, browser is just a normal web browser, right? Wrong!

Throughout the implementation of Firefox, “browser” is a term that carries several meanings. On many occasions, the reason for referring to an element as a “browser” is historical and no longer even remotely applicable. I have it on good authority that our engineers and developers apologize for this. In this context, browser is a UI element implemented as a XULFrameElement that represents the content area (i.e., where the contents of the web page are rendered) of the currently selected tab.

The normal implementation of the getter for the webNavigation property of a XULFrameElement is overridden and has very interesting behavior, to say the least. In this case, browser’s isRemoteWebBrowser value is set to true and the getter returns its remote web browser. I am not even going to pretend that I understand. Fortunately, the immensely talented and smart Mike Conley came to my rescue:

The reason is historical. The WebNavigation property is supposed to be the interface through which one can command the browser to navigate somewhere. Before multi-process Firefox, this was an interface to the underlying DocShell that was running in the parent process.
With multi-process Firefox, that interface was kept, but we overrode it so that the commands were being “remoted” to another process over the message manager.
More recently, that remote-ing is now going mostly over IPC directly in the native layer by calling methods on the BrowsingContext instead.

Private correspondence with the author.
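The getter trick itself is plain JavaScript and can be pictured with a small model. This is a sketch under loud assumptions: FakeBrowserElement and its constructor arguments are hypothetical, the flag name isRemoteWebBrowser simply follows the wording above, and the two navigation objects are dummies rather than a real DocShell or browsing context.

```javascript
// Hypothetical model of an overridden property getter; not Firefox source code.
class FakeBrowserElement {
  constructor({ isRemoteWebBrowser, localNavigation, remoteNavigation }) {
    this.isRemoteWebBrowser = isRemoteWebBrowser;
    this._localNavigation = localNavigation;
    this._remoteNavigation = remoteNavigation;
  }

  // Callers just read `browser.webNavigation`; the getter decides what they get.
  get webNavigation() {
    return this.isRemoteWebBrowser
      ? this._remoteNavigation
      : this._localNavigation;
  }
}

// Dummy navigation objects that only record which one was asked to load.
const localNav = { loadURI: (url) => `local:${url}` };
const remoteNav = { loadURI: (url) => `remote:${url}` };

const remoteBrowser = new FakeBrowserElement({
  isRemoteWebBrowser: true,
  localNavigation: localNav,
  remoteNavigation: remoteNav,
});

console.log(remoteBrowser.webNavigation.loadURI("https://cnn.com/"));
```

The caller's code is identical in both cases, which is how an interface can be kept while the thing behind it changes.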

On marches Firefox by executing the loadURI method on a remote web browser, which turns out to be a canonical browsing context (1, 2).

Here, finally, Firefox transitions from execution of code written in JavaScript to the execution of code written in C++. The journey's end draws near. A canonical browsing context is implemented by CanonicalBrowsingContext, a subclass of BrowsingContext. Both have LoadURI methods. However, their parameter lists differ, which makes them overloaded functions. Execution continues in CanonicalBrowsingContext::LoadURI before eventually continuing to BrowsingContext::LoadURI. CanonicalBrowsingContext::LoadURI creates a load state using the values of the parameters set by its caller. As its final act, CanonicalBrowsingContext::LoadURI invokes BrowsingContext::LoadURI with the newly created load state and NULL pointers for the other two parameters. Calling BrowsingContext::LoadURI in this way triggers a specific execution path.

The chrome of Firefox (along with several other subsystems of the browser) is implemented in a so-called parent process. Firefox uses one or more child processes to control the “execution” of the content in browser tabs. This post is following a journey initiated by the user from the UI, so the actions executed to this point have been done in the context of the parent process. Because of the way that CanonicalBrowsingContext::LoadURI invoked BrowsingContext::LoadURI and the fact that it is executing in the context of the parent process, BrowsingContext::LoadURI simply transfers the user’s request to load a page to the proper child process – the one controlling the “execution” of the content of the current foreground tab, which in a previous context was called the target browser.
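The hand-off from the parent process to a child process can be modeled roughly as follows. Big caveats apply: the real classes are written in C++ and rely on overloaded LoadURI methods, which JavaScript does not have, so this sketch uses two differently named methods instead. ModelBrowsingContext, ModelCanonicalBrowsingContext, loadURIWithState, fakeChildProcess and the placeholder parameters second and third are all hypothetical, and the branch taken when the conditions are not met is not covered by this post, so it is left unmodeled.

```javascript
// Hypothetical JavaScript model of C++ behavior; not Firefox source code.

// Stand-in for the child process that controls the foreground tab.
const fakeChildProcess = {
  received: [],
  receive(loadState) {
    this.received.push(loadState);
  },
};

class ModelBrowsingContext {
  constructor({ inParentProcess, childProcess }) {
    this.inParentProcess = inParentProcess;
    this.childProcess = childProcess;
  }

  // Stands in for the BrowsingContext::LoadURI overload that takes a load
  // state plus two other parameters (unnamed in the post, so placeholders).
  loadURIWithState(loadState, second = null, third = null) {
    if (this.inParentProcess && second === null && third === null) {
      // Parent process: transfer the request to the proper child process.
      this.childProcess.receive(loadState);
      return "forwarded-to-child";
    }
    return "other-path-not-modeled";
  }
}

class ModelCanonicalBrowsingContext extends ModelBrowsingContext {
  // Stands in for CanonicalBrowsingContext::LoadURI: build a load state
  // from the caller's values, then delegate with nulls for the rest.
  loadURI(uri, options = {}) {
    const loadState = { uri, ...options };
    return this.loadURIWithState(loadState, null, null);
  }
}

const canonicalContext = new ModelCanonicalBrowsingContext({
  inParentProcess: true,
  childProcess: fakeChildProcess,
});

console.log(canonicalContext.loadURI("https://cnn.com/"));
console.log(fakeChildProcess.received);
```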

As I’ve been writing, I’ve been imagining myself telling this story over a campfire to a circle of middle schoolers who thought they were going to hear a scary story. Depending on how you have felt reading this, they may or may not be disappointed by what they’ve heard so far.

Either way, this is right about where the story gets really good so it seems like a good place to pause, take a break and inhale. When I started writing, I thought that I knew the entire execution path from the research I had done trying to fix the timing bug. I was wrong. Putting this journey into words has forced me to reconsider several of my assumptions and, as a result, I’ve learned some very important lessons.

Join me again soon when we continue our trek From Awesome Bar to Parser!

I want to say a special thank you to Mike Conley for providing valuable feedback on this post. Any errors that remain are my fault, however. Try as he might, sometimes I just couldn’t be taught.
