Understanding WebKit behavior via lldb

I recently ran into some puzzling WebKit scrolling behavior: child iframes mysteriously causing the main window to get scrolled. This was in the context of a Quip feature still under development, but I've recreated a simple test case for it, to make it easier to follow along. There are two buttons on the page, both of which dynamically create and append an <iframe> element to the page. They convey parameters to the frame via the fragment part of the URL; one button has no parameters and the other does, but they otherwise load the same content. The mysterious behavior that I was seeing was that the code path without parameters was causing the main window to scroll down (such that the iframe is at the top of the visible area).

With such a reduced test case it may already be obvious what's going on, but things were much less clear at the time that I encountered this. There were many possible causes since we had made a major frame-related infrastructure change when this started to happen. The only pattern was that it only seemed to affect WebKit-based browsers (i.e. Safari and especially our Mac app). After flailing for a while, I realized what I wanted most of all was a breakpoint. Specifically, if I could break in whatever function implemented page scrolling, then I could see what the trigger was. Some quick monkey-patching of the scrollTop window property showed that the scrolling was not directly initiated by JavaScript (indeed the bug could be reproduced entirely without JavaScript by inlining the iframe HTML directly). Therefore such a breakpoint needed to be on the native side (in WebKit itself) via lldb.
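For anyone wanting to try the same kind of check, here is a minimal sketch of property monkey-patching. It is not the exact code I used: traceWrites and its report callback are hypothetical names, and it assumes the property of interest can be redefined (in browsers, scrollTop is an accessor on Element.prototype; calls such as window.scrollTo would need a similar wrapper around the function instead).

```typescript
// Hypothetical helper: reports every write to `prop` on `target`,
// together with a stack trace, then forwards the write as before.
function traceWrites(
  target: object,
  prop: string,
  report: (value: unknown, stack: string) => void,
): void {
  const original = Object.getOwnPropertyDescriptor(target, prop);
  // Backing store, used only when the original was a plain data property.
  let stored: unknown = original !== undefined ? original.value : undefined;

  Object.defineProperty(target, prop, {
    configurable: true,
    enumerable: original?.enumerable ?? true,
    get(this: unknown): unknown {
      return original?.get ? original.get.call(this) : stored;
    },
    set(this: unknown, value: unknown): void {
      report(value, new Error().stack ?? "");
      if (original?.set) {
        original.set.call(this, value);
      } else {
        stored = value;
      }
    },
  });
}

// In a browser console, one might install it like this (untested assumption):
// traceWrites(Element.prototype, "scrollTop", (value, stack) => console.log(value, stack));
```

If the callback never fires while the page still scrolls, the scroll is coming from somewhere other than script writes to that property, which is what pointed me towards the native side.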

The first task was to attach a debugger to WebKit. It's been a few years since I've built it from source, and I didn't relish having to wait for the long checkout and build process. Unfortunately, lldb doesn't seem to want to attach to Safari, presumably because System Integrity Protection (SIP) disallows debugging of system applications. Fortunately, nightly builds of WebKit are not protected by SIP, and they exhibited the same problem. To figure out which process to attach to (web content runs in a separate process from the main application), I turned to Apple's documentation, which revealed a helpful debug option to show process IDs in the page title. Thus I was able to attach to the process rendering the problematic page:

$ lldb
(lldb) process attach --pid 15079
Process 15079 stopped
...

The next thing to figure out was what function to break in. Looking at the implementations of the scrolling DOM APIs, it appeared that they all ended up calling WebCore::RenderObject::scrollRectToVisible, so that seemed like a promising choke point.

(lldb) breakpoint set -M scrollRectToVisible
Breakpoint 1: 2 locations.

(The output reports two locations for the one breakpoint, because the name also matches WebCore::RenderLayer::scrollRectToVisible, but that turned out to be a happy accident.)

After using the continue command to resume execution and reproducing the problem, I was very happy to see that my breakpoint was triggered immediately. I could then get the stack trace that I was after:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
  * frame #0: 0x000000010753eda0 WebCore`WebCore::RenderObject::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&)
    frame #1: 0x0000000106b5da64 WebCore`WebCore::FrameView::scrollToAnchor() + 292
    frame #2: 0x0000000106b55832 WebCore`WebCore::FrameView::performPostLayoutTasks() + 386
    frame #3: 0x0000000106b59959 WebCore`WebCore::FrameView::layout(bool) + 4009
    frame #4: 0x0000000106b5d878 WebCore`WebCore::FrameView::scrollToAnchor(WTF::String const&) + 360
    frame #5: 0x0000000106b5d659 WebCore`WebCore::FrameView::scrollToFragment(WebCore::URL const&) + 57
    frame #6: 0x0000000106b39c80 WebCore`WebCore::FrameLoader::scrollToFragmentWithParentBoundary(WebCore::URL const&, bool) + 176
    frame #7: 0x0000000106b389c8 WebCore`WebCore::FrameLoader::finishedParsing() + 120
    frame #8: 0x00000001069d3e0a WebCore`WebCore::Document::finishedParsing() + 266
    frame #9: 0x0000000106bfb322 WebCore`WebCore::HTMLDocumentParser::prepareToStopParsing() + 162
    frame #10: 0x0000000106bfc1b3 WebCore`WebCore::HTMLDocumentParser::finish() + 211
    ...

It looked like WebKit had decided to scroll to an anchor, which was surprising, since I wasn't expecting any named anchors in the document. After reading through the source of WebCore::FrameView::scrollToAnchor I finally understood what was happening:

// Implement the rule that "" and "top" both mean top of page as in other browsers.
if (!anchorElement && !(name.isEmpty() || equalLettersIgnoringASCIICase(name, "top")))
    return false;

As a side effect of the infrastructure change, the frame no longer had any parameters in the fragment part of the URL, but the code that was generating the URLs would always append a #. This empty fragment identifier would thus be marked as requesting a scroll to the top of the document. Once execution continued, we would end up in the previously-mentioned WebCore::RenderLayer::scrollRectToVisible method, which recurses into the parent frame, thus scrolling the whole document.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001074e0f80 WebCore`WebCore::RenderLayer::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&)
    frame #1: 0x00000001074e143d WebCore`WebCore::RenderLayer::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&) + 1213
    frame #2: 0x00000001074e143d WebCore`WebCore::RenderLayer::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&) + 1213
    frame #3: 0x000000010753ee55 WebCore`WebCore::RenderObject::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&) + 181
    frame #4: 0x0000000106b5da64 WebCore`WebCore::FrameView::scrollToAnchor() + 292
    frame #5: 0x0000000106b55832 WebCore`WebCore::FrameView::performPostLayoutTasks()
    ...
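To make the rule quoted from scrollToAnchor concrete, here is a simplified model of it in TypeScript. This is a sketch of the behavior as I understood it, not WebKit's implementation: fragmentOf and wouldScrollToAnchor are hypothetical names, anchorNames stands in for WebKit's lookup of anchor elements, percent-decoding is ignored, and the "no # means no scroll" case is an assumption based on what I observed.

```typescript
// Returns the text after the first "#", or null when the URL has no "#" at all.
function fragmentOf(url: string): string | null {
  const hash = url.indexOf("#");
  return hash === -1 ? null : url.slice(hash + 1);
}

// Simplified model of the decision: does loading `url` request an anchor scroll?
// `anchorNames` is a stand-in for the ids / named anchors in the framed document.
function wouldScrollToAnchor(url: string, anchorNames: string[]): boolean {
  const name = fragmentOf(url);
  if (name === null) {
    return false; // assumption: without a "#" there is nothing to scroll to
  }
  const anchorFound = name !== "" && anchorNames.indexOf(name) !== -1;
  // Mirrors the quoted check: give up unless an anchor exists,
  // or the name is "" or "top" (which both mean "top of page").
  if (!anchorFound && !(name === "" || name.toLowerCase() === "top")) {
    return false;
  }
  return true;
}
```

Under this model, a URL ending in a bare # requests a scroll to the top of the frame, while one carrying parameters in the fragment (for example, a hypothetical frame.html#a=1&b=2) does not, because no anchor by that name exists. That matches the difference between the two buttons in the test case.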

The fix was then trivial (remove the # if no parameters are needed), but it would have taken me much longer to find if I had treated the browser as a black box. As a bonus, reading through the WebKit source also introduced me to the “framesniffing” attack. The guards against this attack explained why the Mac app was most affected. There, the main frame is loaded using a file:/// URL, and based on WebKit's heuristics it can access any other origin, allowing the anchor scroll request to cross the frame/origin boundary.
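The shape of the fix can be sketched roughly as follows. This is not Quip's actual code: buildFrameUrl and the key=value encoding of parameters are assumptions made for illustration. The only point is that the # separator is added when there is something to put after it, and omitted otherwise.

```typescript
// Hypothetical helper: builds the iframe URL, appending "#..." only when
// there is at least one parameter to convey in the fragment.
function buildFrameUrl(base: string, params: Record<string, string>): string {
  const pairs = Object.keys(params).map(
    (key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`,
  );
  // The buggy version effectively did `${base}#${pairs.join("&")}` unconditionally,
  // leaving a bare "#" (an empty fragment) when there were no parameters.
  return pairs.length === 0 ? base : `${base}#${pairs.join("&")}`;
}
```

With no parameters this returns the base URL untouched, so the frame's URL has no fragment at all and the empty-fragment rule never comes into play.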
