Understanding WebKit behavior via lldb #

I recently ran into some puzzling WebKit scrolling behavior: child iframes mysteriously causing the main window to get scrolled. This was in the context of a Quip feature still under development, but I've recreated a simple test case for it, to make it easier to follow along. There are two buttons on the page, both of which dynamically create and append an <iframe> element to the page. They convey parameters to the frame via the fragment part of the URL; one button has no parameters and the other does, but they otherwise load the same content. The mysterious behavior that I was seeing was that the code path without parameters was causing the main window to scroll down (such that the iframe is at the top of the visible area).

With such a reduced test case it may already be obvious what's going on, but things were much less clear at the time that I encountered this. There were many possible causes since we had made a major frame-related infrastructure change when this started to happen. The only pattern was that it only seemed to affect WebKit-based browsers (i.e. Safari and especially our Mac app). After flailing for a while, I realized what I wanted most of all was a breakpoint. Specifically, if I could break in whatever function implemented page scrolling, then I could see what the trigger was. Some quick monkey-patching of the scrollTop window property showed that the scrolling was not directly initiated by JavaScript (indeed the bug could be reproduced entirely without JavaScript by inlining the iframe HTML directly). Therefore such a breakpoint needed to be on the native side (in WebKit itself) via lldb.

The first task was to attach a debugger to WebKit. It's been a few years since I've built it from source, and I didn't relish having to wait for the long checkout and build process. Unfortunately, lldb doesn't seem to want to be attached to Safari, presumably because System Integrity Protection (SIP) disallows debugging of system applications. Fortunately, nightly builds of WebKit are not protected by SIP, and they exhibited the same problem. To figure out which process to attach to (web content runs in a separate process from the main application), Apple's documentation revealed the helpful debug option to show process IDs in page title. Thus I was able to attach to the process rendering the problematic page:

$ lldb
(lldb) process attach --pid 15079
Process 15079 stopped
...

The next thing to figure out was what function to break in. Looking at the implementations of scrolling DOM APIs it looked like they all ended up calling WebCore::RenderObject::scrollRectToVisible, so that seemed like a promising choke point.

(lldb) breakpoint set -M scrollRectToVisible
Breakpoint 1: 2 locations.

(the output says that two breakpoints are set, since it also matches WebCore::RenderLayer::scrollRectToVisible, but that turned out to be a happy accident)

After using continue command to resume execution and reproducing the problem, I was very happy to see that my breakpoint was immediately triggered. I could then get the stack trace that I was after:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
  * frame #0: 0x000000010753eda0 WebCore`WebCore::RenderObject::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&)
    frame #1: 0x0000000106b5da64 WebCore`WebCore::FrameView::scrollToAnchor() + 292
    frame #2: 0x0000000106b55832 WebCore`WebCore::FrameView::performPostLayoutTasks() + 386
    frame #3: 0x0000000106b59959 WebCore`WebCore::FrameView::layout(bool) + 4009
    frame #4: 0x0000000106b5d878 WebCore`WebCore::FrameView::scrollToAnchor(WTF::String const&) + 360
    frame #5: 0x0000000106b5d659 WebCore`WebCore::FrameView::scrollToFragment(WebCore::URL const&) + 57
    frame #6: 0x0000000106b39c80 WebCore`WebCore::FrameLoader::scrollToFragmentWithParentBoundary(WebCore::URL const&, bool) + 176
    frame #7: 0x0000000106b389c8 WebCore`WebCore::FrameLoader::finishedParsing() + 120
    frame #8: 0x00000001069d3e0a WebCore`WebCore::Document::finishedParsing() + 266
    frame #9: 0x0000000106bfb322 WebCore`WebCore::HTMLDocumentParser::prepareToStopParsing() + 162
    frame #10: 0x0000000106bfc1b3 WebCore`WebCore::HTMLDocumentParser::finish() + 211
    ...

It looked like WebKit had decided to scroll to an anchor, which was surprising, since I wasn't expecting any named anchors in the document. After reading through the source of WebCore::FrameView::scrollToAnchor I finally understood what was happening:

// Implement the rule that "" and "top" both mean top of page as in other browsers.
if (!anchorElement && !(name.isEmpty() || equalLettersIgnoringASCIICase(name, "top")))
    return false;

As a side effect of the infrastructure change, the frame no longer had any parameters in the fragment part of the URL, but the code that was generating the URLs would always append a #. This empty fragment identifier would thus be marked as requesting a scroll to the top of the document. Once execution continued, we would end up in the previously-mentioned WebCore::RenderLayer::scrollRectToVisible method, which recurses into the parent frame, thus scrolling the whole document.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001074e0f80 WebCore`WebCore::RenderLayer::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&)
    frame #1: 0x00000001074e143d WebCore`WebCore::RenderLayer::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&) + 1213
    frame #2: 0x00000001074e143d WebCore`WebCore::RenderLayer::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&) + 1213
    frame #3: 0x000000010753ee55 WebCore`WebCore::RenderObject::scrollRectToVisible(WebCore::SelectionRevealMode, WebCore::LayoutRect const&, bool, WebCore::ScrollAlignment const&, WebCore::ScrollAlignment const&) + 181
    frame #4: 0x0000000106b5da64 WebCore`WebCore::FrameView::scrollToAnchor() + 292
    frame #5: 0x0000000106b55832 WebCore`WebCore::FrameView::performPostLayoutTasks()
    ...

The fix was then trivial (remove the # if no parameters are needed), but it would have taken me much longer to find if I had treated the browser as a black box. As a bonus, reading through the WebKit source also introduced me to the “framesniffing” attack. The guards against this attack explained why the Mac app was most affected. There the main frame is loaded using a file:/// URL and based on WebKit's heuristics it can access any other origin, allowing the anchor scroll request to cross frame/origin boundary.

Disabling the click delay in UIWebView #

Historically, one of the differences that made hybrid mobile apps feel a bit “off” was that there would be lag when handling taps on UI elements with a straightforward click event handler. Libraries such as Fastclick were created to mitigate this by using raw touch events to immediately trigger the event handlers. Though they worked for basic uses, they added JavaScript execution overhead for touch events, which leads to jank.

More recently, both Chrome on Android and Safari on iOS have removed this limitation for pages that are not scalable. That was the fundamental reason why there was a delay for single taps — there was no way to know if the user was trying to do a double-tap gesture or a single tap, so the browser would have to wait after the first tap to see if another came.

I assumed that this would apply to web views embedded within apps, but I was disappointed to see that Quip's behavior did not improve on iOS 9.3 or 10.0 (we have our own Fastclick-like wrapper for most event handlers, but it didn't apply to checkboxes, and those continued to be laggy). Some more research turned up that the improvement did not apply to UIWebView (the older mechanism for embedding web views in iOS apps — WKWebView is more modern but still has some limitations and thus Quip has not migrated to it).

The WebKit blog post about the improvements included some links to the associated tracking bugs (as previously mentioned, WKWebView is entirely open source, which continues to be nice). Digging into one of the associated commits, it looked like this was a matter of tweaking the interaction between multiple UIGestureRecognizer instances. Normally the one that handles single taps must wait for the one that handles double taps to fail before triggering its action. Since the double tap one takes 350 milliseconds to determine if a tap is followed by another, it needs that long to fail for single taps. The change that Apple made was to disable this second gesture recognizer for non-scalable pages.

UIWebView is not open source, but I reasoned that its implementation must be similar. To verify this, I added a small code snippet to dump all gesture recognizers for its view hierarchy (triggered with [self dumpGestureRecognizers:uiWebView level:0]:

-(void)dumpGestureRecognizers:(UIView *)view level:(int)level {
    NSMutableString *prefix = [NSMutableString new];
    for (int i = 0; i < level; i++) {
        [prefix appendString:@"  "];
    }
    NSLog(@"%@ view: %@", prefix, view);
    if (view.gestureRecognizers.count) {
        NSLog(@"%@ gestureRecognizers", prefix);
        for (UIGestureRecognizer *gestureRecognizer in view.gestureRecognizers) {
            NSLog(@"%@   %@", prefix, gestureRecognizer);
        }
    }
    for (UIView *subview in view.subviews) {
        [self dumpGestureRecognizers:subview level:level + 1];
    }
}

This showed that the UIWebView contains a UIScrollView which in turn contains a UIWebBrowserView. That view has a few gesture recognizers, the most interesting being a UITapGestureRecognizer that requires a single touch and tap and has as the action a _singleTapRecognized selector. Sure enough, it requires the failure of another gesture recognizer that accepts two taps (it has the action set to _doubleTapRecognized, which further makes its purpose clear).

<UITapGestureRecognizer: 0x6180001a72a0; 
    state = Possible; 
    view = <UIWebBrowserView 0x7f844a00aa00>; 
    target= <(action=_singleTapRecognized:, target=<UIWebBrowserView 0x7f844a00aa00>)>; 
    must-fail = {
        <UITapGestureRecognizer: 0x6180001a7d20; 
            state = Possible; 
            view = <UIWebBrowserView 0x7f844a00aa00>; 
            target= <(action=_doubleTapRecognized:, target=<UIWebBrowserView 0x7f844a00aa00>)>; 
            numberOfTapsRequired = 2>,
        <UITapGestureRecognizer: 0x6180001a8180; 
            state = Possible; 
            view = <UIWebBrowserView 0x7f844a00aa00>; 
            target= <(action=_twoFingerDoubleTapRecognized:, target=<UIWebBrowserView 0x7f844a00aa00>)>; 
            numberOfTapsRequired = 2; numberOfTouchesRequired = 2>
    }>

As an experiment, I then added a snippet to disable this double-tap recognizer:

for (UIView* view in webView.scrollView.subviews) {
    if ([view.class.description equalsString:@"UIWebBrowserView"]) {
        for (UIGestureRecognizer *gestureRecognizer in view.gestureRecognizers) {
            if ([gestureRecognizer isKindOfClass:UITapGestureRecognizer.class]) {
                UITapGestureRecognizer *tapRecognizer = (UITapGestureRecognizer *) gestureRecognizer;
                if (tapRecognizer.numberOfTapsRequired == 2 && tapRecognizer.numberOfTouchesRequired == 1) {
                    tapRecognizer.enabled = NO;
                    break;
                }
            }
        }
        break;
    }
}

Once I did that, click events were immediately dispatched, with minimal delay. I've created a simple testbed that shows the difference between a regular UIWebView, a WKWebView and a “hacked” UIWebView with the gesture recognizer. Though the WKWebView is still a couple of milliseconds faster, things are much better.

Touch delay in various web views

Note that UIWebBrowserView is a private class, so having a reference to it may lead to App Store rejection. You may want to look for alternative ways to detect the gesture recognizer. Quip has been running with this hack for a couple of months with no ill effects. My only regret that is that I didn't think of this sooner, we (and other hybrid apps) could have had lag-free clicks for years.