Programmatically accepting keyboard auto-corrections on iOS

tl;dr: To programmatically accept keyboard auto-corrections on iOS, call reloadInputViews on the first (current) UIResponder.

Quip's document editor is implemented via contentEditable. This is true not just for the desktop web version, but also for the iOS and Android apps. So far, this has been a good way of getting basic editing behavior from the browser for “free” on all platforms while also having enough control to customize it for Quip's specific needs.

One area where mobile editing behavior differs from the desktop is in the interaction with the auto-correction mechanisms that on-screen keyboards have. Normally auto-corrections are transparent to web content, but Quip needs to override the behavior of some key events, most notably for the return key. Since the return key also accepts the auto-correction, we needed a way to accept the auto-correction without actually letting the key event be processed by the contentEditable layer or the browser in general¹.

Kevin did some research into programmatically accepting auto-corrections, and it turned out that this could be done by temporarily swapping the firstResponder. He implemented this (though most of the editor is in JavaScript, we do judiciously punch holes² to the native side where needed) and all was well.

However, a few months later, when we started to test Quip with the iOS 7 betas, we noticed that accepting auto-corrections no longer worked. Kevin went once more unto the breach. He observed that the next/previous form element buttons that iOS places above web keyboards (that we normally hide) also had the side effect of accepting auto-corrections. He thus implemented an alternate mechanism on iOS 7 that simulated advancing to a dummy form element and then going back.

Once the initial iOS 7 release was out the door and we had some time to regroup (and I had a train ride that I could dedicate to this), I thought I would look more into this problem, to see if I could understand what was happening better. The goal was to stop having two divergent code paths, and ideally find a mechanism with fewer side effects (switching the firstResponder resulted in the keyboard being detached from the UIWebView, which would sometimes affect its scroll offset).

The first step was to better understand how the iOS 6 and 7 mechanisms worked. Stepping through them with a debugger seemed tedious, but I guessed that a notification would be sent as part of the accept happening. I therefore added a listener that logged all notifications:

[NSNotificationCenter.defaultCenter addObserverForName:nil
                                                object:nil
                                                 queue:nil
                                            usingBlock:^(NSNotification *notification) {
    NSLog(@"notification: %@, info: %@", notification.name, notification.userInfo);
}];

This logged a lot of other unrelated notifications, but there was something that looked promising:

notification: UIViewAnimationDidCommitNotification, info: {
    delegate = "<UIKeyboardImpl: 0xea33dd0; frame = (0 0; 320 216); opaque = NO; layer = <CALayer: 0xea2bd80>>";
    name = UIKeyboardAutocorrection;
}

This looks like a (private) notification that is sent when the animation that shows the auto-correction is being committed. Since committing of animations happens synchronously, whatever triggered the accept must still be on the stack. I therefore changed the listener to be more specific:

[NSNotificationCenter.defaultCenter addObserverForName:@"UIViewAnimationDidCommitNotification"
                                                object:nil
                                                 queue:nil
                                            usingBlock:^(NSNotification *notification) {
    if (notification.userInfo && [@"UIKeyboardAutocorrection" isEqualToString:notification.userInfo[@"name"]]) {
        NSLog(@"committed auto-correction animation");
    }
}];

The log statement isn't that interesting in and of itself, but it gave me a place to set a breakpoint that logs the call stack³. Now I could see how accepting auto-corrections on iOS 6 worked (where we made a dummy UITextView become the first responder). That had a stack of the form:

....
#11: UIKit`-[UIKeyboardImpl acceptAutocorrection] + 141
#12: UIKit`-[UIKeyboardImpl setDelegate:force:] + 377
#13: UIKit`-[UIKeyboardImpl setDelegate:] + 48
#14: UIKit`-[UIPeripheralHost(UIKitInternal) _reloadInputViewsForResponder:] + 609
#15: UIKit`-[UIResponder(UIResponderInputViewAdditions) reloadInputViews] + 175
#16: UIKit`-[UIResponder(Internal) _windowBecameKey] + 110
#17: UIKit`-[UIWindow _makeKeyWindowIgnoringOldKeyWindow:] + 343
#18: UIKit`-[UIWindow makeKeyWindow] + 41
#19: UIKit`+[UIWindow _pushKeyWindow:] + 83
#20: UIKit`-[UIResponder becomeFirstResponder] + 683
#21: UIKit`-[UITextView becomeFirstResponder] + 385
...

Whereas on iOS 7, where we accepted it by hijacking the next/previous form control accessory buttons, the path was:

...
#12: UIKit`-[UIKeyboardImpl acceptAutocorrection] + 197
#13: UIKit`-[UIKeyboardImpl setDelegate:force:] + 534
#14: UIKit`-[UIKeyboardImpl setDelegate:] + 48
#15: UIKit`-[UIPeripheralHost(UIKitInternal) _reloadInputViewsForResponder:] + 374
#16: UIKit`-[UIResponder(UIResponderInputViewAdditions) reloadInputViews] + 287
#17: UIKit`-[UIWebBrowserView assistFormNode:] + 265
#18: UIKit`-[UIWebBrowserView accessoryTab:] + 110
#19: UIKit`-[UIWebFormAccessory _nextTapped:] + 50
...
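
The traces above came from a debugger breakpoint. As a rough alternative (not what was used for this post), the same observer could log the stack itself via NSThread's callStackSymbols. This is only a sketch: the function name InstallAutocorrectionStackLogger is made up for illustration, and the output format will differ from the debugger's.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper, not from the original post. Returns the observer token
// so that the caller can later pass it to -[NSNotificationCenter removeObserver:].
static id<NSObject> InstallAutocorrectionStackLogger(void) {
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:@"UIViewAnimationDidCommitNotification"
                    object:nil
                     queue:nil  // nil queue: the block runs synchronously on the posting thread
                usingBlock:^(NSNotification *notification) {
        if ([@"UIKeyboardAutocorrection" isEqualToString:notification.userInfo[@"name"]]) {
            // Because the block runs synchronously, whatever triggered the
            // accept is still on this thread's stack.
            NSLog(@"committed auto-correction animation, stack:\n%@",
                  [NSThread callStackSymbols]);
        }
    }];
}
```

Since this keys off a private notification name, it only makes sense as a debugging aid, not as something to ship.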

UIKeyboardImpl's acceptAutocorrection was the holy grail, but as a private API it may not be used; what I was looking for in these stack traces was a publicly callable method. A close reading (see the makeKeyWindow and reloadInputViews frames) showed that there were (at least) two different triggers for accepting the auto-correction:

  1. The "key" UIWindow changing (the key window is the one that's receiving keyboard events)
  2. The UIResponder.reloadInputViews method (input (accessory) views are additions to the keyboard)

It therefore seemed worthwhile to try to trigger either one more directly. Looking at the UIApplication.sharedApplication.windows list, I saw that there were two windows (in addition to the main window, there was another of type UITextEffectsWindow, another private class). I could therefore simulate the key window changing by switching between them:

UIWindow *keyWindow = UIApplication.sharedApplication.keyWindow;
UIWindow *otherWindow = nil;
for (UIWindow *window in UIApplication.sharedApplication.windows) {
    if (window != keyWindow) {
        otherWindow = window;
        break;
    }
}
if (otherWindow) {
    [keyWindow resignKeyWindow];
    [otherWindow makeKeyWindow];
    [keyWindow makeKeyWindow];
}

That worked! But there was also the other approach to investigate. To call reloadInputViews, I needed to find the current first responder (it's not necessarily the UIWebView itself). I accomplished that by walking through the view hierarchy to find it. Sure enough, the first responder was a (private) UIWebBrowserView class and calling reloadInputViews on it accepted the correction.
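
A minimal sketch of that second approach follows. The helper names (QPFindFirstResponder, QPAcceptAutocorrection) are hypothetical and this is not the exact code we shipped; it only assumes that the first responder is a UIView somewhere below the view you pass in.

```objective-c
#import <UIKit/UIKit.h>

// Hypothetical helper: depth-first search of a view hierarchy for the view
// that is currently the first responder. Returns nil if there is none.
static UIView *QPFindFirstResponder(UIView *view) {
    if ([view isFirstResponder]) {
        return view;
    }
    for (UIView *subview in view.subviews) {
        UIView *responder = QPFindFirstResponder(subview);
        if (responder) {
            return responder;
        }
    }
    return nil;
}

// Hypothetical entry point: asks the first responder under rootView to reload
// its input views, which (as observed above) accepts a pending auto-correction.
// Returns NO if no first responder was found under rootView.
static BOOL QPAcceptAutocorrection(UIView *rootView) {
    UIView *firstResponder = QPFindFirstResponder(rootView);
    if (!firstResponder) {
        return NO;
    }
    [firstResponder reloadInputViews];
    return YES;
}
```

Calling something like QPAcceptAutocorrection(webView) from the return key handler would then accept the correction without the key event reaching contentEditable. Only public UIResponder and UIView methods are used here, even though the responder that gets found is an instance of a private class.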

Of the two approaches, the reloadInputViews approach seemed preferable, since it relied the least on undocumented behavior. The other approach assumed that there is always another UIWindow present, which doesn't necessarily seem safe, especially since the documentation says "an app has only one window" (it also cautions against invoking resignKeyWindow directly). Now that I knew what to search for, I could also see that reloadInputViews seems to have worked since at least the iOS 5 days. Finally, it also had the advantage of not causing any spurious scrolling due to the UIWebView (temporarily) losing the keyboard.

As you can see, I'm still learning my way around iOS programming. I'm finding that what takes the longest to learn is not the framework itself⁴. Rather, it's all of the various tricks and tools that are needed to debug perplexing or confusing behavior. I don't see any shortcuts for gaining that knowledge. Even if someone were to hand me a giant list of all UIKit gotchas, I wouldn't understand most of them, or I wouldn't be able to recall and apply them at the right time. I didn't learn web front-end development by just reading through a big list of browser quirks, and that's not how programming works in general.

  1. If the behavior that contentEditable provides is not exactly what we want, we generally prefer to bypass it altogether and implement it ourselves, instead of letting contentEditable do its thing and then patching things up. The patching is brittle, especially across browsers.
  2. See this post on our JavaScript-to-native communication mechanism.
  3. Spark Inspector has fancy-looking notification tracing support, but I haven't tried it yet.
  4. When I do need to browse around UIKit (or any other documentation set), Dash has proven indispensable.
