Accidental DDoSes I Have Known #

A couple of weeks ago I was migrating some networking code in Quip's Mac app from NSURLConnection to NSURLSession when I noticed that requests were being made significantly more often than I was expecting. While this was great during the migration (since it made exercising that code path easier), it was unexpected: the data should only have been fetched once and then cached.

After some digging, it turned out that we had a bug in the custom local caching system that sits in front of our CDN (CloudFront), which we use to serve profile pictures and other non-authenticated data. Due to a catch-22 in the cache key function (it depended on the HTTP response, which is only available after a request has been made), no asset would initially be found in the local cache, and each one would incur a network request. The necessary data was then stored in memory, so the cache would work as expected for the rest of the session, but after the app was restarted all of the assets would get requested again.
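To illustrate the shape of that catch-22, here is a minimal sketch in JavaScript (hypothetical code, not Quip's actual implementation; the cache interface and key scheme are made up). Because the key incorporates response data, a cold cache can never produce a hit:

// Hypothetical sketch of the catch-22, not Quip's actual code. The cache
// key depends on a value from the HTTP response, so at the start of a
// session the key cannot be computed and the lookup always misses.
const keyData = new Map(); // response-derived key material, in memory only

function cacheKey(url) {
  const info = keyData.get(url); // unknown until we've fetched once this session
  return info ? url + ':' + info.etag : null;
}

async function loadAsset(url, cache) { // `cache` is a made-up persistent store
  const key = cacheKey(url);
  const cached = key && (await cache.get(key));
  if (cached) {
    return cached; // only reachable after a fetch earlier in this session
  }
  const response = await fetch(url); // extra network request on every cold start
  const body = await response.blob();
  keyData.set(url, { etag: response.headers.get('etag') });
  await cache.set(cacheKey(url), body);
  return body;
}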

[Chart of CloudFront requests]

It turned out that this bug had been introduced a few months prior, but since it manifested itself as a little bit of extra traffic to an external service, we didn't notice it (the only other visible manifestation would be that profile pictures would load more slowly during app startup, or be replaced with placeholders if the user happened to be offline, but we never got any reports of that).

This chart (of CloudFront requests from “Unknown” browsers, which is how our native apps are counted) shows the fix in action; the Mac app build with it was released on November 30th and was picked up by most users over the next few days.

This kind of low-grade accidental DDoS reminded me of a similar bug that I investigated a few years ago at Google, while working on Chrome Extensions. A user had reported that the Gmail extension for Chrome (which my team happened to own, since we provided it as sample code) would end up consuming a lot of memory (and eventually be terminated) if the Gmail URL that it tried to fetch data from was blocked by filtering software. After some digging, it turned out that the extension would enqueue two retries for every failed request, due to code along these lines:

var xhr = new XMLHttpRequest();

... // Send off request

function handleError() {
    ... // Schedule another request
}

xhr.onreadystatechange = function() {
    ... // Various early exits if success conditions are met

    // Reached for error responses too, so a failed request schedules one retry here...
    handleError();
};

xhr.onerror = function() {
    // ...and a second one here, since onerror also fires for the same failure.
    handleError();
};

The readystatechange event always fires, including for error states that also invoke the error event handler, so every failure scheduled two retries. This meant that the request rate would quickly escalate from a request every few minutes to almost one request per second, depending on how long the URL remained blocked. The fix turned out to be trivial, and since this was a separate package distributed via the Chrome Web Store that gets auto-updated, we could quickly push the fix out to the millions of users that had it installed.
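The actual change isn't reproduced here, but one trivial way to fix that pattern is to make sure a single failed request can schedule at most one retry (a sketch under that assumption, not the extension's real code):

// Sketch of one possible fix (not the actual change): guard handleError so
// that a request schedules at most one retry, no matter how many error
// callbacks fire for it.
function fetchFeed(url, onData, scheduleRetry) {
  var xhr = new XMLHttpRequest();
  var retried = false;

  function handleError() {
    if (retried) {
      return; // onerror and onreadystatechange can both fire for one failure
    }
    retried = true;
    scheduleRetry();
  }

  xhr.onreadystatechange = function() {
    if (xhr.readyState !== 4) {
      return; // not done yet
    }
    if (xhr.status === 200) {
      onData(xhr.responseText);
      return;
    }
    handleError(); // HTTP-level failures
  };

  xhr.onerror = handleError; // network-level failures

  xhr.open('GET', url);
  xhr.send();
}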

It then occurred to me that this would not just affect users where the Gmail URL was blocked, but any user that had spotty connectivity — any HTTP failure would result in a doubling of background requests. I then called up a requests-per-second graph of the Atom feed endpoint for Gmail (which is what the extension used), and saw that it had dropped by 20,000 requests per second over the day or so that it took for the extension update to propagate.

The upshot of all this is that Google Reader at its peak served about 10,000 requests per second; since this one fix removed twice that much traffic, my overall traffic contribution to Google ended up being net negative.

Some Observations Regarding JavaScriptCore's Supported Platforms #

[SquirrelFish logo]

JavaScriptCore (JSC) is the JavaScript engine that powers WebKit (and thus Safari). I was recently browsing through its source and noticed a few interesting things:

ARM64_32 Support

Apple Watch Series 4 uses the S4 64-bit ARM CPU, but runs it in a mode where pointers are still 32 bits (to save on the memory overhead of a 64-bit architecture). The watch (and its new CPU) were announced in September 2018, but support for the new ARM64_32 architecture was added in December 2017. That the architecture transition was planned in advance is no surprise (it has been in the works since the original Apple Watch was announced in 2015). However, it does show that JSC/WebKit is a good place to watch for future Apple ISA changes.

ARMv8.3 Support

The iPhone XS and other new devices that use the A12 CPU have significantly improved JavaScript performance compared to their predecessors. It has been speculated that this is due to the A12 supporting the ARMv8.3 instruction set, which has a new floating point instruction that operates with JavaScript rounding semantics. However, it looks like support for that instruction was only added a couple of weeks ago, after the new phones launched. Furthermore, the benchmarking that the Apple engineer did after the change landed showed that it was responsible for a 0.5% to 2% speed increase, which, while nice, does not explain most of the gain.

Further digging into the JSC source revealed that the JIT for the ARMv8.3 ISA (ARM64E in Apple's parlance) is not part of the open source components of JSC/WebKit (the commit that added it references a file in WebKitSupport, which is internal to Apple). So perhaps there are further changes for this new CPU, but we don't know what they are. It's an interesting counterpoint to the previous item: here Apple appears to want extra secrecy. As a side note, initial support for this architecture was also added several months before the announcement (and references to ARM64E showed up more than 18 months earlier), providing yet another advance notice of upcoming CPU changes.

Fuchsia Support

Googler Adam Barth (hi Adam!) added support for running JSC on Fuchsia (Google's not-Android, not-Chrome OS operating system). Given that Google already has its own JavaScript engine (V8), it's interesting to wonder why they would also want another engine running there. A 9to5Google article has the same observation, and some more speculation as to the motivation.

Google Reader: A Time Capsule from 5 Years Ago #

[Google Reader logo]

It's now been 5 years since Google Reader was shut down. As a time capsule of that bygone era, I've resurrected readerisdead.com to host a snapshot of what Reader was like in its final moments — visit http://readerisdead.com/reader/ to see a mostly-working Reader user interface.

Before you get too excited, realize that it is populated with canned data only, and that there is no persistence. On the other hand, the fact that it is an entirely static site means that it is much more likely to keep working indefinitely. I was inspired by the work that the Internet Archive has done to get old software running in a browser — Prince of Persia (which I spent hundreds of hours trying to beat) is only a click away. It seemed unfortunate that something of much more recent vintage was not accessible at all.

Right before the shutdown I had saved a copy of Reader's (public) static assets (compiled JavaScript, CSS, images, etc.) and used it to build a tool for viewing archived data. However, that required a separate server component and displayed private data. It occurred to me that I could instead achieve much of the same effect directly in the browser: the JavaScript was fetching all data via XMLHttpRequest, so it should just be a matter of intercepting all of those requests. I initially considered doing this via a Service Worker, but I realized that even a simple monkeypatch of the built-in object would work, since I didn't need anything to work offline.
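Here's a minimal sketch of the monkeypatching approach (not the actual static_reader code; the API path check and the canned-data layout are made up for illustration):

// Replace XMLHttpRequest.prototype.open so that Reader API requests are
// redirected to pre-generated static JSON snapshots. Everything else
// (images, CSS, etc.) goes through untouched.
const realOpen = XMLHttpRequest.prototype.open;

XMLHttpRequest.prototype.open = function(method, url, ...rest) {
  if (typeof url === 'string' && url.indexOf('/reader/api/') !== -1) {
    // Hypothetical layout: each API endpoint has a matching canned .json file.
    const path = new URL(url, location.href).pathname;
    url = '/canned' + path + '.json';
    method = 'GET';
  }
  return realOpen.call(this, method, url, ...rest);
};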

The resulting code is in the static_reader directory of the readerisdead project. It definitely felt strange mixing this modern JavaScript code (written in TypeScript, with a bit of async/await) with Reader's 2011-vintage script. However, it all worked out, without too many surprises. Coming back to the Reader core structures (tags, streams, preferences, etc.) felt very familiar, but there were also some embarrassing moments (why did we serve timestamps as seconds, milliseconds, and microseconds, all within the same structure?).

As for myself, I still use NewsBlur every day, and have even contributed a few patches to it. The main thing that's changed is that Twitter content is now the first thing I read in it (using pretty much the same setup I described a while back), with a few other sites that I've trained as important also getting read consistently. Everything else I read much more opportunistically, as opposed to my completionist tendencies of years past. This may just be a reflection of the decreased amount of time that I have for reading content online in general.

NewsBlur has a paid tier, which makes me reasonably confident that it'll be around for years to come. It went from 587 paid users right before the Reader shutdown announcement to 8,424 shortly after, to 5,345 now. While that's not the kind of up-and-to-the-right curve that would make a VC happy, it should hopefully be a sustainable level for the one person behind it (hi Samuel!) to keep working on it, Pinboard-style.

Looking at the other feed readers that sprang up (or got a big boost in usage) in the wake of Reader's shutdown, they all still seem to be around: Feedly, The Old Reader, FeedWrangler, Feedbin, Inoreader, Reeder, and so on. One of the more notable exceptions is Digg Reader, which was itself shut down earlier this year. But there are also new projects springing up, like Evergreen and Elytra, so I'm cautiously optimistic about the feed reading space.

Efficiently Loading Inlined JSON Data #

I wrote up a post on the Quip blog about more efficiently embedding JSON data in HTML responses. The tl;dr is that moving it out of a JavaScript <script> tag and parsing it separately with JSON.parse can significantly reduce the parse time for large data sizes.
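The post has the details, but the basic shape of the technique looks something like this (a minimal sketch; the script tag's id is made up). Keeping the payload in a non-executable tag means the JavaScript parser never has to process it as code:

// Instead of embedding a large object literal that the JS parser must handle:
//   <script>window.initialData = { /* lots of data */ };</script>
// ...the data can be shipped in a non-executable tag and parsed explicitly:
//   <script type="application/json" id="initial-data">{ "...": "..." }</script>
const dataElement = document.getElementById('initial-data');
window.initialData = JSON.parse(dataElement.textContent);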