Delicious to Google Bookmarks: Redux #

More than 4 years ago (!) I blogged about a simple tool I'd written to export your Delicious* bookmarks to Google Bookmarks. With the recent confusion over Delicious's future, the tool is receiving renewed interest. I had never updated it to support the V2 API (which is OAuth-based), and the blog post just kept accumulating comments from people who had merged their Delicious accounts with their Yahoo! IDs and couldn't use it anymore.

Having some free time this weekend, I decided to retrofit V2/OAuth support. The only gotcha I encountered is that the V2 API requires HTTP while the V1 API requires HTTPS; getting this wrong results in a misleading signature_invalid response. I also moved all the code to App Engine. Finally, all exported bookmarks get a "delicious-export" label, and ones that were private also get a "delicious-private" label.
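
To make the gotcha concrete, here's a hedged sketch (the endpoint path is illustrative, and real OAuth percent-encoding is slightly stricter than encodeURIComponent): the OAuth 1.0 signature base string incorporates the normalized request URL, scheme included, so signing an https:// URL when the server validates against http:// (or vice versa) yields a mismatched signature rather than an obvious scheme error.

```javascript
// The signature base string includes the normalized request URL, so the
// scheme participates in the signature.
function signatureBaseString(method, url, params) {
  var sortedParams = Object.keys(params).sort().map(function(key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');
  return [method.toUpperCase(),
          encodeURIComponent(url),
          encodeURIComponent(sortedParams)].join('&');
}

// These two produce different base strings (and thus different signatures),
// which the server reports as signature_invalid.
signatureBaseString('GET', 'http://api.del.icio.us/v2/posts/all', {oauth_nonce: 'abc'});
signatureBaseString('GET', 'https://api.del.icio.us/v2/posts/all', {oauth_nonce: 'abc'});
```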

* Or del.icio.us, as they were back then.

Update on 12/25/2010: The code is now available on GitHub.

Update on 2/17/2011: Google now has an official tool for importing from Delicious.

Visualizing DeviceOrientation Events #

I was looking to play around with the DeviceOrientation events (which are now supported by iOS devices, in addition to Chrome and Firefox), but I was having trouble picturing all the event fields, and how they would change in response to the device being moved.

DeviceOrientation events graphs

I therefore made a simple visualization of the deviceorientation and devicemotion events. It just plots the acceleration (or accelerationIncludingGravity if that's not available) and the alpha, beta and gamma fields from the events. Incidentally, to get <canvas> to render at a 1:1 pixel mapping to the iPhone 4's high resolution screen, a technique similar to the one used for images works (i.e. make the canvas's displayed size half of its actual pixel dimensions).
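
For reference, here's a minimal sketch of both pieces; plot() and the 'graph' element ID are stand-ins for the demo's actual graphing code:

```javascript
// Plot whichever acceleration data is available, plus the orientation angles.
window.addEventListener('devicemotion', function(e) {
  var accel = e.acceleration || e.accelerationIncludingGravity;
  plot('x', accel.x); plot('y', accel.y); plot('z', accel.z);
});
window.addEventListener('deviceorientation', function(e) {
  plot('alpha', e.alpha); plot('beta', e.beta); plot('gamma', e.gamma);
});

// 1:1 pixel mapping on the iPhone 4: give the canvas 2x backing pixels, but
// display it at half that size.
var canvas = document.getElementById('graph');
canvas.width = 640;
canvas.height = 480;
canvas.style.width = '320px';
canvas.style.height = '240px';
canvas.getContext('2d').scale(2, 2); // keep drawing code in CSS pixels
```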

Making git faster on large repositories on Mac OS X #

Update on 12/9/2010: Tim Harper, the maintainer of the pre-built Git OS X binaries, has switched to providing both i386 and x86_64 packages, thus there's no need to build your own anymore.

tl;dr version: Install 64-bit git binaries for Mac OS X for a performance boost when dealing with large repositories.

WebKit uses Subversion upstream, but thanks to long code review times and ChangeLog files (which must be updated for every commit, making it hard to have concurrent changes in the same checkout), using a Git checkout is necessary to maintain sanity.

Git is generally slower on Mac OS X, and this especially seems to be the case with WebKit. The most likely culprit is the repository size -- a WebKit checkout has more than 100K files (most of which are layout tests). For example, running git status at the root of the repository takes 6.58s using the standard pre-built binaries. This does a lot of stat and getdirentries calls. However, even doing this on a much smaller part of the repository wasn't as fast as it could be (compared with a Linux machine), e.g. git status WebCore/fileapi takes 1.45s, even though that directory only has 110 files.

Running git status WebCore/fileapi through dtruss showed that it was doing lots of mmaps of 32 MB chunks of the main packfile (this was after a git gc, so there was only one packfile). Thanks to pointers from Evan Martin and Junio Hamano, I found the core.packedGitWindowSize setting, which defaults to 32 MB on 32-bit platforms and 1 GB on 64-bit platforms. mmap is slower on Mac OS X, so having to do it more frequently (with a smaller window size) was problematic. Separately, it turns out that I was running 32-bit binaries, which are the only easily available ones for Mac OS X (64-bit binaries aren't provided due to concerns about bloat).
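
(An alternative I didn't pursue: since core.packedGitWindowSize is just a config setting, it should be possible to raise it on a 32-bit build too, e.g. git config core.packedGitWindowSize 512m, subject to the smaller address space.)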

After I built a 64-bit binary, git status WebCore/fileapi went from 1.45s to 0.75s, and git status from 6.58s to 2.50s (these are averages of 4 consecutive runs after doing a warmup one, so that disk cache effects should be similar).

You can either download pre-built 64-bit binaries or build them yourself. The actual change was very straightforward.

Distortion Grid using CSS 3D Transforms #

I was recently wondering if it'd be possible to re-create the hottest demo of 2000 (specifically, the Mac OS X genie effect) inside a browser. More generally, it would be neat to have a grid-based distortion system. It would certainly be possible by drawing things inside a <canvas> and then applying distortions pixel-by-pixel (e.g. in the way that The Wilderness Downtown corrected for distortion). However, I was hoping to use CSS 3D Transforms so that actual application of distortions would be hardware-accelerated in browsers that supported it (<canvas> hardware acceleration is coming soon, but isn't quite here yet).

I then came across Wonder WebKit, which reminded me that it's possible to directly specify a matrix to use via (WebKit)CSSMatrix, and also provided a port of OpenCV's getPerspectiveTransform to JavaScript.
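
To sketch how the two fit together (assuming the homography H comes back as a row-major 3x3 array, and that the element uses transform-origin: 0 0 so coordinates are element-relative):

```javascript
// Embed a 3x3 homography into the 4x4 matrix that CSS expects, leaving the
// z axis untouched. CSS matrices multiply column vectors (x, y, z, w).
function applyHomography(element, H) {
  var m = new WebKitCSSMatrix();
  m.m11 = H[0][0]; m.m21 = H[0][1]; m.m41 = H[0][2]; // x' = ax + by + c
  m.m12 = H[1][0]; m.m22 = H[1][1]; m.m42 = H[1][2]; // y' = dx + ey + f
  m.m14 = H[2][0]; m.m24 = H[2][1]; m.m44 = H[2][2]; // w' = gx + hy + i
  element.style.webkitTransform = m.toString();
}
```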

Mandrill distorted

A couple of days of hacking later, I have a demo of distortion grids in action. It requires Safari 5.0 or Chrome dev channel (or other channels with accelerated compositing enabled). There's also a short screencast showing all the features in action.

The grid control points are drawn as plain DOM nodes, and made draggable by handling mouse events (there's also basic touch event support, but multi-touch is not supported). The grid is rendered via a <canvas> overlay. The source data is divided up into tiles, each tile being defined by the four control points at its corners. When a point is moved, the perspective matrix that transforms from the source coordinates to the distorted ones is recomputed. When holding down option/alt, nearby points are also moved (less and less, with a 1.3^(Manhattan distance) decay factor). When holding down shift, other points are moved to maintain the overall aspect ratio.
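
The option/alt falloff amounts to something like this (the grid-point representation here is made up for the sketch):

```javascript
// Move every other control point by the drag delta, attenuated by
// 1.3^-(Manhattan distance in grid cells) so the effect fades with distance.
function dragNeighbors(points, dragged, dx, dy) {
  points.forEach(function(p) {
    var distance = Math.abs(p.row - dragged.row) + Math.abs(p.col - dragged.col);
    if (distance === 0) return; // the dragged point itself is handled separately
    var factor = Math.pow(1.3, -distance);
    p.x += dx * factor;
    p.y += dy * factor;
  });
}
```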

A few different data sources are supported. The most obvious is an image, which is subdivided into tiles by drawing sections of it into smaller canvases. Iframes are also supported, albeit not in a very elegant fashion. The source iframe has to be cloned (and therefore loaded) once per tile. Something like the moz-element extension would allow the iframe to be drawn into each tile without actually having to clone it. Most interestingly, movies (via the <video> tag) can also be used as a source. They are treated quite similarly to images (each frame is drawn into a canvas, and then pieces of that are drawn into tiles). Maintaining 30 frames per second doesn't seem to be a problem, since once the matrices are set up, most of the video playback and transformation can be hardware-accelerated.
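
The image case reduces to the nine-argument form of drawImage; a rough sketch:

```javascript
// Slice an image into rows x cols tiles, each backed by its own canvas.
function makeTiles(image, rows, cols) {
  var tileWidth = image.width / cols, tileHeight = image.height / rows;
  var tiles = [];
  for (var row = 0; row < rows; row++) {
    for (var col = 0; col < cols; col++) {
      var tile = document.createElement('canvas');
      tile.width = tileWidth;
      tile.height = tileHeight;
      tile.getContext('2d').drawImage(
          image,
          col * tileWidth, row * tileHeight, tileWidth, tileHeight, // source rect
          0, 0, tileWidth, tileHeight); // destination rect
      tiles.push(tile);
    }
  }
  return tiles;
}
```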

Unfortunately I couldn't quite replicate the full animated genie effect. Though it is possible to snapshot transformed tiles and use CSS Transitions to animate between them, the interpolation between matrices doesn't behave quite as expected, and seams appear. It therefore seems like animation would have to be done by hand, with control point positions being interpolated and then matrices being recomputed every frame. Especially with a finer grid, maintaining 30 fps was deemed more trouble than it was worth (a.k.a. I got lazy).

Chrome Performance Puzzler #

I was recently involved* in investigating a Chrome performance issue that I thought was worth sharing.

This page has a simple CSS 2D animation, with a ball moving back and forth (taking a second to go across the screen). At the left and right endpoints, the location fragment is updated to #left and #right. The intent is to simulate a multi-section page with transitions between sections, each section having a bookmarkable URL.

The puzzling part is that the animation skips a few frames in the middle (right as the ball is crossing the thin line). This may not be noticeable depending on your setup, so here's a video of it. This only happens in Chrome, not in Safari or WebKit nightlies.

Here are some hints as to what's going on, each revealing more and more.

  1. The jerkiness always happens 500ms into the animation (which is the halfway point in the one-second version, but one quarter of the way into the two-second version).
  2. Even though the animating area stays the same, the larger the window size, the bigger the hiccup.
  3. The inspector's timeline view for the page shows regular, small, evenly-spaced repaints, but then suddenly 500ms into the animation, there's a full screen repaint, followed by a large gap before updates resume.
  4. Taking out the location fragment update fixes the jerkiness.
  5. Chrome has a few things that happen with a 500ms delay.
  6. Watching the counter on about:histograms/Renderer4.Thumbnail is helpful.

As it turns out, what's happening is that 500ms after a load finishes, Chrome captures the current page, so that it can be shown on the New Tab Page. This includes getting a thumbnail of the page (which involves repainting all of it and then scaling it down using a high quality filter). Updating the location with the fragment triggers this logic, and the larger the window, the more time is spent painting the page and then scaling it down.

In addition to fixing this on the Chrome side, the best way to avoid the problem is to update the location at the end of a transition instead of at the beginning.
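
In other words (a sketch; the element and direction-tracking names are illustrative, and it assumes transition-driven movement):

```javascript
// Update the fragment only once the ball has finished moving, so the 500ms
// post-"load" capture happens while the page is idle.
ball.addEventListener('webkitTransitionEnd', function() {
  location.hash = movingRight ? 'right' : 'left';
}, false);
```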

* Credit for figuring out what was going on goes to James Robinson.

Bloglines Express, or How I Joined The Google Reader Team #

Since Bloglines is shutting down on November 1, I thought it might be a good time to recount how I joined the (nascent) Google Reader team thanks to a Greasemonkey script built on top of Bloglines.

It was the spring of 2005. I had switched from NetNewsWire to Bloglines the previous fall. My initial excitement at being able to get my feed fix anywhere was starting to wear off -- Bloglines was held up by some as a Web 2.0 poster child, but the site felt surprisingly primitive compared to contemporary web apps. For example, such a high-volume content consumption product begged for keyboard shortcuts, but the UI was entirely mouse-dependent. I initially started to work on some small scripts to fill in some holes, but fighting with the site's markup was tiring.

I briefly considered building my own feed reader, but actually crawling, storing and serving feed content didn't seem particularly appealing. Then I remembered that Bloglines had released an API a few months prior. The API was meant to be used by desktop apps (NetNewsWire, FeedDemon and BlogBot are the initial clients mentioned in the announcement), but it seemed like it would also work for a web app (the API provided two endpoints, one to get the list of subscriptions as OPML, and one to get subscription items as RSS 2.0).

This was also the period when Greasemonkey was really taking off, and I liked the freedom that Greasemonkey scripts provided (piggyback on someone else's site and let them do the hard work, while you focus on just the UI). However, this was before GM_xmlhttpRequest, so it looked like I'd need a server component regardless, in order to fetch and proxy data from the Bloglines API.

Then, it occurred to me that there was no reason why Greasemonkey had to inject the script into a "real" web page. If I targeted the script at http://bloglines.com/express (which is a 404) and visited that URL, the code that was injected could make same-origin requests to bloglines.com and have a clean slate to work with, letting me build my own feed reading UI.
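
The skeleton of the trick looked roughly like this (a reconstruction, not the original script; renderSubscriptions is a stand-in for the actual UI code):

```javascript
// ==UserScript==
// @name     Bloglines Express (sketch)
// @include  http://bloglines.com/express*
// ==/UserScript==

// Running on a (404) bloglines.com page means plain XMLHttpRequest is
// same-origin with the API, so no GM_xmlhttpRequest or proxy is needed.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/listsubs', true); // the subscription list, as OPML
xhr.onreadystatechange = function() {
  if (xhr.readyState !== 4) return;
  document.body.innerHTML = ''; // the 404 page is a clean slate
  renderSubscriptions(xhr.responseXML);
};
xhr.send(null);
```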

Once I had the basic framework up and running, it was easy to add features that I had wanted:

  • Gmail-inspired keyboard shortcuts.
  • Customized per-item actions, for example for finding Technorati and Feedster backlinks, or posting to Del.icio.us (cf. "send to" in Reader).
  • Specialized views for del.icio.us and Flickr feeds (cf. photo view in Reader).
  • Inline viewing of original page content (including framebuster detection).

A few weeks into this, I saw an email from Steve Goldberg saying that a feed reading project was starting at Google. I got in touch with him about joining the team, and also included a pointer to the script in its state at the time. I don't know if it helped, but it clearly didn't hurt. As it turned out, Chris Wetherell, Jason Shellen, Laurence Gonsalves and Ben Darnell all had (internal) feed reading projects in various states; Reader emerged out of our experiences with all those efforts (Chris has a few more posts about Reader's birth).

Once the Reader effort got off the ground it seemed weird to release a script that was effectively going to be a competitor, so it just sat in my home directory (though Mark Pilgrim did stumble upon it when gathering scripts for Greasemonkey Hacks). However, since Bloglines will only be up for a few more days, I thought I would see if I could resurrect the Greasemonkey script as a Chrome Extension. Happily, it seems to work:

  1. Install this extension (it requires access to all sites since it needs to scrape data from the original blogs).
  2. Visit http://bloglines.com/listsubs to enter/cache your HTTP Basic Auth credentials for the Bloglines API.
  3. Visit http://persistent.info/greasemonkey/bloglines-express/ to see your subscriptions (unfortunately I can't inject content into bloglines.com/express since Chrome's "pretty 404" kicks in).

Or if all that is too complicated, here's a screencast demonstrating the basic functionality:

For the curious, I've also archived the original version of the Greasemonkey script (it actually grew to use GM_xmlhttpRequest over time, so that it could load original pages and extra resources).

Somewhat amusingly, this approach is also roughly what Feedly does today. Though they also have a server-side component, at its core is a Firefox/Chrome/Safari extension that makes Google Reader API requests on behalf of the user and provides an alternative UI.

New Toy #

Working on client software means that it's easier to justify getting a new toy tool to work from home with. Since the Mac Pros have just been revised, I took this opportunity to upgrade from my mid-2006 model to a more recent one. After jumping through some hoops, I finally have it all set up, and out of curiosity I compared clean WebKit build times against the old machine (and my work setup):

Old Mac Pro    2x dual-core 2.66 GHz Xeon Woodcrest, 6 GB RAM      24m16s
Work Mac Pro   2x quad-core 2.26 GHz Xeon Gainestown, 12 GB RAM    10m09s
New Mac Pro    2x 6-core 2.66 GHz Xeon Gulftown, 16 GB RAM         6m15s

CPU usage

Apple actually uses WebKit compilation time as a benchmark on their own page, so it certainly does show off the many cores of the machine. Not all build steps are perfectly parallelizable though, so the CPU usage meter isn't quite as redlined all the time as the above screenshot would indicate.

Replicating Flipboard's "page fold" animation with CSS 3D transforms and transitions #

Flipboard is an iPad app that has a distinctive "page fold" animation when going from page to page (visible in their intro video). I was curious if it's possible to replicate this effect using CSS 3D transforms and transitions. It seems to be (the demo works in Safari 5.0, WebKit nightly builds, and to some degree in Chrome with accelerated compositing enabled).

The trickiest part about this was that 3D transforms are applied to the whole DOM node, but we only want to transform half of it for each fold. The best way to accomplish that for now is to clone the node to be transformed and then only show half of each clone, transforming each one separately. You can see that in a variant of the demo where "creases" are enabled (each half is outlined in a different color).
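
A rough sketch of the cloning setup (the names and inline styles here are illustrative; the real demo also needs perspective set on an ancestor):

```javascript
// Build two overflow:hidden "half" containers, each holding a full-width
// clone of the page positioned so that only its half shows, then fold by
// rotating each half around the center crease.
function makeHalves(page) {
  return ['left', 'right'].map(function(side) {
    var half = document.createElement('div');
    half.style.cssText = 'position: absolute; top: 0; ' + side + ': 0; ' +
        'width: 50%; height: 100%; overflow: hidden;';
    var clone = page.cloneNode(true);
    clone.style.position = 'absolute';
    clone.style.width = '200%'; // the full page width, relative to the half
    clone.style[side] = '0'; // align the clone so its matching half is visible
    half.appendChild(clone);
    // Fold around the shared edge in the middle of the page.
    half.style.webkitTransformOrigin = (side === 'left' ? '100%' : '0') + ' 50%';
    half.style.webkitTransform = 'rotateY(' + (side === 'left' ? -30 : 30) + 'deg)';
    return half;
  });
}
```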

Moving down the stack #

After more than 5 years (!) of working on the Google Reader team, I'm switching to working on Google Chrome, and specifically the "web platform" piece that uses WebKit.

Reader has been a lot of fun; I got a lot of things done and got to interact with a lot of users. Though Reader is by no means done, I've been having an "itch" lately to break out of my frontend development niche (where I've been working at higher and higher levels of abstraction).

I think there's still value in knowing how things work at lower levels (even if there's way too much to know), and as much as I told myself that I should skim through the Apache, Linux kernel or V8 sources to understand how they work, I never quite had the time. Now that web browsers are approaching OS levels of complexity, it seems like a good time to be forced to dive in.

Of course, the hope is that my web development background will help inform my Chrome/WebKit work, and I'm not the first to make this transition. It'll also be an interesting transition to be working on an open source project, with my (so far not very interesting) commits visible to the world.

Partychat Hacking #

I've been a user of Partychat for a few years now, both in its previous hosted-on-a-server-in-someone's-living-room implementation and in its current App Engine-based one.

Needing a fun side project (and having a few itches that needed scratching), I've spent quite a few evenings over the past month making changes to the source code. So far I've managed to:

/share command
  • Make the homepage friendlier (and slightly prettier), especially the room creation form.
  • Improve web-based room management (building on what David started), so that you can join, leave, or request an invitation from the room's web page (here's a sample room for the current developers).
  • Prettify the plusplusbot display in the web UI.
  • Add a /share command that makes it easy to share URLs and give some context.
  • Clean up some data layer issues and fix some inconsistencies.
  • And of course, quite a few bug fixes (nothing like scanning the AppEngine logs looking for NPEs).

As with all fun side projects, things can turn not-so-fun once usage picks up. For example, yesterday a user invited room A as a member of room B, and vice-versa. As an unintentional side-effect of a change that I had made to make the setup process more user-friendly, it became very easy to set off an infinite loop of messages. By the time I realized what was going on, we'd run out of App Engine bandwidth quota. Thankfully, after some budget rejiggering and some quick code changes, all was well. Similarly, a couple of weeks ago we had CPU usage issues (though optimizing that away was still fun).

It seems like chat is gaining in trendiness, with HipChat launching recently (presumably to challenge the incumbent Campfire) and Brizzly/Thing Labs's Picnics happening. Partychat will probably remain a hobby/fit-within-AppEngine-free-quota sort of project indefinitely, but it's been fun polishing it (and of course, there's still quite a few things left to fix).

And of course, if you need IRC 2.0 (a persistent chat room), you should give it a try.

Google Reader Play Bookmarklet #

It occurred to me that it'd be pretty easy to make a bookmarklet for the recently-launched Google Reader Play:

PlayThis!

All it does is take the current page's feed and display it in the Play UI. You may find this useful when discovering a new photo-heavy site (or anything else with a feed, like a Flickr user page), or when you want to share, star or like an item from a site you're not subscribed to (you can also use the regular Reader subscribe bookmarklet for that).
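
The bookmarklet amounts to a one-liner along these lines (the Play URL parameter here is an approximation rather than the exact format; Play handles the feed lookup for the page URL it's given):

```javascript
javascript:(function() {
  // Hand the current page's URL to Reader Play, which resolves it to a feed.
  location.href = 'http://www.google.com/reader/play/?url=' +
      encodeURIComponent(location.href);
})();
```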

P.S. If you're reading this in a feed/social content reader, you'll most likely have to view the original post, as the javascript: URL on the bookmarklet link will no doubt get sanitized away.