How I Consume Twitter #

In light of the Twitter API 1.1 announcement and the surrounding brouhaha, I thought I would take a moment to document how I read Twitter, since it might all have to change [1].

It shouldn't be surprising at all that I consume Twitter in Google Reader. I do this with the aid of two tools that I've written, Bird Feeder [2] and Tweet Digest [3].

[Screenshot: Bird Feeder in Google Reader]

Bird Feeder lets you sign in with Twitter, and from that generates a "private" Atom feed out of your "with friends" timeline. It tries to be reasonably clever about inlining thumbnails and unpacking URLs, but is otherwise a very basic client. The only distinctive thing about it is that it uses a variant of my PubSubHubbub bridge prototype to make the feed update in near-realtime [4]. What makes it my ideal client is that it leverages all of Reader's capabilities: read state, tagging, search (and back in the day, sharing). Most importantly, it means I don't need to add yet another site/app to my daily routine.
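For the curious, the feed-generation step can be sketched in a few lines of Go (the language the PubSubHubbub bridge was written in; see footnote 4). This is a minimal, hypothetical illustration using only the standard library, not Bird Feeder's actual code; the Tweet type, feed title, and tag URI scheme are all made up:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

// Tweet is a pared-down, hypothetical view of a timeline item.
type Tweet struct {
	ID       string
	Username string
	Text     string
	Created  time.Time
}

// Minimal Atom types (RFC 4287), covering only what this sketch needs.
type Feed struct {
	XMLName xml.Name `xml:"http://www.w3.org/2005/Atom feed"`
	Title   string   `xml:"title"`
	Updated string   `xml:"updated"`
	Entries []Entry  `xml:"entry"`
}

type Entry struct {
	Title   string `xml:"title"`
	ID      string `xml:"id"`
	Updated string `xml:"updated"`
	Content string `xml:"content"`
}

// feedFromTimeline turns already-fetched timeline items into an Atom feed.
func feedFromTimeline(tweets []Tweet) Feed {
	feed := Feed{
		Title:   "with friends",
		Updated: time.Now().UTC().Format(time.RFC3339),
	}
	for _, t := range tweets {
		feed.Entries = append(feed.Entries, Entry{
			Title:   "@" + t.Username,
			ID:      "tag:example.com,2012:tweet/" + t.ID, // illustrative tag URI
			Updated: t.Created.UTC().Format(time.RFC3339),
			// A fancier client would inline thumbnails and unpack
			// t.co URLs here before emitting the content.
			Content: t.Text,
		})
	}
	return feed
}

func main() {
	tweets := []Tweet{
		{ID: "1", Username: "example", Text: "Hello, Reader!", Created: time.Now()},
	}
	out, err := xml.MarshalIndent(feedFromTimeline(tweets), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(xml.Header + string(out))
}
```

The real thing also inlines thumbnails and unpacks shortened URLs before emitting each entry, which this sketch skips.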

In terms of the display ~~guidelines~~ requirements, Bird Feeder runs afoul of a bunch of the cosmetic rules (e.g. names aren't displayed, just usernames), but those could easily be fixed. The rule that's more interesting is 5a: "Tweets that are grouped together into a timeline should not be rendered with non-Twitter content. e.g. comments, updates from other networks." Bird Feeder itself doesn't include non-Twitter content; it outputs a straight representation of the timeline, as served by Twitter. However, when displayed in Reader, the items representing tweets can end up intermixed with all sorts of other content.

Bird Feeder lets me (barely) keep up with the 170 accounts that I follow, which generate an average of 82 tweets per day (it's my highest volume Reader subscription). However, there are other accounts that I'm interested in which are too high-volume to follow directly. For those I use Tweet Digest, which batches up their updates into a once-a-day post. I group accounts into digests by theme using Twitter lists (so that I can add/remove accounts without having to resubscribe to the feeds). It adds up to 54 accounts posting an average of 112 tweets per day.
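The digesting itself is straightforward. Here's a minimal Go sketch of the batching idea, assuming the tweets from a list's members have already been fetched; the Tweet type and the plain-text rendering are hypothetical stand-ins, not how Tweet Digest actually formats its posts:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

type Tweet struct {
	Username string
	Text     string
	Created  time.Time
}

// dailyDigest renders the past 24 hours of tweets as a single post,
// grouped by account, so the feed gains one entry per day instead of
// one entry per tweet.
func dailyDigest(tweets []Tweet, now time.Time) string {
	cutoff := now.Add(-24 * time.Hour)
	byUser := make(map[string][]string)
	for _, t := range tweets {
		if t.Created.After(cutoff) {
			byUser[t.Username] = append(byUser[t.Username], t.Text)
		}
	}
	users := make([]string, 0, len(byUser))
	for u := range byUser {
		users = append(users, u)
	}
	sort.Strings(users)

	var b strings.Builder
	fmt.Fprintf(&b, "Digest for %s\n", now.Format("2006-01-02"))
	for _, u := range users {
		fmt.Fprintf(&b, "@%s:\n  %s\n", u, strings.Join(byUser[u], "\n  "))
	}
	return b.String()
}

func main() {
	now := time.Now()
	fmt.Print(dailyDigest([]Tweet{
		{Username: "newsbot", Text: "Headline one", Created: now.Add(-2 * time.Hour)},
		{Username: "newsbot", Text: "Headline two", Created: now.Add(-1 * time.Hour)},
		{Username: "linkbot", Text: "A link", Created: now.Add(-3 * time.Hour)},
	}, now))
}
```

Keying the grouping off Twitter lists (rather than hard-coded account names) is what makes it possible to add and remove accounts without resubscribing to the feeds.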

This approach to Twitter consumption is very bespoke, and caters to my completionist tendencies. I don't expect Twitter's official clients to ever go in this direction, so I'm glad that the API is flexible enough to allow it to work, and hopefully that will continue to be the case.

  1. Though I'm hoping that I'm such an edge case that Twitter's Brand Enforcement team won't come after me.
  2. Bird Feeder is whitelisted for Ann's and my use only. This is partly because I don't want it to attract Twitter's attention, and partly because I don't need yet another hobby project ballooning into an App Engine budget hog. However, you're more than welcome to fork the code and run your own instance.
  3. It's amazing that Tweet Digest is almost 5 years old. Time flies.
  4. This was a good excuse to learn Go. Unfortunately, though I liked what I saw, I haven't touched that code (or any other Go code) in 6 months, so nothing has stuck.

Protecting users from malware via (strict) default settings #

One of the features in Mountain Lion, Apple's newest OS X release, that has gotten quite a bit of attention is Gatekeeper. It's a security measure that, in its default configuration, allows only apps downloaded from the Mac App Store or signed with an Apple-provided (per-developer) certificate to run. This is a good security move that makes a bunch of people happy. The assumption is that, though Gatekeeper can be turned off, it's on by default, so it will be a great deterrent for malware authors. For example, here's an excerpt from John Siracusa's Mountain Lion review:

> All three of these procedures—changing a security setting in System Preferences, right-clicking to open an application, and running a command-line tool—are extremely unlikely to ever be performed by most Mac users. This is why the choice of the default Gatekeeper setting is so important.

However, a cautionary tale comes from the web security world. The same-origin policy is an inherent [1] property of the web. This means that, barring bugs, cross-site scripting (XSS) shouldn't be possible unless the host site allows it. But at the same time that scripting ability was added to browsers, the javascript: URL scheme was introduced, which allowed snippets of JavaScript to be run in the context of the current page. These could be used anywhere URLs were accepted (leading to bookmarklets), including the browser's location bar.

In theory, this feature meant that users could XSS themselves by entering and running a javascript: URL provided by an attacker. But surely no one would just enter arbitrary code given to them by a disreputable-looking site? As it turns out, enough people do. There is a class of Facebook worms that spread via javascript: URLs. They entice the user with a desired Facebook feature (e.g. get a "dislike" button) and say "all you have to do to get it is copy and paste this code into your address bar and press enter." [2] Once the user follows the instructions, the attacker is able to impersonate the user on Facebook.

If the target population is big enough, it doesn't matter what the default setting is, or how convoluted the steps are to bypass it. 0.1% of Facebook's ~1 billion users is still 1 million users. In this particular case, browser vendors are able to mitigate the attack. Chrome will strip a javascript: prefix from strings pasted into the omnibox, and I believe other modern browsers have similar protections. From the attacker's perspective, working around this involves making the "instructions" even more complicated, hopefully leading to a large drop-off in the infection success rate, and perhaps to the attack being abandoned altogether.
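As a rough sketch of that kind of mitigation (assuredly not Chrome's actual implementation), the core idea is just to drop the scheme from pasted text so the snippet becomes inert:

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizePaste drops a leading "javascript:" scheme from text pasted
// into the address bar, so a self-XSS snippet is treated as an inert
// search query or URL fragment instead of runnable code.
func sanitizePaste(input string) string {
	trimmed := strings.TrimSpace(input)
	// Compare case-insensitively, since attackers can vary the casing.
	if strings.HasPrefix(strings.ToLower(trimmed), "javascript:") {
		return trimmed[len("javascript:"):]
	}
	return trimmed
}

func main() {
	// The kind of string a worm asks its victims to paste (harmless here).
	pasted := "JavaScript:alert('self-XSS')"
	fmt.Println(sanitizePaste(pasted)) // prints alert('self-XSS'); the scheme is gone
}
```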

This isn't to say that Gatekeeper as deployed today will not work. It's just that it'll take some time before the ease-of-use/configuration and security trade-offs can be evaluated. After all, javascript: URLs were introduced in 1995, and weren't exploited in this way until 2011.

  1. So inherent that it was taken for granted and not standardized until the end of 2011.
  2. I'm guessing that it's not helpful that legitimate sites occasionally instruct users to do the same thing.