Porting Twitter Digest to Google App Engine #

I've been meaning to play around with Google App Engine for a while, and as a quick project, I decided to port Twitter Digest to it (not as exciting as Kushal's Millidunst Calculator). This looked to be pretty straightforward: the original version was already in Python, and wasn't very complicated (just a single CGI script). It did indeed end up being pretty easy; the whole process took a couple of hours.

The first step was to port the script from CGI-style invocation to the App Engine webapp framework. Then I looked into what it would take to get Python Twitter (the library I used for fetching data from Twitter) running. Switching it from urllib2 to urlfetch was pretty painless (though I don't use the posting parts of the API, so I didn't check if those work too). The other part of the library that I was relying on was its caching mechanism (since the digests are daily, there's no point in querying Twitter more often). DeWitt (the library's author) had thoughtfully put the caching functionality into a separate class, so it was easy to replace it with another one that implemented the same interface but was backed by App Engine's datastore.
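The cache swap can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the method names (Get, Set, Remove, GetCachedTime) are an assumption about the interface the library expects from its cache, DictBackedCache and fetch_with_cache are hypothetical names, and a plain dictionary stands in for the App Engine datastore so that the sketch runs anywhere.

```python
import time


class DictBackedCache(object):
    """Hypothetical replacement cache; a dict stands in for the datastore."""

    def __init__(self, clock=time.time):
        self._entries = {}  # key -> (data, time the data was stored)
        self._clock = clock

    def Get(self, key):
        entry = self._entries.get(key)
        return entry[0] if entry else None

    def Set(self, key, data):
        self._entries[key] = (data, self._clock())

    def Remove(self, key):
        self._entries.pop(key, None)

    def GetCachedTime(self, key):
        entry = self._entries.get(key)
        return entry[1] if entry else None


def fetch_with_cache(cache, url, fetch, max_age_seconds=24 * 60 * 60,
                     clock=time.time):
    """Returns cached data for url if it is fresh enough, else fetches it."""
    cached_at = cache.GetCachedTime(url)
    if cached_at is not None and clock() - cached_at < max_age_seconds:
        return cache.Get(url)
    data = fetch(url)  # fetch is any callable that takes a URL
    cache.Set(url, data)
    return data
```

Because the calling code only talks to the four methods, the storage behind them can change without the caller noticing, which is what made the swap easy.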

The result (complete with App Gallery entry) is not that exciting, in the sense that it functions identically to the original. The only issue that I've run into so far is that when there are several cache misses, the URL fetches can take long enough that the request hits App Engine's deadline. However, since the successful fetches are cached, repeating the request will eventually succeed (so if you're consuming the digest via a feed, this shouldn't be a big deal). Ideally the urlfetch functionality would also support asynchronous fetches, since it would be easy to adapt the code to fetch all user timelines in parallel.
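To illustrate what fetching timelines in parallel might look like, here is a minimal sketch that uses threads from Python's standard library rather than anything in App Engine. The names fetch_all_timelines and fetch_timeline are hypothetical; fetch_timeline stands in for whatever function retrieves one user's timeline.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_timelines(usernames, fetch_timeline, max_workers=8):
    """Fetches every user's timeline concurrently.

    Returns a dict mapping each username to its timeline.
    """
    if not usernames:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so they line up with usernames.
        timelines = list(pool.map(fetch_timeline, usernames))
    return dict(zip(usernames, timelines))
```

With enough workers, the total wait is closer to the slowest single fetch than to the sum of all fetches, which is what would help with the request deadline.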

Update on 11/23/2008: Since I've gotten some requests for the modifications to twitter-digest that I made to get it to run on App Engine, here's a patch.

Intern on the Google Reader team #

Having interns has worked out well for the Reader team. Following my blog post, we were very pleased to get Nitin Shantharam and Jason Hall to help us out with Reader development. Their stints on the team resulted in a bunch of features, and Jason is now back at Google working full-time (Nitin wasn't a slacker, he's just still in school).

We're looking for another intern or two this year. Internships generally last a couple of months to twelve weeks, are for full-time students, and would be in Google's Mountain View, California office. You can work on either Reader's backend (a C++ system for crawling millions of feeds, handling lots of items being read, shared, starred or tagged per second) or frontend (Java servers and JavaScript/AJAX-y craziness) depending on your interests and experience.

If you or anyone you know is interested in this internship, contact me at mihaip at google dot com. This page also has more general information about interning at Google.

persistent.coffee #

I've been trying my hand at latte art. Though I have a very long way to go, I've been documenting my efforts, with a hope of learning from my mistakes. Blogger's mobile support makes it pretty easy to collect pictures, and I've finally gotten around to making a decent template for the "blog."

coffee.persistent.info is the result. Technically, this isn't a Blogger template, since I just have some static HTML as the content. Instead, it uses the JSON output that Blogger's GData API supports. Rendering the page in JavaScript allows for more flexibility. I wanted to make pictures that I liked take up 4 slots (a layout inspired by TwitterPoster). This imposed additional constraints (in order to prevent overlap between sequential large pictures). The display is generally reverse-chronological starting from the top left, but images are occasionally shuffled around to prevent such overlaps. There is also a bit of interactivity: the pictures are clickable to display larger versions. To help with all this, I've been experimenting with jQuery (also on Mail Trends), and am liking it quite a bit.
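One way to get a layout with these properties is a first-fit grid packing, sketched below. This is not the site's actual code (which is JavaScript); it is a hypothetical Python illustration in which each picture goes into the first free position that can hold it, with large pictures taking a 2x2 block of slots. The function name, the column count and the tuple format are all assumptions.

```python
def layout(pictures, columns=4):
    """Places pictures on a grid, newest first.

    pictures: list of (name, is_large) tuples in reverse-chronological order.
    Returns a dict mapping name to (row, column, size), where size is the
    side length of the square block of slots that the picture occupies.
    """
    if columns < 2:
        raise ValueError("need at least 2 columns to fit a large picture")
    occupied = set()
    placements = {}
    for name, is_large in pictures:
        size = 2 if is_large else 1
        row = 0
        while name not in placements:
            # Try each column in this row where the picture would not
            # run off the right-hand edge.
            for col in range(columns - size + 1):
                cells = [(row + dr, col + dc)
                         for dr in range(size) for dc in range(size)]
                if not any(cell in occupied for cell in cells):
                    occupied.update(cells)
                    placements[name] = (row, col, size)
                    break
            row += 1
    return placements
```

For example, with 3 columns and the sequence small, small, large, small, the large picture does not fit in the last slot of the first row and drops to the next row, and the following small picture moves up to fill the gap. That is the kind of occasional departure from strict reverse-chronological order described above.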

Mail Trends #

I get a lot of email (especially at work). I'm trying an Inbox Zero-like approach in order to keep up with it. Though that's helping me to stay on top of things, I had the nagging feeling that I was probably on too many mailing lists, and that some of them were probably not worth it from a signal-to-noise ratio perspective.

Ideally something like the Reader Trends or Search History Trends page would exist for Gmail. I thought I could perhaps build it myself, but the absence of an official Gmail API deterred me. However, it occurred to me that the recently added IMAP support could act as an API of sorts. It should be easy to get just the message headers and slice and dice them to extract the stats that I was interested in.

Thus was born Mail Trends, an IMAP-based email analysis project. It can generate a bunch of tables, graphs and distributions based on time of day, senders, recipients, mailing lists, etc. To get a feel for what it can output, see the results of running it on a piece of the Enron Email Dataset. To run it over your own email, see the getting started page. As a caveat, the program currently loads everything into memory, so my run on 200,000 messages resulted in 1.6 gigabytes being used. You may want to use the --max_messages= flag to limit the dataset, at least for initial runs.
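The header-only approach can be sketched with Python's standard imaplib and email modules. This is a minimal illustration and not Mail Trends' actual code: fetch_headers and count_senders are hypothetical names, the max_messages parameter only mirrors the idea of the flag mentioned above, and the host, username and password are whatever your IMAP account uses.

```python
import email
import email.utils
import imaplib
from collections import Counter


def fetch_headers(host, username, password, mailbox="INBOX",
                  max_messages=None):
    """Yields the raw header bytes of messages in a mailbox."""
    connection = imaplib.IMAP4_SSL(host)
    try:
        connection.login(username, password)
        connection.select(mailbox, readonly=True)
        _, data = connection.search(None, "ALL")
        # Keep only the first max_messages message numbers (all if None).
        message_ids = data[0].split()[:max_messages]
        for message_id in message_ids:
            # BODY.PEEK[HEADER] asks for headers only and leaves the
            # message's read state untouched.
            _, parts = connection.fetch(message_id, "(BODY.PEEK[HEADER])")
            yield parts[0][1]
    finally:
        connection.logout()


def count_senders(raw_headers):
    """Tallies messages per sender address from raw header blocks."""
    senders = Counter()
    for raw in raw_headers:
        message = email.message_from_bytes(raw)
        _, address = email.utils.parseaddr(message.get("From", ""))
        if address:
            senders[address.lower()] += 1
    return senders
```

Since count_senders consumes the headers one at a time and keeps only the running tally, a sketch like this does not need to hold every message in memory at once.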

The project is still in its early stages, so patches and suggestions are definitely welcome (my email address is at the footer). You can also subscribe to the feed of check-ins to see changes as they are made. The plan wiki page has a very brief outline of what I'm planning on working on next.

Two Safari 3.1 Tips #

Safari 3.1 is out, and I've upgraded my Mac to it. Besides some issues with arrow keys in Reader (we're on it), it's working out well. Here are two hidden prefs that you may find useful:

defaults write com.apple.Safari IncludeInternalDebugMenu -bool true

The Develop menu that 3.1 includes is nice, but it seems to supplant the old "Debug" menu (i.e. the preference key that used to toggle it - IncludeDebugMenu - now toggles the "Develop" menu). The old menu had functionality that isn't present in the official one, most notably the "Caches" window that displayed the number of live JavaScript objects and made tracking down memory leaks much easier. If you'd like to bring back the old menu, you can use the new IncludeInternalDebugMenu key shown above.

defaults write com.apple.Safari TargetedClicksCreateTabs -bool true

First spotted on Twitter, this forces new windows to open in tabs, one feature that I missed from my Firefox days.

Decommissioning Mscape Software #

Before I became enamored with web development, I used to be a Mac software developer under the Mscape Software moniker. My first public release (clip2cicn, a helper tool for making the icon resources necessary for Kaleidoscope schemes) was almost 10 years ago, on June 26, 1998. My flagship product was Iconographer, an icon editing tool.

I haven't had time for Mscape Software for about 4 years (the last Iconographer release was in July of 2003, and I last touched the code in early 2004). The site was still up, and I kept receiving registrations (Iconographer was a $15 shareware product). As time went on, I began to feel more and more guilty about not providing any support for (paying) customers. To this end, I have finally gotten around to dismantling Mscape Software, replacing the site with a placeholder with download links for all products. I've also put up a registration code for Iconographer so that the (annoying) registration reminder can be shut off.

I've gotten some requests to open-source Iconographer, but I'm not sure I'll have time even for that. I'm not 100% sure I can build the product given the software I have on hand (it was built with CodeWarrior). Even then, the codebase shows its age (it was my first large-ish project and still uses many Classic constructs like a WaitNextEvent event loop).