Twitter PubSubHubbub Bridge
During the Twitter DDoS attacks, there was a thread on the Twitter API group about using PubSubHubbub to get low latency notifications from Twitter. This would be an alternative to the streaming API that Twitter already has. The response from a Twitter engineer wasn't all that positive, and it is indeed correct that the streaming API already exists and seems to satisfy most developers' needs.
However, my interest was piqued and I thought it might be a useful exercise to see what Twitter PubSubHubbub support could look like. I therefore decided to write a simple bridge between the streaming API and a PubSubHubbub hub. The basic idea was that there would be a simple streaming client that would in turn publish events to a hub. The basic flow would be:
(created using Kushal's Diagrammr)
I'm using FriendFeed as the PubSubHubbub client, but obviously anything else could substitute for it. The "publisher" is where the bulk of the work happens. It uses the statuses/filter streaming API method* to get notified when a user of interest has posted, and then it notifies the reference hub that there is an update. It also has a companion Google App Engine app that serves feeds for Twitter updates. This is both because the hub needs a feed to crawl and because the feed needs to have a <link rel="hub"> element, something which Twitter's own feeds don't have. Unfortunately the publisher itself can't run on App Engine since the streaming API requires long-lived HTTP connections, and App Engine will not let requests execute for more than 30 seconds. I considered using the Task Queue API to create a succession of connections, but that seemed too hacky.
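The two PubSubHubbub-specific pieces described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual publisher or App Engine app: the hub address, the feed URL, and the function names `build_publish_ping` and `add_hub_link` are placeholders made up for the example. The ping is the form-encoded POST with `hub.mode=publish` and `hub.url` that the PubSubHubbub spec defines for publishers, and the feed helper adds the `<link rel="hub">` element to an Atom feed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

ATOM_NS = "http://www.w3.org/2005/Atom"

# Assumptions for illustration only: both URLs are placeholders,
# not values taken from the post.
HUB_URL = "https://hub.example.com/"
FEED_URL = "https://feeds.example.com/twitter/some_user"


def build_publish_ping(hub_url, feed_url):
    """Build (but do not send) the POST that tells a hub a feed has changed."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def add_hub_link(atom_xml, hub_url):
    """Return the Atom feed with a <link rel="hub"> element added."""
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
    feed = ET.fromstring(atom_xml)
    link = ET.Element("{%s}link" % ATOM_NS, {"rel": "hub", "href": hub_url})
    feed.insert(0, link)  # place the hub link ahead of the other feed children
    return ET.tostring(feed, encoding="unicode")


# The publisher would build a ping each time the streaming API reports a new status.
ping = build_publish_ping(HUB_URL, FEED_URL)

# The companion app would serve a feed that advertises the hub.
feed_with_hub = add_hub_link(
    '<feed xmlns="http://www.w3.org/2005/Atom"><title>some_user</title></feed>',
    HUB_URL,
)
```

Sending the ping would then be a matter of passing `ping` to `urllib.request.urlopen`. The hub reacts by crawling the feed named in `hub.url`, which is why the feed has to exist and carry the hub link.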
In any case, it all seems to work, as this screencast shows:
On the right is the Twitter UI where messages are posted. In the middle is the publisher which receives these messages and relays them to the hub. On the left is FriendFeed which gets updates from the Hub.
Latency isn't great, and as mentioned in the group thread, Twitter could have to deal with the hub being slow. Part of the reason latency isn't great is that the hub has to crawl the feed to get at the update, even though the publisher already knows exactly what the update is. This could be fixed by running a custom hub (possibly even by Twitter; see the "hub can be integrated into the publisher's content management system" option), with the flow becoming something like:
In the meantime, here's the source to both the publisher and the app.
* This was called the "follow" method until very recently.
9 Comments
EULA though."
That's what you're doing here, right? I haven't read through the EULA, but I'd be interested in the conflict (since I'd love to build this functionality into Twitalytic).
If you're interested, Mihai, check it out:
http://friendfeed.com/openff
http://openff.org/wiki
@gina, I haven't been able to find this EULA, presumably Kalucki refers to the agreement that you sign when you get broader access to the API (I'm using the endpoint that has a maximum of 400 users to track).
was googling the very same thing you have made, and came across your post.
Any changes on implementing this now or has everything remained the same?
also (noob question alert), why does the publisher send to both the hub and the app?
As for this getting made, I haven't seen any change in the EULA/ToS from Twitter, but I haven't followed this that closely either.