My own private OPML repository

I normally don't have much interest in the introspective projects that seem to abound in the weblog world (e.g. BlogShares and DayPop's Top 40), though I will admit to having a few Feedster search feeds in my aggregator. One thing that did catch my eye was Dave Winer's recent Share Your OPML site. I realize that this isn't a unique service (Feedster, for example, has its own Top 100 list), but more importantly Dave's site has an "SDK" that gives you access to the underlying OPML data, with some restrictions.

As a first step toward playing around with the data, it seems like a good idea to have a local copy that's faster to access. The most obvious place to store a local copy of the OPML directory is a MySQL database, so I whipped up a Perl script to do this. It relies on three tables: people, feeds and subscriptions. Their CREATE statements are:

CREATE TABLE `feeds` (`id` int(11) NOT NULL auto_increment, `name` blob NOT NULL, `url` blob NOT NULL, `subscribers` int(11) NOT NULL default '0', PRIMARY KEY (`id`));
CREATE TABLE `people` (`id` int(11) NOT NULL auto_increment, `name` blob NOT NULL, `url` blob NOT NULL, PRIMARY KEY (`id`));
CREATE TABLE `subscriptions` (`userID` int(11) NOT NULL default '0', `feedID` int(11) NOT NULL default '0');
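
As a rough illustration of how the three tables fit together, here is a minimal sketch. It is written in Python rather than the Perl of the actual script, and it uses an in-memory SQLite database as a stand-in for MySQL, so the column types are adapted accordingly. The helpers `get_or_create` and `record_subscription`, along with the sample names and URLs, are made up for the example; only the table and column names come from the schema above.

```python
import sqlite3

# SQLite stand-in for the MySQL schema (an assumption made for this sketch);
# table and column names match the CREATE statements in the post.
SCHEMA = """
CREATE TABLE feeds (id INTEGER PRIMARY KEY AUTOINCREMENT,
                    name BLOB NOT NULL, url BLOB NOT NULL,
                    subscribers INTEGER NOT NULL DEFAULT 0);
CREATE TABLE people (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     name BLOB NOT NULL, url BLOB NOT NULL);
CREATE TABLE subscriptions (userID INTEGER NOT NULL DEFAULT 0,
                            feedID INTEGER NOT NULL DEFAULT 0);
"""


def get_or_create(cur, table, name, url):
    """Hypothetical helper: return the id of the row with this URL, inserting it if missing."""
    # `table` is only ever "people" or "feeds" here, so interpolating it is safe.
    row = cur.execute(f"SELECT id FROM {table} WHERE url = ?", (url,)).fetchone()
    if row:
        return row[0]
    cur.execute(f"INSERT INTO {table} (name, url) VALUES (?, ?)", (name, url))
    return cur.lastrowid


def record_subscription(conn, person, person_url, feed, feed_url):
    """Hypothetical helper: link one person to one feed and keep the counter in sync."""
    cur = conn.cursor()
    user_id = get_or_create(cur, "people", person, person_url)
    feed_id = get_or_create(cur, "feeds", feed, feed_url)
    seen = cur.execute(
        "SELECT 1 FROM subscriptions WHERE userID = ? AND feedID = ?",
        (user_id, feed_id),
    ).fetchone()
    if seen is None:
        # Only count a person once per feed.
        cur.execute(
            "INSERT INTO subscriptions (userID, feedID) VALUES (?, ?)",
            (user_id, feed_id),
        )
        cur.execute(
            "UPDATE feeds SET subscribers = subscribers + 1 WHERE id = ?",
            (feed_id,),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
record_subscription(conn, "Alice", "http://example.com/alice.opml",
                    "Example Feed", "http://example.com/rss.xml")
record_subscription(conn, "Bob", "http://example.com/bob.opml",
                    "Example Feed", "http://example.com/rss.xml")

# Most-subscribed feeds first, the kind of query a local copy makes cheap.
top = conn.execute(
    "SELECT name, subscribers FROM feeds ORDER BY subscribers DESC"
).fetchall()
print(top)
```

Keeping a `subscribers` counter on `feeds` duplicates what a COUNT over `subscriptions` would give, but it makes a "top feeds" listing a single-table query.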

These tables go together with the opmlImporter.pl script. The script uses XML::DOM, which, like almost every other XML parser on this planet, requires well-formed XML. Unfortunately a few (eight out of 709) OPML feeds have unescaped ampersands, so the script attempts to patch up those cases. Generally speaking, this goes against the usual XML modus operandi of rejecting malformed input, but I've at least notified Dave about it so that things will hopefully be taken care of at the source.
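
One plausible way to patch unescaped ampersands is sketched below. This is a guess at the general approach, not the logic in opmlImporter.pl, and it is written in Python with `xml.dom.minidom` rather than Perl with XML::DOM. The function names and the sample OPML snippet are invented for the example. The idea is to try a strict parse first and, only if that fails, escape any ampersand that does not already begin an entity reference.

```python
import re
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# An ampersand that is NOT the start of a named (&amp;), decimal (&#38;),
# or hexadecimal (&#x26;) reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Replace stray '&' characters with '&amp;', leaving existing references alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


def parse_opml(text):
    """Parse strictly first; fall back to the patched text only when that fails."""
    try:
        return minidom.parseString(text)
    except ExpatError:
        return minidom.parseString(escape_bare_ampersands(text))


# Made-up OPML with two unescaped ampersands in attribute values.
broken = (
    '<opml version="1.1"><body>'
    '<outline text="News & Views" xmlUrl="http://example.com/rss?a=1&b=2"/>'
    '</body></opml>'
)

outline = parse_opml(broken).getElementsByTagName("outline")[0]
print(outline.getAttribute("text"))
print(outline.getAttribute("xmlUrl"))
```

A patch like this only covers bare ampersands. A reference to an undeclared entity such as `&nbsp;` looks well-formed to the regular expression and would still be rejected by the parser.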
