As a small step towards regaining my momentum for working on Thor, I've begun to hook up my current stroke extraction mechanism to a storage backend. Since this is fundamentally a database of strokes, and I want to be able to do fast queries to eliminate strokes that are clearly irrelevant, MySQL seemed like the best choice in terms of performance (versus what I could write), ease of integration and ease of deployment. Plus this gets me a distributed architecture for free, i.e. there's nothing stopping me from having multiple capture clients all dumping data into a centralized server. The C API is pretty simple (and looks very familiar after having worked with the Perl DBI module), and works pretty much as advertised under Mac OS X and with Xcode. As of now, I can save each image's captured strokes into a table, which is a decent enough start.
To generate the INSERT query that would save the aforementioned data, I needed to generate some formatted strings. The STL string class is a bit lackluster in this area, but Boost seems to have a nice collection of classes for this, as well as a bunch of other tools. Getting it to run on Mac OS X was just a matter of downloading it and its build system (I was afraid that it would be a Linux-like stack of dependencies that's several levels deep, but my fears were unwarranted).
As I was doing this I realized that I haven't really given much thought to how strokes should be organized. Right now, I separate each delta frame into (pre-thinning) contiguous connected sub-images, and extract strokes from each one of those. Separating things spatially is a good start, but is a bit too brittle. There are cases like loops where the pen pressure varies slightly, resulting in breaks. Since the separation between such strokes is very small (on the level of one pixel), it seems like there should be some joining. Also, grouping only those strokes that are actually touching may be a bit too stringent. The Flatland paper uses bounding boxes, which is a much better idea.
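To make the bounding-box idea concrete, here is a rough sketch of one way it could work: treat two strokes as belonging together when their bounding boxes come within some small margin of each other, and merge such pairs transitively with a union-find. This is only my reading of the idea, not necessarily how Flatland does it, and the `Box` struct, the function names, and the `margin` parameter are all hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Axis-aligned bounding box of one stroke, in inclusive pixel coordinates.
struct Box {
    int min_x, min_y, max_x, max_y;
};

// True if the two boxes overlap, or if the coordinate gap between them
// is at most `margin` on both axes. One empty pixel between two boxes
// is a coordinate gap of 2.
bool boxes_near(const Box& a, const Box& b, int margin) {
    return a.min_x - margin <= b.max_x && b.min_x - margin <= a.max_x &&
           a.min_y - margin <= b.max_y && b.min_y - margin <= a.max_y;
}

// Union-find root lookup with path halving.
std::size_t find_root(std::vector<std::size_t>& parent, std::size_t i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Returns one label per stroke; strokes with equal labels are in the
// same group. Simple O(n^2) pairwise comparison.
std::vector<std::size_t> group_strokes(const std::vector<Box>& boxes, int margin) {
    std::vector<std::size_t> parent(boxes.size());
    std::iota(parent.begin(), parent.end(), std::size_t{0});
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        for (std::size_t j = i + 1; j < boxes.size(); ++j) {
            if (boxes_near(boxes[i], boxes[j], margin)) {
                std::size_t ri = find_root(parent, i);
                std::size_t rj = find_root(parent, j);
                if (ri != rj) parent[rj] = ri;
            }
        }
    }
    std::vector<std::size_t> label(boxes.size());
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        label[i] = find_root(parent, i);
    }
    return label;
}
```

With a margin of 2, two strokes separated by a one-pixel break end up in the same group, while a margin of 0 only joins strokes whose boxes actually overlap. The pairwise loop is quadratic, which should be fine for the handful of strokes in a single delta frame.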