Serendipity

There are two big things left to do for this project: implementing shape contexts as an alternative method to compare against, and finding a dataset to do the comparison with. Luckily enough, solutions to both problems emerged simultaneously while I was reading one of the shape context papers, "Shape contexts enable efficient retrieval of similar shapes". In addition to describing a neat clustering/quantization approach that should help with performance (and make it easy to integrate with my system), the authors also describe how they evaluated their system. Unlike other papers I have read, they did not use a homemade dataset. Instead they used the MPEG-7 silhouette database, a 3D model database, and a standard set of images used in psychology, created by Snodgrass and Vanderwart.

When searching for this last one, I was disappointed to discover that it is not freely available (I'm sure our psychology department has a copy, but it's too much of a bother). However, I did find two other interesting-looking datasets, one of objects and one of actions. Both are hand-drawn, and perhaps in some cases a bit more complex than what would be seen on a whiteboard, but not by much. Best of all, parts of both are freely available for download, so it looks like I am all set (I will most likely use the object dataset, since its images better match expected whiteboard content). Like the shape context paper, I will apply various transformations to each image to generate several derivatives, making it easy to measure precision/recall.
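As a rough sketch of what generating those derivatives could look like (assuming strokes are stored as lists of (x, y) points; the function names and the particular rotation/scale values here are hypothetical, not the paper's):

```python
import math

def transform_stroke(points, angle=0.0, scale=1.0, dx=0.0, dy=0.0):
    """Rotate (radians), scale, then translate a stroke's points about
    the origin, producing a transformed copy for retrieval testing."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        rx = x * cos_a - y * sin_a
        ry = x * sin_a + y * cos_a
        out.append((rx * scale + dx, ry * scale + dy))
    return out

def make_derivatives(points):
    """Generate several transformed variants of one stroke; each
    variant should still be retrieved as a match to the original,
    so precision/recall can be computed against known answers."""
    variants = []
    for angle in (0.0, math.pi / 12, -math.pi / 12):  # small rotations
        for scale in (0.75, 1.0, 1.25):               # mild rescaling
            variants.append(transform_stroke(points, angle, scale))
    return variants
```

Running each derivative through the retrieval system and checking whether its source image comes back near the top gives the precision/recall numbers directly.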

I have also been doing some cleanup and tweaking to make my life easier. When viewing imported strokes I can now zoom in and out, pan, and toggle the display of points. This makes checking stroke extraction quality much easier (as opposed to having to recompile to change point display and/or size normalization). Another thing that had been bothering me was that some thinned images would not get displayed at all. This turned out to be because I always add a one-pixel padding around every stroke image, ensuring that all border pixels are white (this makes life much easier when testing pixels, since I don't have to special-case the ones on the edge). When a stroke image was as big as the original image, the padded version would end up with an x or y coordinate of -1. I'm using glDrawPixels to actually display the image, and an offscreen starting coordinate causes it to clip the entire image. Thankfully a bit of Googling turned up a glBitmap-based workaround that seems to do the trick.
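The padding step itself is simple; a minimal sketch (assuming a binary image stored as a list of rows, with 255 as the white background; `pad_image` is an illustrative name, not my actual function):

```python
def pad_image(image, white=255):
    """Surround a binary image with a one-pixel white border so every
    border pixel is guaranteed white; pixel tests can then examine all
    eight neighbours without bounds checks.  Note that the padded
    image's origin sits at (-1, -1) in the original image's coordinate
    system, which is exactly what tripped up glDrawPixels when the
    stroke filled the whole frame."""
    width = len(image[0]) if image else 0
    padded = [[white] * (width + 2)]                  # top border row
    for row in image:
        padded.append([white] + list(row) + [white])  # left/right borders
    padded.append([white] * (width + 2))              # bottom border row
    return padded
```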
