Build it up
Here are the changes that need to be made in order to get WNLIB running on Mac OS X:
- Add `./` to your path (it isn't there in the default OS X setup).
- Locate all `#include`s of `<malloc.h>` and make them refer to `<sys/malloc.h>` instead.
- Optionally, to remove a warning, change `wn_system_memory_alloc_func_type` in `wnmem.h` so that its argument is `size_t` instead of `unsigned int`. This also requires including `<stdlib.h>` and changing the type of the first parameter of `lo_substitute_alloc` in `selftest_aux.c`.
- A bunch of files `#define` their own value of `INFINITY`. This is also `#define`d in `<math.h>`, so to remove the resulting warning we can just remove the `#define` (WNLIB defines it as `WN_FHUGE` (1.0e30), while `<math.h>` uses `HUGE_VALF` (1e50f), so the two values are close enough).
- `translate_errno` in `wnio.c` seems to support two modes for converting error codes into a human-readable message. One is to use the `strerror` function (if present) and the other is to look up the error code in the `sys_errlist` system-wide global. Mac OS X seems to support both, but because of the way WNLIB tests whether `strerror` is available (checking if `linux` or `__CYGWIN__` is defined), we default to the second mode. Unfortunately the `extern` declaration that WNLIB provides doesn't match what is included in `<stdio.h>` (there's a `const` missing). The simplest fix is to make it use the `strerror` way instead, which can be accomplished by testing for the `__APPLE__` define in addition to all the others.
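As a rough illustration of the last item, the guard could be extended along these lines. This is only a sketch: the macro `SKETCH_HAVE_STRERROR` and the function `describe_errno` are made-up names, not WNLIB's, and the fallback branch just formats the number instead of consulting `sys_errlist`.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical guard: the same platform tests described above,
// with __APPLE__ added to the list.
#if defined(linux) || defined(__CYGWIN__) || defined(__APPLE__)
#  define SKETCH_HAVE_STRERROR 1
#else
#  define SKETCH_HAVE_STRERROR 0
#endif

// Hypothetical helper: turn an errno value into a readable message.
std::string describe_errno(int err) {
#if SKETCH_HAVE_STRERROR
    // strerror is standard C, so this branch needs no extern declaration.
    return std::strerror(err);
#else
    // Placeholder fallback for this sketch only (WNLIB uses sys_errlist here).
    return "error code " + std::to_string(err);
#endif
}
```

Calling `describe_errno(ENOENT)` then yields a platform-dependent message on systems where the guard matches, and the placeholder text elsewhere.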
Getting the library to build is one thing, but actually using it requires more effort. Since it is compiled with a C compiler and I'm using C++, I also had to wrap the `#include`s in an `extern "C" {}` block to keep the symbol name mangling consistent. There is a `wn_assert` in `wntrnf.c` that checks whether `total_capacity(i_capacities,len_i)` and `total_capacity(j_capacities,len_j)` are equal. Unfortunately we're dealing with floating point numbers here, and precision issues do creep in, so that assert has to be made a bit more lenient (only so many digits have to be equal). Other precision issues also appear, mostly when comparing values with zero (especially when decrementing peripheries). Replacing those with comparisons against some epsilon value allows the code to run with real-world data. More precisely, it allows `wn_trans_problem_feasible`, the first phase in solving the transport problem, to run. All this gives me is a feasible solution, but I would like an optimal one (or a close approximation thereof). That is the job of the iterative function `wn_trans_problem_simplex_improve`, which unfortunately refuses to run despite all of my attempts (various asserts fail). I have for now given up on getting WNLIB to run (although its iterative approach made it appealing, since I could presumably get better running times out of it, at the expense of precision).
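To make the two workarounds above concrete, here is a minimal sketch of an `extern "C"` wrapper and of tolerance-based comparisons. Everything in it is an assumption for illustration: `sketch_total_capacity` is a stand-in for a C function (not WNLIB's `total_capacity`), and `approx_equal`, `is_zero` and their tolerances are hypothetical choices rather than the exact changes made to `wntrnf.c`.

```cpp
#include <algorithm>
#include <cmath>

extern "C" {
// In a real build the #include lines for the C headers would go here.
// This hypothetical declaration stands in for them so the sketch
// compiles on its own.
double sketch_total_capacity(const double* capacities, int len);
}

// Stand-in definition with C linkage: sums the capacities.
extern "C" double sketch_total_capacity(const double* capacities, int len) {
    double total = 0.0;
    for (int i = 0; i < len; ++i) total += capacities[i];
    return total;
}

// True when a and b agree to within rel_tol of the larger magnitude,
// with abs_tol as a floor for values very close to zero.
bool approx_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 1e-12) {
    double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= std::max(rel_tol * scale, abs_tol);
}

// Replacement for an exact "x == 0.0" test.
bool is_zero(double x, double eps = 1e-9) {
    return std::fabs(x) <= eps;
}
```

With something like this, the equality assert becomes a check that `approx_equal` holds for the two totals, and the comparisons against zero go through `is_zero`. The right tolerances depend on the magnitude of the data, so the defaults here are only placeholders.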
To continue with my string of failures for the day, I next moved on to extracting usable data from the IAM dataset that I recently downloaded. The problem is that each image has a top portion meant for OCR, and the actual handwritten sample sits in the lower three quarters or so. Ideally I only want this second part, but since the text is of varied length, it doesn't always start in the same place. My thinking was to take advantage of the fact that three horizontal lines mark the upper and lower boundaries of these sections: by looking for them (rows with very low average values) I could see where to crop. Unfortunately the images have different intensities (in some cases the person used light pencil, in others a very thick and dark marker), so it's hard to come up with a sure-fire way of detecting the lines. This isn't as much of a failure as the WNLIB attempt, but it's still rather annoying that in the end nothing worked today.
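For reference, the row-scanning idea can be sketched roughly as follows. The `GrayImage` type, the function names and the relative threshold (`fraction` of the image's overall mean) are all assumptions made for this example. It is not the code I actually ran, and nothing here guarantees it copes with the intensity variations described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical image type: rows of 8-bit grayscale pixels, 0 = black.
using GrayImage = std::vector<std::vector<std::uint8_t>>;

// Average pixel value of each row (empty rows are treated as black).
std::vector<double> row_means(const GrayImage& img) {
    std::vector<double> means;
    means.reserve(img.size());
    for (const auto& row : img) {
        double sum = std::accumulate(row.begin(), row.end(), 0.0);
        means.push_back(row.empty() ? 0.0 : sum / row.size());
    }
    return means;
}

// Indices of rows whose mean is below `fraction` times the mean of all
// row means. A relative threshold is an assumption of this sketch; a
// fixed cutoff would be the more literal reading of "very low average".
std::vector<std::size_t> dark_rows(const GrayImage& img, double fraction = 0.5) {
    std::vector<double> means = row_means(img);
    if (means.empty()) return {};
    double overall =
        std::accumulate(means.begin(), means.end(), 0.0) / means.size();
    std::vector<std::size_t> rows;
    for (std::size_t y = 0; y < means.size(); ++y) {
        if (means[y] < fraction * overall) rows.push_back(y);
    }
    return rows;
}
```

Runs of consecutive indices returned by `dark_rows` would be the candidate separator lines, and the crop would start just below the relevant one.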