Mail Scripting: Part Deux

Now that I have determined the best (i.e. fastest) way to get Perl talking to AppleScript, it's time to actually get some data out of it. My first attempt was very simplistic: have a handler export all the data that I needed as a string (in this specific example, we're getting the list of mailboxes, but a similar approach can work for getting any sort of data out of the script):

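on «event MailGMbx»
    set output to ""
    tell application "Mail"
        -- assumed definitions: every top-level mailbox, and how many there are
        set allMailboxes to every mailbox
        set mailboxCount to count of allMailboxes
        repeat with i from 1 to mailboxCount
            -- append each name on its own line so Perl can split on "\n"
            set output to output & (name of item i of allMailboxes) & "\n"
        end repeat
    end tell
    return output
end «event MailGMbx»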

This handler can be invoked with Mac::OSA::Simple's call method, and the result can be extracted using Perl's split function. Some benchmarking revealed that this had reasonable performance for this particular case (there are never too many mailboxes), but that it degraded severely when dealing with more data. For example, getting a list of all the subjects, senders, dates, etc. of a mailbox with ~1000 messages took around 10 seconds. I initially thought that this was due to poor string handling in AppleScript (much in the same way that the String class in Java isn't meant for repeated appends, leaving string building tasks to StringBuffer instead). However, even with no appends in the loop (simply setting output to the current value), performance was similar.
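On the Perl side, extracting the result amounts to splitting the returned string on newlines. The following is only a minimal sketch of that step: parse_mailbox_names is a hypothetical helper name, and the hard-coded string with made-up mailbox names stands in for whatever the handler actually returns.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the handler's newline-delimited string
# into a list of mailbox names.
sub parse_mailbox_names {
    my ($raw) = @_;
    return () unless defined $raw && length $raw;
    # split drops trailing empty fields, so the final "\n" adds no empty
    # entry; grep discards any blank lines in the middle.
    return grep { length } split /\n/, $raw;
}

# Stand-in for the string the handler would return (made-up names).
my $raw   = "Inbox\nSent\nDrafts\n";
my @names = parse_mailbox_names($raw);
print scalar(@names), " mailboxes: @names\n";
```

Note that this assumes no mailbox name contains a newline itself; if one did, a different delimiter would be needed.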

Since Mac::OSA::Simple's call method can handle lists as result values, I tried returning the name of every mailbox directly instead. Much to my surprise (or perhaps not, since doing the iterations at a lower level was bound to be faster), that worked much better. In the mailbox case, the string building way took an average of 0.115 seconds per handler invocation, whereas returning a list directly took 0.0472 seconds. In even more demanding situations, such as the above-mentioned mailbox headers case, it was an order of magnitude faster.
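Per-invocation averages like these can be collected with a small timing harness. What follows is a sketch rather than the exact benchmark used for the numbers above: seconds_per_call is a hypothetical helper, and the workload passed to it is a stand-in for the real handler invocation.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: average wall-clock seconds per call of $code
# over $runs repetitions.
sub seconds_per_call {
    my ($code, $runs) = @_;
    die "runs must be a positive number\n" unless $runs && $runs > 0;
    my $start = Time::HiRes::time();
    $code->() for 1 .. $runs;
    return (Time::HiRes::time() - $start) / $runs;
}

# Stand-in workload; in practice the code reference would wrap the
# handler invocation being measured.
my $avg = seconds_per_call(
    sub { my @parts = split /\n/, "a\nb\nc\n" x 100 },
    1000,
);
printf "%.6f seconds per call\n", $avg;
```

Averaging over many runs smooths out one-off costs such as the first call, which is why a per-call figure is more useful here than a single measurement.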

Deviating a bit from the above, I also attempted a different method altogether. In my Googling yesterday, I came across Mac::Glue, an alternative way of bridging the Perl-AppleScript (or more correctly Perl-AppleEvents) gap. The module very cleverly loads the scripting dictionaries of applications, and creates a bit of glue code that allows interaction with AppleEvent objects directly from Perl, as described in this article. Thus there is no need to have an external script that is invoked from Perl; everything can be done from The One True Language. For example, the above mailbox-extracting case would be done as follows:

# $mailApp is assumed to be a Mac::Glue object for Mail, created elsewhere
my @mailboxes = ();

my @mailboxesAE = $mailApp->prop("mailboxes")->get;

my $i = 0;
for my $mailboxAE (@mailboxesAE)
{
    my $mailboxRef = {id => $i++,
                      name => $mailboxAE->prop("name")->get};
    push(@mailboxes, $mailboxRef);  
}

return @mailboxes;

However, it turns out that, despite the greater ease of use (and flexibility) that Mac::Glue allows, it's not going to be feasible to use it in my project, for two reasons. The first is that performance seems to be much worse: the above code runs at 1.97 seconds per call, an order of magnitude slower than even the string-based method above. More importantly, it also leaks memory, with the above example losing ~2 megabytes per call (there were 52 mailboxes in the list). Since I will be using this from within a long-lived daemon process, this would not be acceptable.

The developer's journal has the answer as to why this is the case. Mac OS X changed the way data can be extracted from AppleEvent descriptors (AEDescs). Rather than simply getting a handle to the descriptor's data portion, one must now allocate some memory and request that the data be copied there (with the original being out of reach). Mac::Glue, having started life as a Mac OS 9-era MacPerl module, does not fully take this into account yet. That is, it does the necessary copying, but it never disposes of the copied data once the Perl object goes out of scope or is otherwise garbage collected. Until this is fixed, this module is of limited use to me.
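To illustrate the kind of cleanup that is missing, here is a pure-Perl sketch of the pattern, not Mac::Glue's actual code. CopiedData is a hypothetical wrapper class, and a simple counter stands in for the allocated buffer: the copy is made in the constructor and released in DESTROY, which Perl calls when the last reference to the object goes away.

```perl
use strict;
use warnings;

package CopiedData;    # hypothetical wrapper, not part of Mac::Glue

our $live = 0;         # number of copies currently allocated

sub new {
    my ($class, $source) = @_;
    $live++;           # stands in for allocating a buffer
    # Store a copy of the data rather than a handle to the original.
    return bless { bytes => "$source" }, $class;
}

sub DESTROY {
    my ($self) = @_;
    $live--;           # stands in for disposing of the buffer
}

package main;

{
    my $copy = CopiedData->new("mailbox name");
    print "inside scope: $CopiedData::live live\n";
}
print "after scope: $CopiedData::live live\n";
```

Without the DESTROY method, the counter would only ever grow, which is the same shape as the leak described above.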
