I extended the paraphrasing capabilities for SHRD2, and we now get paraphrases for about 45% of the 180-ish examples in the current corpus. Beth Ann and I did a little experiment for the paper we presented at the SETQA-NLP workshop in Colorado last week. We each tried judging the corpus examples for correctness, using both the paraphrases and the underlying representations. To minimize learning effects, we permuted the order in which we did the two kinds of judging.
Even though the "logic English" paraphrases seem very similar to the scoped logical representations, it turns out that judging paraphrases is a lot faster. Even for me, knowing all the representations from having worked on them, judging paraphrases took 22 minutes, against 29 minutes for judging structures. For Beth Ann, who didn't know the data structures previously, paraphrases were more than twice as quick. Part of the payoff comes from the fact that the paraphrase grammar acts as a filter: most ill-formed structures produce no paraphrase, and hence don't need to be judged at all when paraphrases are used.
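The filtering effect can be sketched roughly as follows. This is only an illustration, not the SHRD2 code: `items_to_judge`, `toy_paraphrase`, and the bracket-balance rule are hypothetical stand-ins for the real paraphrase grammar, which is not shown here.

```python
from typing import Callable, List, Optional, Tuple


def items_to_judge(
    structures: List[str],
    paraphrase: Callable[[str], Optional[str]],
) -> List[Tuple[str, str]]:
    """Keep only the structures for which a paraphrase was produced."""
    pairs = []
    for structure in structures:
        text = paraphrase(structure)
        if text is not None:
            # A paraphrase exists, so a human still has to judge this one.
            pairs.append((structure, text))
    return pairs


def toy_paraphrase(structure: str) -> Optional[str]:
    """Hypothetical stand-in for the paraphrase grammar."""
    # Assumption for the sketch: unbalanced brackets mean "ill-formed".
    if structure.count("(") != structure.count(")"):
        return None
    return "it is the case that " + structure


corpus = ["exists(x, dog(x))", "forall(x, cat(x)", "bark(fido)"]
pairs = items_to_judge(corpus, toy_paraphrase)
```

In this toy corpus the second structure gets no paraphrase, so only two of the three items reach the judge. When judging structures directly, all three would have to be looked at.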
One person in the workshop audience said he was pleased to see a paper about software engineering which actually contained an experiment! Beth Ann was clearly right to insist that we do this, and work out the methodology.