<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3502837936335259853</id><updated>2011-08-03T02:13:06.534-07:00</updated><category term='Sicstus'/><category term='Bridge'/><category term='Evaluation'/><category term='documentation'/><category term='interlingua'/><category term='grammar specialisation'/><category term='English'/><category term='Cookbook'/><category term='CALL-SLT'/><category term='Calendar'/><category term='Catalan'/><category term='efficiency'/><category term='dynamic lexicon'/><category term='AFF'/><category term='bidirectional'/><category term='Swedish'/><category term='parsing'/><category term='MedSLT'/><category term='help'/><category term='GUI'/><category term='French'/><category term='grammar'/><category term='Romance'/><category term='dialogue'/><category term='Nuance-9'/><category term='DORIS'/><category term='top-level'/><category term='2.9.0'/><category term='SemER'/><category term='Regserver'/><category term='SHRD2'/><category term='release'/><category term='treebanking'/><category term='Scandinavian'/><category term='Japanese'/><category term='paraphrasing'/><category term='dialogue-server'/><category term='EBL'/><category term='n-best'/><title type='text'>Regulus News</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/'/><link rel='hub' 
href='http://pubsubhubbub.appspot.com/'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>78</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8982863378814462553</id><published>2010-08-04T05:01:00.000-07:00</published><updated>2010-08-04T05:07:57.044-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EBL'/><title type='text'>EBL mode</title><content type='html'>Yet another thing I should have done years ago. As everyone knows who's tried to do it, writing operationality criteria is rather messier than it should be. One of the worst things is that you don't immediately get feedback on the effects of the changes you've made. You need to rebuild the whole specialised grammar, which typically takes a while, and then you have to go through it to find out what the effect of your changes was.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I've just added an enhancement that should substantially simplify the development process. The new command EBL_MODE puts the top-level in a mode where each input sentence is subjected to EBL analysis, using the current set of operationality criteria; the operationality criteria file is reloaded each time, in case it has changed. The system prints a list of derived rules, with the substring used to derive each rule. 
Here's an example from CALL-SLT/French:&lt;/div&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; EBL_MODE&lt;br /&gt;(Do EBL processing on input sentences)&lt;br /&gt;&lt;br /&gt;--- Performed command EBL_MODE, time = 0.00 seconds&lt;br /&gt;&lt;br /&gt;&gt;&gt; puis je avoir un sandwich&lt;br /&gt;&lt;br /&gt;Taking operationality criteria from d:/cygwin/home/speech/call-slt/fre/prolog/operationality_recognition.pl&lt;br /&gt;&lt;br /&gt;--- Written compiled operationality file (37 items) d:/cygwin/home/speech/call-slt/fre/generatedfiles/fre_recognition_tmp_ebl_operational.pl&lt;br /&gt;% compiling d:/cygwin/home/speech/call-slt/fre/generatedfiles/fre_recognition_tmp_ebl_operational.pl...&lt;br /&gt;%  module tmp_ebl_operational imported into user&lt;br /&gt;%  module lists imported into tmp_ebl_operational&lt;br /&gt;%  module utilities imported into tmp_ebl_operational&lt;br /&gt;%  module ebl_operational imported into tmp_ebl_operational&lt;br /&gt;% compiled d:/cygwin/home/speech/call-slt/fre/generatedfiles/fre_recognition_tmp_ebl_operational.pl in module tmp_ebl_operational, 0 msec 6928 bytes&lt;br /&gt;&lt;br /&gt;Rule of form ".MAIN--&gt;utterance"&lt;br /&gt;derived from [puis,je,avoir,un,sandwich]&lt;br /&gt;&lt;br /&gt;Rule of form "utterance--&gt;med_utterance"&lt;br /&gt;derived from [puis,je,avoir,un,sandwich]&lt;br /&gt;&lt;br /&gt;Rule of form "med_utterance--&gt;vp"&lt;br /&gt;derived from [puis,je,avoir,un,sandwich]&lt;br /&gt;&lt;br /&gt;Rule of form "vp--&gt;vbar,vbar,np,optional_pp"&lt;br /&gt;derived from [puis,je,avoir,un,sandwich]&lt;br /&gt;&lt;br /&gt;Rule of form "vbar--&gt;verb,hyphen,pronoun,optional_adverb"&lt;br /&gt;derived from [puis,je]&lt;br /&gt;&lt;br /&gt;Rule of form "verb--&gt;puis"&lt;br /&gt;derived from [puis]&lt;br /&gt;&lt;br /&gt;Rule of form "hyphen--&gt;[]"&lt;br /&gt;derived from []&lt;br /&gt;&lt;br /&gt;Rule of form "pronoun--&gt;je"&lt;br /&gt;derived from [je]&lt;br /&gt;&lt;br /&gt;Rule of form 
"optional_adverb--&gt;[]"&lt;br /&gt;derived from []&lt;br /&gt;&lt;br /&gt;Rule of form "vbar--&gt;verb,optional_adverb"&lt;br /&gt;derived from [avoir]&lt;br /&gt;&lt;br /&gt;Rule of form "verb--&gt;avoir"&lt;br /&gt;derived from [avoir]&lt;br /&gt;&lt;br /&gt;Rule of form "optional_adverb--&gt;[]"&lt;br /&gt;derived from []&lt;br /&gt;&lt;br /&gt;Rule of form "np--&gt;spec,n"&lt;br /&gt;derived from [un,sandwich]&lt;br /&gt;&lt;br /&gt;Rule of form "spec--&gt;un"&lt;br /&gt;derived from [un]&lt;br /&gt;&lt;br /&gt;Rule of form "n--&gt;sandwich"&lt;br /&gt;derived from [sandwich]&lt;br /&gt;&lt;br /&gt;Rule of form "optional_pp--&gt;[]"&lt;br /&gt;derived from []&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8982863378814462553?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8982863378814462553/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8982863378814462553' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8982863378814462553'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8982863378814462553'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/08/ebl-mode.html' title='EBL mode'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4282385277256947059</id><published>2010-04-13T14:27:00.000-07:00</published><updated>2010-04-13T14:34:15.959-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Nuance-9'/><title type='text'>Regulus with Nuance 9, continued</title><content type='html'>Success, of sorts! I have a first version of the Nuance 9/server integration checked in, and Matthew tells me it runs correctly on his machine. (I don't yet have a full Nuance 9 installation here.) Everything appears to do what it's supposed to: the original grammar is compiled into a Nuance 9 GrXML grammar, with the semantics transformed into string-concatenation semantics that put together a string representation of the semantic form. This is passed from the MRCP process to the dialogue server, which unpacks the strings, reconstructs the real semantic forms, and then passes them to downstream processing.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We were expecting that we'd save a non-trivial amount of time by skipping the parsing phase in the server... but Matthew tells me that, as far as he can tell, it's slightly &lt;i&gt;slower&lt;/i&gt; than when we were running the recogniser without semantics, and creating the semantic forms on the server side! I don't understand this at all. 
Will run some offline tests tomorrow and see if I can spot any obvious time-sink.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4282385277256947059?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4282385277256947059/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4282385277256947059' title='26 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4282385277256947059'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4282385277256947059'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/04/regulus-with-nuance-9-continued.html' title='Regulus with Nuance 9, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>26</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5278560374779147491</id><published>2010-04-12T12:48:00.001-07:00</published><updated>2010-04-12T14:02:05.157-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Nuance-9'/><title type='text'>Regulus with Nuance 9</title><content type='html'>The last couple of weeks, I've been working with Matthew Fuchs (paideia.com) on getting Regulus to work with Nuance 9. 
We've had a bunch of problems, but we're making good progress and are nearly at the point of having things up and running.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When we started, we knew about two things right off. Nuance 9 doesn't support recursive grammars, and it doesn't allow GSL either; instead, it requires GrXML, where the semantics are done using ECMAScript (JavaScript). Our basic strategy was to carry on generating Nuance 8.5 GSL, and rely on the Nuance conversion tool, which is supposed to be able to convert GSL into GrXML. We also had to make sure that the grammars were non-recursive.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I added code to check for recursivity, and it turns out to be easy enough, at least in the cases we've looked at so far, to fix the operationality criteria in grammar specialisation so that the generated grammars are non-recursive. It wasn't so easy, though, to use the 8.5 to 9 conversion tool, since it turned out that it didn't handle the 'concat' operator, completely central to Regulus semantics.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We wondered for a while if we'd either have to give up on using semantics in Nuance 9, and parse everything in Regulus, or else have Regulus directly generate Nuance 9 grammars - possible, but non-trivial. But I thought of a cute work-around over the weekend, which seems to solve the problem for now. Instead of generating the actual semantics, we generate &lt;i&gt;strings &lt;/i&gt;which encode the semantics, and put them together with 'strcat' rather than 'concat' - the strcat operator &lt;i&gt;is &lt;/i&gt;handled by the conversion tool. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another reason why we were reluctant to generate Nuance 9 semantics directly is that, as far as we can make out, Nuance 9 doesn't have a tool for doing PCFG tuning, which is essential to good performance. 
But, with the current scheme, we can generate Nuance 8.5 grammars, do the PCFG tuning in 8.5, and then translate into 9. The conversion tool correctly carries across the generated probabilistic weights.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's a bit of a Frankenstein's monster, but it does all seem to work! Matthew just told me that he was able to run the Nuance 9 grammar successfully in MRCP. Now we just need to integrate everything with the dialogue server, and we'll have the first version fully running. More soon, I hope...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5278560374779147491?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5278560374779147491/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5278560374779147491' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5278560374779147491'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5278560374779147491'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/04/regulus-with-nuance-9.html' title='Regulus with Nuance 9'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2456508982827491708</id><published>2010-02-07T07:25:00.000-08:00</published><updated>2010-02-07T07:32:54.884-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='Japanese'/><title type='text'>Screenshots from Japanese CALL-SLT</title><content type='html'>&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "&gt;&lt;img src="http://4.bp.blogspot.com/_1zop44wiXgM/S27buy6AtTI/AAAAAAAAAAM/f1BfIqwhLWQ/s400/Jap1.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5435523397268321586" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 232px; " /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 0); "&gt;The system shows a French prompt, "DEMANDER DE_MANIERE_POLIE EAU"&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_1zop44wiXgM/S27cICVrRzI/AAAAAAAAAAc/64gpI0wF7J4/s1600-h/Jap2.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 232px;" src="http://4.bp.blogspot.com/_1zop44wiXgM/S27cICVrRzI/AAAAAAAAAAc/64gpI0wF7J4/s400/Jap2.png" border="0" alt="" 
id="BLOGGER_PHOTO_ID_5435523830907619122" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;The student speaks a corresponding sentence in Japanese, here "omizu onegai shimasu". The system does speech recognition and understanding, and shows the recognised sentence to the student in both Japanese and Roman orthography.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_1zop44wiXgM/S27cCJkUIVI/AAAAAAAAAAU/OCpTvzq4dd8/s1600-h/Jap3.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 232px;" src="http://1.bp.blogspot.com/_1zop44wiXgM/S27cCJkUIVI/AAAAAAAAAAU/OCpTvzq4dd8/s400/Jap3.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5435523729768849746" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If the student wishes, they can click the help button to get an example of a native Japanese speaker saying something that will work.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_1zop44wiXgM/S27buy6AtTI/AAAAAAAAAAM/f1BfIqwhLWQ/s1600-h/Jap1.png"&gt;&lt;br /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2456508982827491708?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2456508982827491708/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2456508982827491708' title='3 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2456508982827491708'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2456508982827491708'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/02/screenshots-from-japanese-call-slt.html' title='Screenshots from Japanese CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_1zop44wiXgM/S27buy6AtTI/AAAAAAAAAAM/f1BfIqwhLWQ/s72-c/Jap1.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2983120464398336697</id><published>2010-01-13T07:47:00.000-08:00</published><updated>2010-01-13T07:56:03.953-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>A Scandinavian/English grammar, continued</title><content type='html'>More progress on the Scandinavian/English grammar:&lt;div&gt;&lt;ul&gt;&lt;li&gt;I've now got quite reasonable initial rules for negation and lexically passivized verbs, two of the largest holes in the Swedish.  Negation, in Swedish, is just an adverb. I did however have to add a feature which distinguished main clauses from subordinate clauses, since the negation adverb occurs after the verb in a main clause, and before it in a subordinate clause. 
It turned out to be easy to adapt the existing rules for passives to handle lexical passives as well: the grammar now allows passive versions of the present, imperfect, supine and infinitive forms. Thus for example &lt;i&gt;den kan inte köpas här&lt;/i&gt;, "it can not buy-INF-PASSIVE here" = "it can not be bought here".&lt;/li&gt;&lt;li&gt;I've improved the coverage in the Swedish version of MedSLT. I can now translate 92% of the combined interlingua corpus into Swedish, and translate back 99% of the results. This mostly involved adding new lexical items and transfer rules, though I also had to make a couple of minor adjustments to the grammar.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2983120464398336697?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2983120464398336697/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2983120464398336697' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2983120464398336697'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2983120464398336697'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/01/scandinavianenglish-grammar-continued.html' title='A Scandinavian/English grammar, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6792989539257239657</id><published>2010-01-08T04:10:00.000-08:00</published><updated>2010-01-08T04:14:05.627-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><category scheme='http://www.blogger.com/atom/ns#' term='English'/><title type='text'>Scandinavian/English grammar now used for English MedSLT</title><content type='html'>I've now changed the config files for English MedSLT so that they use the shared Scandinavian/English grammar rather than the old English-only grammar. I found a couple of small bugs, but now everything seems to be working fine again.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some time soon, Nikos and I should get together and figure out how to add Swedish to the MedSLT demo and the nightly build. It shouldn't be at all hard.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6792989539257239657?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6792989539257239657/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6792989539257239657' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6792989539257239657'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6792989539257239657'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/01/scandinavianenglish-grammar-now-used.html' 
title='Scandinavian/English grammar now used for English MedSLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2395695708797014615</id><published>2010-01-02T20:14:00.000-08:00</published><updated>2010-01-02T20:24:32.451-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Swedish MedSLT, continued</title><content type='html'>I temporarily broke off working on Interlingua to Swedish, and spent a day concentrating on the opposite direction. I built the recognition grammar by training on a corpus which was the union of the original recognition corpus and the generation corpus (this ensures that everything you can generate will also get recognized); then I did PCFG tuning using the set of translations produced from the combined Interlingua corpus. I also used the set of translations as the initial Swedish corpus for translation testing. All the corpora concerned are created on-the-fly as part of the make process, so the correspondences will stay up to date. &lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It was easy to get things working in the Swe -&gt; Int direction, and 98% of the translation corpus now produces well-formed interlingua. I compiled a Swedish recognizer, and hooked everything together to get a speech-to-speech system for Swedish -&gt; English. 
Anecdotally, it's not bad.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The most urgent thing now is probably to add more Swedish coverage. There are several very common constructions that currently aren't in the specialized Swedish grammar.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2395695708797014615?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2395695708797014615/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2395695708797014615' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2395695708797014615'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2395695708797014615'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2010/01/swedish-medslt-continued.html' title='Swedish MedSLT, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6761494000417045588</id><published>2009-12-30T10:06:00.000-08:00</published><updated>2009-12-30T10:12:43.557-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category 
scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Swedish MedSLT, continued</title><content type='html'>A few more days of messing around, and I'm now translating about 80% of the 1200-item combined interlingua corpus into Swedish. Elisabeth (a native speaker) looked at about half of the material, and made some suggestions for improving quality. Now that I have implemented them, she thinks that over 90% of the translations are clearly good. To do this work, I have had to make a few more improvements in the combined Scandinavian/English grammar: the most important of these is an initial treatment of lexically reflexive verbs.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'll see if I can improve the numbers a bit more, and will then start on the Swedish-to-Interlingua direction. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6761494000417045588?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6761494000417045588/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6761494000417045588' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6761494000417045588'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6761494000417045588'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/swedish-medslt-continued_30.html' title='Swedish MedSLT, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5230982616952871928</id><published>2009-12-25T15:02:00.001-08:00</published><updated>2009-12-26T00:34:53.467-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Swedish MedSLT, continued</title><content type='html'>Encouraging progress on Swedish MedSLT: I can now translate a third of the interlingua corpus into Swedish. The translations aren't very good yet, but they are nearly all grammatical. I think it will be fairly easy to improve things.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5230982616952871928?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5230982616952871928/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5230982616952871928' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5230982616952871928'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5230982616952871928'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/swedish-medslt-continued.html' title='Swedish MedSLT, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7094992690918237924</id><published>2009-12-22T06:47:00.000-08:00</published><updated>2009-12-22T06:54:39.452-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Swedish MedSLT</title><content type='html'>I have just started on a Swedish version of MedSLT; this will give the new Scandinavian/English grammar a much more thorough workout. After an hour or two of messing around, I can parse one sentence, &lt;i&gt;var har du ont &lt;/i&gt;("where is your pain", literally "where have you pain"). &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The first thing I notice is that we need rules for impersonal and lexically reflexive verbs, which are very important in this domain. 
In particular, an expression that occurs all the time is &lt;i&gt;det gör ont&lt;/i&gt;, "it hurts", literally "it makes pain".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;More soon.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7094992690918237924?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7094992690918237924/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7094992690918237924' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7094992690918237924'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7094992690918237924'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/swedish-medslt.html' title='Swedish MedSLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8284232384729061127</id><published>2009-12-19T06:46:00.000-08:00</published><updated>2009-12-19T06:48:49.553-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><title type='text'>A Scandinavian/English grammar, continued</title><content type='html'>I've now changed the config files for English CALL-SLT so that they use the new 
Scandinavian grammar instead of the English-only one that it's based on. This will give us more of a chance to test how it works.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Next, I will make similar changes in the English part of MedSLT.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8284232384729061127?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8284232384729061127/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8284232384729061127' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8284232384729061127'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8284232384729061127'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/scandinavianenglish-grammar-continued_19.html' title='A Scandinavian/English grammar, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-1637999350640505622</id><published>2009-12-18T01:02:00.000-08:00</published><updated>2009-12-18T01:13:00.053-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><title type='text'>A Scandinavian/English grammar, continued</title><content type='html'>The Scandinavian 
grammar is working quite well, but it's certainly not complete yet. Here are some important things that still need to be added, all of which occur in Scandinavian but not in English:&lt;div&gt;&lt;ul&gt;&lt;li&gt;Negation. This is just an adverb in Scandinavian. The slightly non-trivial thing is that the position of this adverb (also some others) is different, depending on whether it's a main or a subordinate clause.&lt;/li&gt;&lt;li&gt;Lexically reflexive pronouns. As in Romance, some Scandinavian verbs subcategorize for lexically reflexive pronouns, which have no semantic value.&lt;/li&gt;&lt;li&gt;Lexical passive. Scandinavian verbs have a lexical passive form. I propose to do this in the morphotax.&lt;/li&gt;&lt;li&gt;Definiteness. I don't yet have all the definiteness constraints. In particular, a definite singular NP has an implicit definite article, but a premodifying adjective requires an explicit definite article.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;I don't think any of these things are particularly difficult to implement, or should require major changes to the grammar.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-1637999350640505622?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/1637999350640505622/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=1637999350640505622' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1637999350640505622'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1637999350640505622'/><link rel='alternate' type='text/html' 
href='http://regulusnews.blogspot.com/2009/12/scandinavianenglish-grammar-continued_18.html' title='A Scandinavian/English grammar, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7059299654349687304</id><published>2009-12-16T07:03:00.001-08:00</published><updated>2009-12-16T09:25:15.919-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><title type='text'>Swedish CALL-SLT</title><content type='html'>I must stop messing around with the Scandinavian grammar... it's really too much fun! Anyway, I now have a reasonable first cut at a Swedish version of CALL-SLT, working as usual in the restaurant domain. Elisabeth helped me add more material to the Swedish corpus last night; currently, it contains about 160 entries, of which about 90% work. I just tried out the live system, using Maria's GUI, and it runs fine. As I'd hoped, recognition picked up a good deal once I was able to switch on N-best rescoring. I'm getting performance in Swedish only slightly inferior to what I get in English, which seems reasonable given my relative abilities in the two languages.&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If people want to try it out, you need to update both Regulus and CALL-SLT using the -d flag, and do a make in CALLSLT/Swe/scripts. Then run in the usual way. 
So far there is no spoken help, but Elisabeth has promised to record files over Christmas.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7059299654349687304?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7059299654349687304/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7059299654349687304' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7059299654349687304'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7059299654349687304'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/swedish-call-slt.html' title='Swedish CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-1692176913886045166</id><published>2009-12-15T02:38:00.000-08:00</published><updated>2009-12-15T02:44:19.804-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><title type='text'>A Scandinavian/English grammar, continued</title><content type='html'>A bit more fiddling around, and I have two-thirds of the initial Swedish CALL-SLT corpus parsing. 
You can see it &lt;a href="http://callslt.cvs.sourceforge.net/viewvc/callslt/CALL-SLT/Swe/corpora/callslt_sents_combined_domains.pl"&gt;here&lt;/a&gt;. I've also compiled an initial recogniser. So far, it doesn't work very well, but if it's like the other languages it will improve considerably once I add N-best rescoring.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is all progressing rather nicely! I need to do some work now on processing the results of our recent CALL-SLT experiments, but once I've done that I'll return to the Swedish. Elisabeth says she will act as our native speaker.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-1692176913886045166?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/1692176913886045166/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=1692176913886045166' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1692176913886045166'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1692176913886045166'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/scandinavianenglish-grammar-continued.html' title='A Scandinavian/English grammar, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8413052108913633408</id><published>2009-12-14T01:07:00.001-08:00</published><updated>2009-12-14T01:56:59.887-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><category scheme='http://www.blogger.com/atom/ns#' term='Swedish'/><category scheme='http://www.blogger.com/atom/ns#' term='Scandinavian'/><category scheme='http://www.blogger.com/atom/ns#' term='English'/><title type='text'>A Scandinavian/English grammar</title><content type='html'>This weekend, I finally started on a task that I've been meaning to do for ages, and put together a first version of a shared grammar that is intended to cover both English and the Scandinavian languages Swedish, Norwegian and Danish. I've checked it into Regulus/Grammar/Scandinavian. Initially, I'm developing using the CALL-SLT restaurant domain, in English and Swedish. The Swedish version can already handle a reasonable range of language. Here's an example: &lt;i&gt;skulle jag kunna få en pizza &lt;/i&gt;= would I be-able-to get a pizza. 
This is a common way to ask for something in Swedish.&lt;pre&gt;&gt;&gt; skulle jag kunna få en pizza&lt;br /&gt;(Parsing with left-corner parser)&lt;br /&gt;&lt;br /&gt;Analysis time: 0.34 seconds&lt;br /&gt;&lt;br /&gt;3 possibilities:&lt;br /&gt;&lt;br /&gt;----------------------------------------------------------------&lt;br /&gt;Possibility 1&lt;br /&gt;Return value: [(null=[action,få]), (object=[food,pizza]), (null=[modal,kan]),&lt;br /&gt;         (null=[modal,skulle]), (agent=[pronoun,jag]), (null=[utterance_type,ynq]),&lt;br /&gt;         (null=[voice,active])]&lt;br /&gt;&lt;br /&gt;Global value: []&lt;br /&gt;&lt;br /&gt;Syn features: []&lt;br /&gt;&lt;br /&gt;Parse tree:&lt;br /&gt;&lt;br /&gt;.MAIN [GENERAL_SCA:541-546]&lt;br /&gt;top [GENERAL_SCA:552-558]&lt;br /&gt;/  utterance_intro null [GENERAL_SCA:566-568]&lt;br /&gt;|  utterance [GENERAL_SCA:615-620]&lt;br /&gt;|     s [GENERAL_SCA:713-718]&lt;br /&gt;|        s [GENERAL_SCA:817-826]&lt;br /&gt;|           vp [GENERAL_SCA:1124-1137]&lt;br /&gt;|           /  vbar [GENERAL_SCA:876-898]&lt;br /&gt;|           |  /  v lex(skulle) [GEN_SWE_LEX:51-51]&lt;br /&gt;|           |  |  np [GENERAL_SCA:1952-1960]&lt;br /&gt;|           |  \     pronoun lex(jag) [GEN_SWE_LEX:200-200]&lt;br /&gt;|           |  vp [GENERAL_SCA:1124-1137]&lt;br /&gt;|           |  /  vbar [GENERAL_SCA:853-875]&lt;br /&gt;|           |  |     v lex(kunna) [GEN_SWE_LEX:61-62]&lt;br /&gt;|           |  |  vp [GENERAL_SCA:1317-1337]&lt;br /&gt;|           |  |  /  vp [GENERAL_SCA:1042-1051]&lt;br /&gt;|           |  |  |  /  vbar [GENERAL_SCA:853-875]&lt;br /&gt;|           |  |  |  |     v lex(få) [CALLSLT_LEX:33-33]&lt;br /&gt;|           |  |  |  |  np [GENERAL_SCA:2073-2091]&lt;br /&gt;|           |  |  |  |  /  np [GENERAL_SCA:1907-1917]&lt;br /&gt;|           |  |  |  |  |  /  d lex(en) [GEN_SWE_LEX:277-278]&lt;br /&gt;|           |  |  |  |  |  |  nbar [GENERAL_SCA:2118-2130]&lt;br /&gt;|           |  |  |  |  |  \     n 
lex(pizza) [CALLSLT_LEX:179-181]&lt;br /&gt;|           |  |  |  \  \  post_mods null [GENERAL_SCA:1451-1457]&lt;br /&gt;|           \  \  \  post_mods null [GENERAL_SCA:1451-1457]&lt;br /&gt;\  utterance_coda null [GENERAL_SCA:597-599]&lt;br /&gt;&lt;br /&gt;------------------------------- FILES -------------------------------&lt;br /&gt;&lt;br /&gt;CALLSLT_LEX: d:/cygwin/home/speech/call-slt/swe/regulus/callslt_lex.regulus&lt;br /&gt;GENERAL_SCA: d:/cygwin/home/speech/regulus/grammar/scandinavian/general_sca.regulus&lt;br /&gt;GEN_SWE_LEX: d:/cygwin/home/speech/regulus/grammar/scandinavian/swedish/gen_swe_lex.regulus&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Here are some of the issues I've encountered so far:&lt;br /&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Agreement is different in English and Swedish, so the range of possible values for the 'agr' feature has to be language-dependent. In English, agreement is by person and number. In Swedish, it's primarily by number and gender ("common" or "neuter"). However, you also need person, since reflexive pronouns agree with the subject in person.&lt;/li&gt;&lt;li&gt;All verbs invert in Swedish, and there is no auxiliary "do". &lt;/li&gt;&lt;li&gt;The range of verb inflections is different in the two languages. English verbs have five forms: base, third person singular present, imperfect, past participle, present participle. So, for example, go, goes, went, gone, going. Modern Swedish doesn't inflect by person or number in the present tense, but on the other hand distinguishes the imperative from the infinitive, distinguishes the "supine" (the form used for the perfect tense) from the past participle, and inflects the past participle by number and gender. 
So, for example, &lt;i&gt;bryta &lt;/i&gt;(break) has the forms &lt;i&gt;bryt &lt;/i&gt;(imperative), &lt;i&gt;bryta &lt;/i&gt;(infinitive), &lt;i&gt;bryter &lt;/i&gt;(present), &lt;i&gt;bröt&lt;/i&gt; (imperfect), &lt;i&gt;brutit &lt;/i&gt;(supine), &lt;i&gt;brytande &lt;/i&gt;(present participle), &lt;i&gt;bruten &lt;/i&gt;(past participle singular common), &lt;i&gt;brutet &lt;/i&gt;(past participle singular neuter), &lt;i&gt;brutna&lt;/i&gt; (past participle plural).&lt;/li&gt;&lt;li&gt;Swedish nouns inflect for definiteness. So, for example, &lt;i&gt;bord &lt;/i&gt;is "table", but &lt;i&gt;bordet &lt;/i&gt;is "the table". Adjectives also inflect for definiteness, thus &lt;i&gt;ett stort bord&lt;/i&gt; ("a big table") but &lt;i&gt;det stora bordet&lt;/i&gt; ("the big table").&lt;/li&gt;&lt;li&gt;Swedish possessives inflect for gender and number. So &lt;i&gt;min bil&lt;/i&gt; ("my car", common/singular), &lt;i&gt;mitt hus&lt;/i&gt; ("my house", neuter/singular), &lt;i&gt;mina barn&lt;/i&gt; ("my children", plural).&lt;/li&gt;&lt;li&gt;Swedish partitives are slightly different. In English, "a bottle &lt;b&gt;of &lt;/b&gt;beer"; in Swedish, &lt;i&gt;en flaska öl &lt;/i&gt;(nothing corresponding to the "of").&lt;/li&gt;&lt;li&gt;Swedish date and time grammar is slightly different. In English, "December fourteenth"; in Swedish, &lt;i&gt;fjortonde december&lt;/i&gt;. In English, "nine thirty"; in Swedish, &lt;i&gt;nio och trettio&lt;/i&gt;.&lt;/li&gt;&lt;li&gt;Swedish negation is basically an adverb, e.g. &lt;i&gt;jag beställde &lt;b&gt;inte &lt;/b&gt;någon pizza&lt;/i&gt; = I ordered &lt;b&gt;not&lt;/b&gt; any pizza. The negation adverb comes after the verb in main clauses, and before it in subordinate clauses.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Most of this stuff seems easy, and I can adapt the treatments we implemented in the SLT grammar during the 90s. I am guessing that 85-90% of the rules in the final shared grammar will be common to English and Swedish. 
Judging from our experiences with SLT, Danish and Swedish overlap to 95% or better, and Norwegian should be similar.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8413052108913633408?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8413052108913633408/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8413052108913633408' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8413052108913633408'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8413052108913633408'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/12/scandinavianenglish-grammar.html' title='A Scandinavian/English grammar'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5307172056066064036</id><published>2009-09-23T13:34:00.000-07:00</published><updated>2009-09-23T13:50:49.709-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Calendar'/><category scheme='http://www.blogger.com/atom/ns#' term='Japanese'/><title type='text'>New treatment of Japanese verbs</title><content type='html'>I've been discussing Japanese verbs with Yukie - we need new inflectional forms for CALL-SLT, in particular the volitional (-tai) form, and the old system was getting out of hand. 
Japanese has extraordinarily regular morphology, with only two irregular verbs and some very straightforward sound-changes, so it seemed to me that we really ought to be able to get by without explicitly listing all the inflections of every verb we needed.&lt;br /&gt;&lt;br /&gt;Yukie wrote down a table of inflections, and we discussed ways of splitting up inflected verbs into stems and affixes. Based on our discussion, I've implemented a first version of a new treatment, where you now only need to specify a single root form of the verb, and everything else is done by morphotax rules, where the affixes are treated by Nuance as separate words. I've tested by converting the Japanese Calendar lexicon to the new form, and compiling into a recognizer. Coverage is what it was, and recognition is anecdotally fine with my voice. I will dig out some Japanese Calendar data soon and run proper tests.&lt;br /&gt;&lt;br /&gt;If anyone wants to look at the details, the morphotax rules are in $REGULUS/Grammar/Japanese/japanese_verb_morphology.regulus. 
The new version of the Japanese Calendar lexicon is at $REGULUS/Examples/Calendar/Regulus/japanese_calendar_lex_new.regulus.&lt;br /&gt;&lt;br /&gt;Here's an example of a parse:&lt;br /&gt;&lt;pre&gt;&lt;span class="quote"&gt;&lt;span style="font-family: monospace;"&gt;&lt;br /&gt;&lt;/span&gt;$ nanji ni owa ri mashita ka&lt;/span&gt;&lt;br /&gt;&lt;pre class="bz_comment_text" id="comment_text_17"&gt;(Parsing with left-corner parser)&lt;br /&gt;&lt;br /&gt;Analysis time: 0.09 seconds&lt;br /&gt;&lt;br /&gt;Return value: [[question,form(past,[[owaru],[ni,term(null,nanji,[])]])]]&lt;br /&gt;&lt;br /&gt;Global value: []&lt;br /&gt;&lt;br /&gt;Syn features: []&lt;br /&gt;&lt;br /&gt;Parse tree:&lt;br /&gt;&lt;br /&gt;.MAIN [JAPANESE_CORE_RULES:112-116]&lt;br /&gt;  top [JAPANESE_CORE_RULES:117-119]&lt;br /&gt;     utterance [JAPANESE_CORE_RULES:120-123]&lt;br /&gt;     /  main_clause [JAPANESE_CORE_RULES:147-151]&lt;br /&gt;     |     s [JAPANESE_CORE_RULES:155-162]&lt;br /&gt;     |     /  comps [JAPANESE_CORE_RULES:190-195]&lt;br /&gt;     |     |  /  pp [JAPANESE_CORE_RULES:414-423]&lt;br /&gt;     |     |  |  /  np [JAPANESE_CORE_RULES:267-273]&lt;br /&gt;     |     |  |  |     n lex(nanji) [JAPANESE_CALENDAR_LEX_NEW:84-84]&lt;br /&gt;     |     |  |  \  p lex(ni) [JAPANESE_CALENDAR_LEX_NEW:274-284]&lt;br /&gt;     |     |  \  comps null [JAPANESE_CORE_RULES:163-166]&lt;br /&gt;     |     |  vbar [JAPANESE_CORE_RULES:249-253]&lt;br /&gt;     |     |     v [JAPANESE_VERB_MORPHOLOGY:13-26]&lt;br /&gt;     |     |     /  v_stem [JAPANESE_VERB_MORPHOLOGY:27-38]&lt;br /&gt;     |     |     |  /  v_stem lex(owa) [JAPANESE_CALENDAR_LEX_NEW:227-237]&lt;br /&gt;     |     |     |  \  stem_affix lex(ri) [JAPANESE_VERB_MORPHOLOGY:138-138]&lt;br /&gt;     |     \     \  affix lex(mashita) [JAPANESE_VERB_MORPHOLOGY:80-83]&lt;br /&gt;     \  lex(ka)&lt;br /&gt;&lt;br /&gt;------------------------------- FILES -------------------------------&lt;br /&gt;&lt;br 
/&gt;JAPANESE_CALENDAR_LEX_NEW:&lt;br /&gt;d:/cygwin/home/speech/regulus/examples/calendar/regulus/japanese_calendar_lex_new.regulus&lt;br /&gt;JAPANESE_CORE_RULES:     &lt;br /&gt;d:/cygwin/home/speech/regulus/grammar/japanese/japanese_core_rules.regulus&lt;br /&gt;JAPANESE_VERB_MORPHOLOGY:&lt;br /&gt;d:/cygwin/home/speech/regulus/grammar/japanese/japanese_verb_morphology.regulus&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5307172056066064036?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5307172056066064036/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5307172056066064036' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5307172056066064036'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5307172056066064036'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/new-treatment-of-japanese-verbs.html' title='New treatment of Japanese verbs'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-659139882570001597</id><published>2009-09-17T00:06:00.000-07:00</published><updated>2009-09-17T02:40:48.087-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' 
term='dialogue-server'/><title type='text'>"Abstract actions" and the dialogue server</title><content type='html'>I've had some productive discussions with Maria over the last few days, which have resulted in a couple of significant improvements to the dialogue server. Maria is going to build a Java GUI for the CALL-SLT system. She needs to be able to send requests to the dialogue server, and get back information that she will pass on to the user. Most often this will be in the form of screen-based output. The new functionality is motivated by this scenario, but is quite generic.&lt;br /&gt;&lt;br /&gt;The first point Maria made was that she would prefer to use XML-formatted messages. Java finds it easy to manipulate XML; parsing Prolog messages, on the other hand, is a complete pain. So I added switches that allow the client to put the server into a mode where all messages are XML strings inside a minimal Prolog wrapper.&lt;br /&gt;&lt;br /&gt;Yesterday, Maria made another very sensible request. In the first version of the application, the Prolog "output manager" module received abstract actions, and transformed them into concrete actions. Typically, concrete actions would involve printing strings. So, for example, suppose that the system has just given you the prompt&lt;br /&gt;&lt;br /&gt; POLITE REQUEST TABLE outside&lt;br /&gt;&lt;br /&gt;and you have correctly replied&lt;br /&gt;&lt;br /&gt; i would like a table outside please&lt;br /&gt;&lt;br /&gt;The abstract action produced is&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;display_matching_info('i would like a table outside please', correct, [2,1,2])&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;which means that the recognized words were 'i would like a table outside please'; they correctly matched the prompt; and the score is now 2 correct, 1 incorrect, with a positive streak of 2. 
This is converted by the output manager into the concrete action&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;print('I heard: "i would like a table outside please"&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;Correct!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;Score: 2 right, 1 wrong (66.7%) Streak: 2')&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This works fine for a text-based command-line interface; but, as Maria pointed out, the output processing isn't necessarily appropriate when you have a Java Swing GUI, in which case you probably would prefer to format it yourself. For instance, you might want to print the recognized words in one pane, render the "correct" as a green tick-mark in another one, and present the score graphically as three columns of different heights.&lt;br /&gt;&lt;br /&gt;In general, the abstract action is going to be more useful to you than the concrete one. So I've just added a little more functionality to the dialogue server to handle that too. Here's a summary of the new messages, and what they do; they are also documented in the file itself, $REGULUS/Prolog/dialogue_server.pl.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:courier new;"&gt;action(xml_messages).&lt;/span&gt; Format future messages in both directions in XML form. Each message will be of the form&lt;br /&gt;&lt;br /&gt;    &lt;span style="font-family:courier new;"&gt;xml_message(XMLString).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;where XMLString is an XML encoding of the corresponding Prolog message produced using the predicate &lt;span style="font-family:courier new;"&gt;prolog_xml/2&lt;/span&gt; in $REGULUS/PrologLib/prolog_xml.pl. 
The XML can be converted back into Prolog if necessary using the same predicate.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:courier new;"&gt;action(prolog_messages).&lt;/span&gt; Format future messages in both directions in Prolog form (default).&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:courier new;"&gt;action(abstract_actions).&lt;/span&gt; Pass abstract actions to the client, so that the client can do its own output management.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:courier new;"&gt;action(concrete_actions).&lt;/span&gt; Pass concrete actions to the client (default).&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-659139882570001597?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/659139882570001597/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=659139882570001597' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/659139882570001597'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/659139882570001597'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/abstract-actions-and-dialogue-server.html' title='&quot;Abstract actions&quot; and the dialogue server'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4371198771600002033</id><published>2009-09-15T09:54:00.000-07:00</published><updated>2009-09-15T09:57:50.226-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sicstus'/><title type='text'>Warning: don't use SICStus 4.0.2</title><content type='html'>Maria and I have just spent two very frustrating days trying to figure out why CALL-SLT wasn't running correctly on her machine. In the end, it turned out that a few bits of Regulus functionality don't work correctly under SICStus 4.0.2, which is the version she was using... there appears to be something wrong with the SICStus/operating system interface.&lt;br /&gt;&lt;br /&gt;So avoid this release! 4.0.4 is fine.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4371198771600002033?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4371198771600002033/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4371198771600002033' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4371198771600002033'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4371198771600002033'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/warning-dont-use-sicstus-402.html' title='Warning: don&apos;t use SICStus 4.0.2'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5278882105303924134</id><published>2009-09-13T16:18:00.000-07:00</published><updated>2009-09-13T16:30:03.260-07:00</updated><title type='text'>Dialogue server now accepts XML format messages</title><content type='html'>&lt;span style="font-family:georgia;"&gt;Another of those things I should no doubt have done years ago: after a discussion with Maria, I've now modified the dialogue server so that it can also run in a mode where all messages are XML-formatted. &lt;/span&gt;&lt;br /&gt;&lt;pre style="font-family: georgia;" class="bz_comment_text" id="comment_text_1"&gt;Details (this is documented in $REGULUS/Prolog/dialogue_server.pl):&lt;br /&gt;&lt;br /&gt;- Initially, the server is in Prolog mode.&lt;br /&gt;&lt;br /&gt;- To put the server into XML mode, send the message&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;xml_messages.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;- Subsequent messages are of the form&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;xml_message(XMLMessage).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;where &lt;span style="font-family:courier new;"&gt;XMLMessage&lt;/span&gt; is an XML encoding of the corresponding Prolog message produced using the predicate &lt;span style="font-family:courier new;"&gt;prolog_xml/2&lt;/span&gt; in $REGULUS/PrologLib/prolog_xml.pl.&lt;br /&gt;The XML can be converted back into Prolog if necessary using the same predicate.&lt;br /&gt;&lt;br /&gt;I have tested by converting the CALL-SLT Prolog client to use XML-flavor messages, and it all works fine.&lt;br /&gt;&lt;br /&gt;The routines in $REGULUS/PrologLib/prolog_xml.pl should in general be useful for 
translating Prolog into XML form in a reversible way. Look at the file for documentation and an example.&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5278882105303924134?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5278882105303924134/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5278882105303924134' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5278882105303924134'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5278882105303924134'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/dialogue-server-now-accepts-xml-format.html' title='Dialogue server now accepts XML format messages'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8807853484754412748</id><published>2009-09-07T02:44:00.001-07:00</published><updated>2009-09-07T02:49:06.004-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='Japanese'/><title type='text'>CALL-SLT and Japanese</title><content type='html'>I have just added a little more coverage to the Japanese version... it now has about a dozen sentences for the student to practice on. I tried it, and so far it still recognizes everything I say. 
This is probably more because the vocabulary is so small than because I have a wonderful Japanese accent :)&lt;br /&gt;&lt;br /&gt;Yukie and I should talk about how to proceed here. The first step will be to add material to the Japanese corpus.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8807853484754412748?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8807853484754412748/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8807853484754412748' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8807853484754412748'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8807853484754412748'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/call-slt-and-japanese.html' title='CALL-SLT and Japanese'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4788374131543871786</id><published>2009-09-05T23:02:00.001-07:00</published><updated>2009-09-05T23:06:15.827-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='help'/><title type='text'>Using recorded wavfiles as help information in CALL-SLT (part 2)</title><content type='html'>I did a little more fiddling around with the translation game strategy code, and it's 
now possible to define a strategy where the system only chooses entries which don't have an associated wavfile. The idea is to make it easy for the teacher to add missing wavfiles.&lt;br /&gt;&lt;br /&gt;I tested it on English, and we now have a complete set of wavfiles for that language. As soon as we have a bit more coverage for French and Japanese, I'll add similar scripts for them too.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4788374131543871786?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4788374131543871786/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4788374131543871786' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4788374131543871786'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4788374131543871786'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/using-recorded-wavfiles-as-help.html' title='Using recorded wavfiles as help information in CALL-SLT (part 2)'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2969321121990049045</id><published>2009-09-04T14:00:00.000-07:00</published><updated>2009-09-04T14:12:44.066-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' 
term='help'/><title type='text'>Using recorded wavfiles as help information in CALL-SLT</title><content type='html'>I've got a new feature working on CALL-SLT, which allows speech input to be logged and reused as help for students. When you start the system, it asks whether or not you wish to be considered a native speaker. If you answer yes, it keeps the wavfile for each successful match, and stores it in such a way that the wavfile is associated with the current prompt. Subsequently, if a student is given the same prompt and hits the HELP button, the native speaker's wavfile is replayed. By construction, we know that the native speaker was correctly recognized, so if the student can just imitate them well enough they should be recognized too.&lt;br /&gt;&lt;br /&gt;The idea is simple, but there were some messy technical problems... a bad interaction between Nuance and SICStus concerning relative pathnames, and the question of what happens if two different users try to check in new wavfiles simultaneously. I think I have decent solutions, though. For more details, look at the &lt;a href="http://callslt.cvs.sourceforge.net/viewvc/*checkout*/callslt/CALL-SLT/doc/UsingCALL-SLT.htm?revision=1.2"&gt;online documentation&lt;/a&gt; which I have just added. 
This also tells you how to download and run the system.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2969321121990049045?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2969321121990049045/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2969321121990049045' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2969321121990049045'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2969321121990049045'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/09/progress-on-call-slt.html' title='Using recorded wavfiles as help information in CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5880626889033154776</id><published>2009-08-23T13:43:00.000-07:00</published><updated>2009-08-23T13:48:52.104-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='Japanese'/><title type='text'>CALL-SLT and Japanese</title><content type='html'>And we now have a skeleton Japanese system too. 
So far, it can only do one sentence,&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;   hitori no teeburu wa arimasu ka&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;which is rendered in the Interlingua as&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;    POLITE REQUEST TABLE 1 PERSON&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Though it is nice that this goes all the way through: I can get the Interlingua as a prompt, speak the sentence, and be informed that I got it right. Not surprisingly, since the grammar doesn't cover anything else, it's very reliable when you say the one thing it knows!&lt;br /&gt;&lt;br /&gt;Yukie and I need to get together and add more content. The first step will be for Yukie to flesh out the corpus, which currently only has a dozen or so examples.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5880626889033154776?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5880626889033154776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5880626889033154776' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5880626889033154776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5880626889033154776'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/call-slt-and-japanese.html' title='CALL-SLT and Japanese'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4731880774886967423</id><published>2009-08-22T22:40:00.000-07:00</published><updated>2009-08-22T22:45:59.617-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='French'/><title type='text'>CALL-SLT and French</title><content type='html'>We should now have a complete set of scripts, config files, etc. for the French version of CALL-SLT. I added a couple of placeholder files, with enough translation rules to do the sentence "Je voudrais deux bières". The make appears to work, and I was able to run the initial version of the translation game in the server. Over to Pierrette to add some actual content!&lt;br /&gt;&lt;br /&gt;Next, Japanese...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4731880774886967423?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4731880774886967423/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4731880774886967423' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4731880774886967423'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4731880774886967423'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/call-slt-and-french.html' title='CALL-SLT and French'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6667444166450922611</id><published>2009-08-22T00:25:00.000-07:00</published><updated>2009-08-22T00:33:22.354-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='dialogue-server'/><title type='text'>Progress on CALL-SLT</title><content type='html'>I have the generic CALL-SLT functionality packaged up so that it can be run inside the dialogue server - this involved extending the dialogue server a little bit, so that you can now call recognition from inside it. The interface between the client and the CALL-SLT-loaded dialogue server is consequently very simple. There are so far just three commands:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;"Next prompt". The server generates a new prompt, using its current strategy, and returns it.&lt;/li&gt;&lt;li&gt;"Recognize and match". The server performs recognition, translates it to Interlingua, matches against the current prompt, and returns a string explaining what happened.&lt;/li&gt;&lt;li&gt;"Help". The server returns the current prompt, plus a text example illustrating one possible way to realize the prompt.&lt;/li&gt;&lt;/ul&gt;I have been testing with a minimal Prolog-based client. 
It should be easy to write a Java client which offers a nice GUI-style interface.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6667444166450922611?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6667444166450922611/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6667444166450922611' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6667444166450922611'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6667444166450922611'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/progress-on-call-slt_22.html' title='Progress on CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8803351748146499900</id><published>2009-08-20T03:51:00.000-07:00</published><updated>2009-08-20T03:57:59.036-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><title type='text'>Progress on CALL-SLT</title><content type='html'>I have a first version of the translation game working for English! It's still very clunky indeed (I'm running it from the command-line in the development environment), but it's already kind of fun. Next step will be to package it up in a better way, using the dialogue server; this should be quite easy. 
Once I've done that, it'll be possible to add a Java GUI, so that we have a complete first version of the system.&lt;br /&gt;&lt;br /&gt;Of course, what I really want to do is try it in another language... I already know how to order in English restaurants! I'll start sorting out the config files and scripts for Japanese. My restaurant Japanese is shaky, and I'm very curious to see if I can use CALL-SLT to improve it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8803351748146499900?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8803351748146499900/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8803351748146499900' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8803351748146499900'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8803351748146499900'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/progress-on-call-slt_20.html' title='Progress on CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8328389684520776528</id><published>2009-08-19T09:50:00.000-07:00</published><updated>2009-08-19T09:57:56.842-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><title type='text'>Progress on CALL-SLT</title><content type='html'>I've spent 
some time messing around with the English to Interlingua translation rules and the Interlingua grammar. We're now able to translate 95% of the development corpus into sensible-looking interlingua - though the current surface form will probably be revised a bit at some point by Pierrette and Johanna.&lt;br /&gt;&lt;br /&gt;At any rate, you can now run the speech-input English system from the Regulus command-line, setting it so that spoken inputs are parsed and translated into Interlingua. I don't think it's that much extra work to add code so that we have a complete first version of the system. There will initially be two commands. You can either ask for a new Interlingua prompt, or ask to speak. If you speak, it will translate what you say into Interlingua, compare with the current prompt, and score you. With any luck, I'll have this working before next week.&lt;br /&gt;&lt;br /&gt;Here's the current English development corpus, with the Interlingua translations it produces:&lt;br /&gt;&lt;br /&gt; &lt;table str="" style="border-collapse: collapse; width: 555pt;" width="739" border="0" cellpadding="0" cellspacing="0"&gt;&lt;col style="width: 302pt;" width="402"&gt;  &lt;col style="width: 253pt;" width="337"&gt;  &lt;tbody&gt;&lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt; width: 302pt;" width="402" height="17"&gt;i will take a beer&lt;/td&gt;   &lt;td style="width: 253pt;" width="337"&gt;REQUEST beer&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;give me a beer&lt;/td&gt;   &lt;td&gt;REQUEST beer&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;hello&lt;/td&gt;   &lt;td&gt;NEUTRAL-GREETING&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;good evening&lt;/td&gt;   &lt;td&gt;EVENING-GREETING&lt;/td&gt;  &lt;/tr&gt;  
&lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a table for one&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 1 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i have a table for one&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 1 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table for one&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 1 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a table for two&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 2 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i have a table for three&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 3 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table for four&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 4 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table outside&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE outside&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;is there a table outside&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE outside&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i have a table outside&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE outside&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we have a table by the window&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 
TABLE by-loc window&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table near the window&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE in-loc window&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i 'd like a table in the smoking area&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i 'd like a table in the non-smoking   area&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE non-smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;smoking please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;non-smoking please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE non-smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i have a table in the corner&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE in-loc corner&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;can i have a table in the corner&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE in-loc corner&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" str="i would like a non-smoking table for one " height="17"&gt;i would like a non-smoking   table for one&lt;span style=""&gt; &lt;/span&gt;&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 1 PERSON non-smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a non-smoking table for one&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 
TABLE 1 PERSON non-smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;a non-smoking table for two please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 2 PERSON non-smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i have a non-smoking table for   three people&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 3 PERSON non-smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a table for three people in   the smoking area&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 3 PERSON smoking&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" str="i would like a table for four " height="17"&gt;i   would like a table for four&lt;span style=""&gt; &lt;/span&gt;&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 4 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table for four&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 4 PERSON&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like to reserve a table for   seven o'clock&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 19 00&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i reserve a table for seven thirty&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 19 30&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i 'd like to reserve a table for six   forty five&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 18 45&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i 
would like to reserve a table for two   for seven fifteen&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 2 PERSON 19 15&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i reserve a table for two people   for eight o'clock&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 2 PERSON 20 00&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table for two at seven   thirty&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 2 PERSON 19 30&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i reserve a table for seven   o'clock tomorrow evening&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 19 00 time-evening date-tomorrow&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like to reserve a table for six   forty five tomorrow please&lt;/td&gt;   &lt;td&gt;parsing_failed&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i reserve a table for two for   tomorrow evening&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 2 PERSON time-evening date-tomorrow&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i reserve a table for this evening&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE time-evening date-today&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;do you have a table for three tomorrow   evening around seven o'clock&lt;/td&gt;   &lt;td&gt;POLITE REQUEST TABLE 3 PERSON 19 00 time-evening date-tomorrow&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i have a reservation in the name of   
smith&lt;/td&gt;   &lt;td&gt;parsing_failed&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i should have a reservation in the name   of smith&lt;/td&gt;   &lt;td&gt;parsing_failed&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we see the menu&lt;/td&gt;   &lt;td&gt;POLITE REQUEST menu&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like to see the menu&lt;/td&gt;   &lt;td&gt;POLITE REQUEST menu&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we get the menu&lt;/td&gt;   &lt;td&gt;POLITE REQUEST menu&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we have the bill&lt;/td&gt;   &lt;td&gt;POLITE REQUEST check&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we have the bill please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST check&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;may i have the bill&lt;/td&gt;   &lt;td&gt;POLITE REQUEST check&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;may i have the check&lt;/td&gt;   &lt;td&gt;POLITE REQUEST check&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could you give us the check please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST check&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could you give me a receipt&lt;/td&gt;   &lt;td&gt;POLITE REQUEST receipt&lt;/td&gt;  
&lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a receipt please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST receipt&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i get a receipt please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST receipt&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a beer&lt;/td&gt;   &lt;td&gt;POLITE REQUEST beer&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we have two beers&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 beer&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could you give us two beers&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 beer&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could i have a latte&lt;/td&gt;   &lt;td&gt;POLITE REQUEST latte&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i 'd like a medium latte&lt;/td&gt;   &lt;td&gt;POLITE REQUEST medium latte&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;a cup of tea please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST cup tea&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;two cups of tea please&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 cup tea&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we get two glasses of water&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 glass water&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" 
height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i 'd like a glass of the house red&lt;/td&gt;   &lt;td&gt;POLITE REQUEST glass house-red-wine&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;could we have two glasses of the house   red&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 glass house-red-wine&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a large glass of white wine&lt;/td&gt;   &lt;td&gt;POLITE REQUEST large glass white-wine&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like two small glasses of white   wine&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 small glass white-wine&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a pizza&lt;/td&gt;   &lt;td&gt;POLITE REQUEST pizza&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like a hamburger&lt;/td&gt;   &lt;td&gt;POLITE REQUEST hamburger&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like the soup&lt;/td&gt;   &lt;td&gt;POLITE REQUEST soup&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td style="height: 12.75pt;" height="17"&gt;i would like two pizzas&lt;/td&gt;   &lt;td&gt;POLITE REQUEST 2 pizza&lt;/td&gt;  &lt;/tr&gt; &lt;/tbody&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8328389684520776528?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://regulusnews.blogspot.com/feeds/8328389684520776528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8328389684520776528' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8328389684520776528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8328389684520776528'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/progress-on-call-slt_19.html' title='Progress on CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2209958375122706630</id><published>2009-08-18T14:30:00.000-07:00</published><updated>2009-08-18T14:34:31.394-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><title type='text'>Progress on CALL-SLT</title><content type='html'>I've now got enough stuff working in CALL-SLT that it's possible to translate a few simple sentences from spoken English to Interlingua.  
Here's an example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; i would like two beers&lt;br /&gt;&lt;br /&gt;Source: i would like two beers&lt;br /&gt;Target: POLITE REQUEST two beer&lt;br /&gt;Other info:&lt;br /&gt;n_parses = 1&lt;br /&gt;parse_time = 0.047&lt;br /&gt;source_representation = [(agent=[pronoun,i]), (null=[action,like]), (null=[modal,would]),&lt;br /&gt;                         (null=[utterance_type,dcl]), (null=[voice,active]),&lt;br /&gt;                         (object=[drink,beer]), (object=[spec,2])]&lt;br /&gt;transfer_to_source_discourse_time = 0.0&lt;br /&gt;source_discourse = [(null=[utterance_type,dcl]), (agent=[pronoun,i]), (null=[voice,active]),&lt;br /&gt;                    (null=[modal,would]), (null=[action,like]), (object=[spec,2]),&lt;br /&gt;                    (object=[drink,beer])]&lt;br /&gt;resolved_source_discourse = [(null=[utterance_type,dcl]), (agent=[pronoun,i]),&lt;br /&gt;                             (null=[voice,active]), (null=[modal,would]), (null=[action,like]),&lt;br /&gt;                             (object=[spec,2]), (object=[drink,beer])]&lt;br /&gt;resolution_processing = trivial&lt;br /&gt;resolution_time = 0.0&lt;br /&gt;transfer_to_interlingua_time = 0.0&lt;br /&gt;interlingua = [(arg2=[drink,beer]), (arg2=[number,2]), (null=[politeness,polite]),&lt;br /&gt;               (null=[utterance_type,request])]&lt;br /&gt;interlingua_surface = POLITE REQUEST two beer&lt;br /&gt;interlingua_checking_time = 0.0&lt;br /&gt;&lt;br /&gt;--- Performed command i would like two beers, time = 0.05 seconds&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;We are not far from having everything we need to be able to build and run an initial version of the CALL-SLT server. Initially, it will prompt in the surface interlingua,
but I'll leave in a hook so that we can use the picture interlingua as soon as we have something available.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2209958375122706630?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2209958375122706630/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2209958375122706630' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2209958375122706630'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2209958375122706630'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/progress-on-call-slt.html' title='Progress on CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-1515817210999257526</id><published>2009-08-16T07:33:00.000-07:00</published><updated>2009-08-16T07:39:01.665-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><title type='text'>CALL-SLT corpus</title><content type='html'>For people who don't already know, we are collecting our initial corpus for CALL-SLT on a Google Docs document. 
If you don't already have access to this document and would like to contribute, please get a Google account and mail me your Google address.&lt;br /&gt;&lt;br /&gt;All contributions are very gratefully received. Don't worry that you might break something accidentally: Google Docs has excellent facilities for tracking revisions and, if necessary, reverting to earlier versions.&lt;br /&gt;&lt;br /&gt;Here's a sample extract showing what the corpus looks like:&lt;br /&gt;&lt;br /&gt;Fre: auriez-vous une table sur la terrasse (SVP) ?&lt;br /&gt;Fre: auriez-vous une table en terrasse ?&lt;br /&gt;Eng: do you have a table outside&lt;br /&gt;Eng: is there a table outside&lt;br /&gt;Eng: could i have a table outside&lt;br /&gt;Int: POLITE REQUEST LOCATION-TABLE LOCATION-OUTSIDE [note: is "en terrasse" different from "outside"?] &lt;p&gt;Fre: auriez-vous une table près de la fenêtre (SVP) ?&lt;br /&gt;Fre: auriez-vous une table à côté de la fenêtre (SVP) ?&lt;br /&gt;Jap: mado gawa no seki wa ari masu ka&lt;br /&gt;Eng: could we have a table by the window&lt;br /&gt;Eng: do you have a table near the window&lt;br /&gt;Int: POLITE REQUEST LOCATION-TABLE NEAR WINDOW&lt;/p&gt; &lt;p&gt;Fre: auriez-vous une table non-fumeur/fumeur (SVP)?&lt;br /&gt;Fre: auriez-vous une table dans la zone non-fumeur /fumeur (SVP)?&lt;br /&gt;Fre: auriez-vous une table dans la section non-fumeur/fumeur (SVP)?&lt;br /&gt;Jap: kinenseki wa ari masu ka&lt;br /&gt;Eng: i 'd like a table in the smoking area&lt;br /&gt;Eng: i 'd like a table in the non-smoking area&lt;br /&gt;Eng: smoking please&lt;br /&gt;Eng: non-smoking please&lt;br /&gt;Int: POLITE REQUEST LOCATION-TABLE IN SMOKING-AREA&lt;br /&gt;Int: POLITE REQUEST LOCATION-TABLE IN NON-SMOKING-AREA&lt;/p&gt; &lt;p&gt;Fre: auriez-vous une table dans le coin (SVP) ?&lt;br /&gt;Jap: oku no seki wa ari masu ka&lt;br /&gt;Jap: oku no teeburu wa ari masu ka&lt;br /&gt;Eng: could i have a table in the corner&lt;br /&gt;Int: POLITE REQUEST LOCATION-TABLE 
IN CORNER&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-1515817210999257526?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/1515817210999257526/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=1515817210999257526' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1515817210999257526'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1515817210999257526'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/call-slt-corpus.html' title='CALL-SLT corpus'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8222982433698779047</id><published>2009-08-16T06:49:00.000-07:00</published><updated>2009-08-16T07:26:34.520-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><title type='text'>First CALL-SLT meeting</title><content type='html'>We've had our first CALL-SLT meeting, which has done a lot to clarify our immediate goals for the project. We're going to start by building a simple version of the system, constructed in such a way that it will be easy to upgrade it by successively replacing simple modules by more complex ones. Initially, we will be working in the tourism/restaurant domain, using the languages English, French and Japanese. 
When we have those working well enough, we'll also start on Chinese; this is a language none of us know, so it will give us intuitions about what it's like to be an elementary-level student trying to use the system to get some fluency in a language.&lt;br /&gt;&lt;br /&gt;The initial prototype will work as follows. At each turn, the system prompts the student with a description of what they are supposed to say, formulated in a version of the Interlingua. The student will attempt to speak it in the L2 (the language they are trying to learn). The system applies speech recognition to the student's utterance, then tries to translate the result into the interlingua. Finally, it compares the translated interlingua with the one used to prompt the student, and gives them feedback on how they did. Here are more details:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Prompting in Interlingua. In the first version, the interlingua will be shown to the student in a text-based form, using the methods we've developed under MedSLT. So for example, the system might show the student&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;POLITE REQUEST TABLE 3 PERSON TIME 19:30&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;expecting the student to say something like&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;I would like to reserve a table for three people at seven thirty&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;or whatever the equivalent is in the L2 they are using.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;As soon as we've figured out a good way to do it, we would like to be able to present the interlingua prompt in graphical form. 
So here, we might have a picture that could be described as&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Scene:&lt;br /&gt;Client is talking to waiter.&lt;br /&gt;Speech bubble from client.&lt;br /&gt;Inside speech bubble:&lt;br /&gt;  three chairs around a restaurant table;&lt;br /&gt;  large clock in background shows 19:30&lt;br /&gt;&lt;/pre&gt;&lt;/li&gt;&lt;li&gt;All speech input to the system will be logged in the usual way. We will have a registration process which allows us to associate each recorded utterance with metadata that specifies, in particular, whether or not the utterance was recorded by a native speaker, and whether or not speech recognition got it right.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;When the system has compared the student's interlingua with the prompt interlingua, there are two simple ways for it to give helpful feedback. The first is to present both versions of the interlingua, highlighting the elements that are different. For instance, in the example above, if the system recognized&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Could I have a table for two people at seven thirty&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;then the system would present the prompt and recognized interlinguas roughly as follows:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;POLITE REQUEST TABLE &lt;span style="font-weight: bold;"&gt;*3*&lt;/span&gt; PERSON TIME 19:30&lt;br /&gt;POLITE REQUEST TABLE &lt;span style="font-weight: bold;"&gt;*2*&lt;/span&gt; PERSON TIME 19:30&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The second way to give help will be to play an example of a native speaker saying some version of the sentence in the L2, if such an example already exists.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The prompt selection module will have hooks allowing specification of a strategy. 
A simple strategy we will implement soon is to choose the prompt from a list of examples where there is a recorded example of a native speaker saying the prompt in the L2, possibly with some other constraints. This will make it easy for a teacher to create a lesson. They will first interact with the system in the L2, to create a set of recorded examples which work correctly. When the student logs on, the system will then be set to select prompts matching the teacher's examples.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The functionality will be bundled up as a Prolog-based server, which does most of the processing, and will connect to a lightweight Java-based GUI which presents a client view. The server will initially handle two messages: (1) NEXT_EXAMPLE, returning a new interlingua prompt with associated information, and (2) RECOGNISE, prompting the student to speak, carrying out recognition, and returning the pieces of information produced by carrying out the interlingua comparison process.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8222982433698779047?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8222982433698779047/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8222982433698779047' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8222982433698779047'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8222982433698779047'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/first-call-slt-meeting.html' title='First CALL-SLT 
meeting'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-3224458387503470598</id><published>2009-08-10T09:37:00.000-07:00</published><updated>2009-08-10T09:43:25.097-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='English'/><title type='text'>Initial English recognizer for CALL-SLT (Part 2)</title><content type='html'>I have done some more work on the English CALL-SLT recognizer, and we now have about 170 surface words. You can order food and drink in various ways, e.g.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;i would like two beers&lt;br /&gt;could i have a pizza&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;There is language for reserving tables, e.g.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;do you have a table for two at seven thirty&lt;br /&gt;could i have a non-smoking table for three people&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;I've also added a few more things like asking for the menu and the check. 
If I could say all this stuff in a language I didn't already know, say Chinese, I'd really feel I'd learned something useful!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-3224458387503470598?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/3224458387503470598/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=3224458387503470598' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3224458387503470598'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3224458387503470598'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/initial-english-recognizer-for-call-slt_10.html' title='Initial English recognizer for CALL-SLT (Part 2)'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-264842318703214793</id><published>2009-08-10T05:44:00.000-07:00</published><updated>2009-08-10T05:52:59.946-07:00</updated><title type='text'>Initial English recognizer for CALL-SLT</title><content type='html'>We can now build an initial English recognizer for CALL-SLT too. The domain is the same, ordering in restaurants. The training corpus currently contains about 60 examples, about 95% of which parse. Vocabulary is about 130 surface words. 
Recognition is anecdotally quite good with my voice, though it will of course be more interesting to see how foreign voices do.&lt;br /&gt;&lt;br /&gt;If you want more details, here are the &lt;a href="http://callslt.cvs.sourceforge.net/viewvc/callslt/CALL-SLT/Eng/corpora/callslt_sents_combined_domains.pl"&gt;corpus&lt;/a&gt; and the &lt;a href="http://callslt.cvs.sourceforge.net/viewvc/callslt/CALL-SLT/Eng/Regulus/callslt_lex.regulus"&gt;domain lexicon&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-264842318703214793?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/264842318703214793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=264842318703214793' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/264842318703214793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/264842318703214793'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/initial-english-recognizer-for-call-slt.html' title='Initial English recognizer for CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7816392988485399719</id><published>2009-08-07T12:25:00.000-07:00</published><updated>2009-08-07T12:39:02.909-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='CALL-SLT'/><category scheme='http://www.blogger.com/atom/ns#' term='French'/><title type='text'>Initial French recognizer for CALL-SLT</title><content type='html'>Pierrette's now checked in some real (as opposed to placeholder) material, and I was able to compile an initial French recognizer for CALL-SLT. It covers some very basic tourist French, so far all about ordering in restaurants. Simple as it is, I was already able to use it to improve my pronunciation of "Je voudrais un verre d'eau". As everyone who's heard me speak French knows, my version of the "r" sound is terrible. Well, at least I can roll my eyes.&lt;br /&gt;&lt;br /&gt;If people want to look at the CALL-SLT files, they're at &lt;a href="http://callslt.cvs.sourceforge.net/viewvc/callslt/CALL-SLT/"&gt;http://callslt.cvs.sourceforge.net/viewvc/callslt/CALL-SLT/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7816392988485399719?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7816392988485399719/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7816392988485399719' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7816392988485399719'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7816392988485399719'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/initial-french-recognizer-for-call-slt.html' title='Initial French recognizer for CALL-SLT'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-457198616517609315</id><published>2009-08-04T00:34:00.000-07:00</published><updated>2009-08-04T00:38:46.429-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Romance'/><category scheme='http://www.blogger.com/atom/ns#' term='CALL-SLT'/><title type='text'>Adding Romance to Regulus (part 2)</title><content type='html'>I've just checked in initial French files under CALL-SLT/Fre. Update CALLSLT with the -d flag to get them.&lt;br /&gt;&lt;br /&gt;So far, the CALL-SLT files are pretty much the same as the MedSLT files they were adapted from. I'm assuming Pierrette will make the necessary changes! I checked that you can get as far as building a Nuance grammar... it all worked fine for me, but let me know if there are problems.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-457198616517609315?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/457198616517609315/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=457198616517609315' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/457198616517609315'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/457198616517609315'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/adding-romance-to-regulus-part-2.html' title='Adding Romance to Regulus (part 
2)'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-3399198059377520579</id><published>2009-08-03T22:56:00.000-07:00</published><updated>2009-08-03T23:03:54.104-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Romance'/><category scheme='http://www.blogger.com/atom/ns#' term='grammar'/><title type='text'>Adding Romance to Regulus</title><content type='html'>Following discussions with Pierrette, I've now moved the shared Romance grammar from MedSLT/Rom to the new directory Regulus/Grammar/Romance. I've also moved the domain-independent French grammar and lexicon files to Regulus/Grammar/Romance/French. This mirrors the directory structure in Jen's Germanic grammar.&lt;br /&gt;&lt;br /&gt;I've adjusted the MedSLT config files for Fre, Spa and Cat to point to the new files. All three languages still appear to build correctly in the AFF (role-marked semantics) versions. I have not done anything with the old (linear semantics) versions, which I am now assuming are obsolete. &lt;br /&gt;&lt;br /&gt;All three languages appeared to build fine when I tested, but Pierrette will probably want to check things more carefully. If there are problems, please let me know - they should be easy to fix. &lt;br /&gt;&lt;br /&gt;The payoff is that I will now be able to construct a config file for the French version of CALL-SLT. 
Coming up next.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-3399198059377520579?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/3399198059377520579/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=3399198059377520579' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3399198059377520579'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3399198059377520579'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/08/adding-romance-to-regulus.html' title='Adding Romance to Regulus'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6832648027456544241</id><published>2009-07-26T07:42:00.000-07:00</published><updated>2009-07-26T07:46:07.983-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='treebanking'/><title type='text'>Treebank caching and preferences</title><content type='html'>Pierrette reminded me last week that there was a known problem in treebank caching: when parse preferences change, cached analyses may no longer be valid. &lt;br /&gt;&lt;br /&gt;I've just checked in new code which stores the preferences used to create the treebank, and compares them with the current preferences. If the two are different, the treebank is regenerated. 
It would be nice if we could regenerate only the part that might be affected by the changed preferences, but that's unfortunately very difficult to do.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6832648027456544241?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6832648027456544241/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6832648027456544241' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6832648027456544241'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6832648027456544241'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/07/treebank-caching-and-preferences.html' title='Treebank caching and preferences'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-1175040237586405979</id><published>2009-07-20T08:00:00.001-07:00</published><updated>2009-07-20T08:03:17.605-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DORIS'/><title type='text'>Regulus for DORIS, continued</title><content type='html'>I did some more tweaking of the DORIS grammar, and it now covers about 90% of Patrick's corpus. The Nuance grammar it generates is nice and compact (fewer than 1000 rules), and many of the remaining coverage holes still look easy to fix. 
As we hoped, this seems to be a good domain for Regulus.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-1175040237586405979?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/1175040237586405979/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=1175040237586405979' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1175040237586405979'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1175040237586405979'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/07/regulus-for-doris-continued.html' title='Regulus for DORIS, continued'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7907625224045607538</id><published>2009-07-20T00:03:00.000-07:00</published><updated>2009-07-20T00:07:38.943-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SHRD2'/><category scheme='http://www.blogger.com/atom/ns#' term='DORIS'/><title type='text'>Regulus for DORIS</title><content type='html'>When I was visiting Melbourne Uni last week, I talked with Patrick Ye about the possibility of using Regulus to provide speech recognition for  the DORIS project, which, as he pointed out, is in fact quite similar to  SHRD2; in both domains, the basic idea is to find things, pick them up,  and move them 
around. I did indeed find it very easy to use the examples  corpus and vocabulary that Patrick sent me to adapt the existing SHRD2  resources, and in just a few hours put together an initial Regulus  grammar that could be compiled into a recogniser. The current version of  the grammar covers a bit more than 80% of Patrick's corpus, and the  recogniser can turn spoken sentences in Australian-accented English into  either strings of words or scoped logical forms. For example, here's the  representation it produces of "the red book is on the desk":&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[[dcl,&lt;br /&gt;  quant(def_sing, A, [[book,A],[color,A,red]],&lt;br /&gt;        quant(def_sing, B,&lt;br /&gt;              [[desk,B]],&lt;br /&gt;              quant(exist,C,[[be_on_loc,C,A,B],[tense,C,present]],true)))]]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;If people want to look at the details, the files are checked in at  &lt;a class="moz-txt-link-freetext" href="http://regulus.cvs.sourceforge.net/viewvc/regulus/Regulus/Examples/Doris/"&gt;http://regulus.cvs.sourceforge.net/viewvc/regulus/Regulus/Examples/Doris/&lt;/a&gt;.  
The interesting ones are the lexicon, at  &lt;a class="moz-txt-link-freetext" href="http://regulus.cvs.sourceforge.net/viewvc/regulus/Regulus/Examples/Doris/Regulus/doris_lex.regulus"&gt;http://regulus.cvs.sourceforge.net/viewvc/regulus/Regulus/Examples/Doris/Regulus/doris_lex.regulus&lt;/a&gt;,  and the corpus, at  &lt;a class="moz-txt-link-freetext" href="http://regulus.cvs.sourceforge.net/viewvc/regulus/Regulus/Examples/Doris/corpora/doris_corpus.pl"&gt;http://regulus.cvs.sourceforge.net/viewvc/regulus/Regulus/Examples/Doris/corpora/doris_corpus.pl&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7907625224045607538?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7907625224045607538/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7907625224045607538' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7907625224045607538'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7907625224045607538'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/07/regulus-for-doris.html' title='Regulus for DORIS'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4365784818817004681</id><published>2009-07-10T03:43:00.000-07:00</published><updated>2009-07-10T04:04:23.280-07:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Bridge'/><title type='text'>First steps in Bridge system</title><content type='html'>With help from Cathy Chua, our Bridge expert, I've been putting together a first version of a grammar for the Bridge domain this week. Cathy has been supplying vocabulary and a corpus of examples showing how to use it, and I've used that to build an initial lexicon. The specialized grammar derived from these resources now has a vocabulary of about 220 surface words. Here are some typical utterances it can already handle:&lt;br /&gt;&lt;br /&gt;who has the queen of clubs&lt;br /&gt;who bid one no trump&lt;br /&gt;is two clubs a transfer&lt;br /&gt;cover the ten with the jack&lt;br /&gt;can you finesse in diamonds&lt;br /&gt;can you make if spades are four one&lt;br /&gt;&lt;br /&gt;We tried compiling a recognizer using the Australian English package, and, with Cathy's Australian voice, recognition is anecdotally quite good. (I have discovered that the difference between British English and Australian English is substantial.) We will be adding more coverage over the weekend. Next week, I hope we'll be able to start thinking concretely about how to hook up the Regulus components with BASSINET, Leon Stirling's Bridge program, to produce a first cut at an end-to-end system that can respond to spoken questions and commands.&lt;br /&gt;&lt;br /&gt;As we expected, the Bridge domain is quite a lot more complex than anything we have tried so far in Regulus. 
It will definitely stretch us!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4365784818817004681?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4365784818817004681/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4365784818817004681' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4365784818817004681'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4365784818817004681'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/07/first-steps-in-bridge-system.html' title='First steps in Bridge system'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4790105971196004619</id><published>2009-06-09T17:40:00.000-07:00</published><updated>2009-06-09T17:49:45.065-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='paraphrasing'/><category scheme='http://www.blogger.com/atom/ns#' term='SHRD2'/><title type='text'>Update on SHRD2 paraphrasing</title><content type='html'>I extended the paraphrasing capabilities for SHRD2, and we now get paraphrases for about 45% of the 180-ish examples in the current corpus. Beth Ann and I did a little experiment for the paper we presented at the SETQA-NLP workshop in Colorado last week. 
We each tried judging the corpus examples for correctness, using both the paraphrases and the underlying representations. In order to minimize learning effects, we permuted the order in which we did things.&lt;br /&gt;&lt;br /&gt;Even though the "logic English" paraphrases seem very similar to the scoped logical representations, it in fact turns out that judging paraphrases is a lot faster. Even for me, knowing all the representations from having worked on them, judging paraphrases took 22 minutes, against 29 minutes for judging structures. For Beth Ann, who didn't know the data structures previously, paraphrases were more than twice as quick. Part of the payoff comes from the fact that the paraphrase grammar acts as a filter; most ill-formed structures produce no paraphrase, and hence don't need to be judged at all when paraphrases are used.&lt;br /&gt;&lt;br /&gt;One person in the workshop audience said he was pleased to see a paper about software engineering which actually contained an experiment! 
Beth Ann was clearly right to insist that we do this, and work out the methodology.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4790105971196004619?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4790105971196004619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4790105971196004619' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4790105971196004619'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4790105971196004619'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/06/update-on-shrd2-paraphrasing.html' title='Update on SHRD2 paraphrasing'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8456702738036149862</id><published>2009-05-24T07:20:00.000-07:00</published><updated>2009-05-24T07:29:15.688-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='paraphrasing'/><category scheme='http://www.blogger.com/atom/ns#' term='SHRD2'/><title type='text'>First version of paraphrase grammar for SHRD2</title><content type='html'>I've just added a first version of a paraphrase grammar for SHRD2, where it tries to realize the scoped LFs in "logic English". Text example below. 
It doesn't do much more than this yet, but now that I have the basic structure working I'm hoping that it will be easy to add coverage. Planning to continue with this later today!&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; pick up a big red block&lt;br /&gt;&lt;br /&gt;      Old state: []&lt;br /&gt;             LF: [[imp, &lt;br /&gt;                   form(imperative, &lt;br /&gt;                        [[pick_up,term(pro,you,[]),term(a,block,[[size,big],[color,red]]),up]])]]&lt;br /&gt; Intermediate 1: [[imp, &lt;br /&gt;                   scoping_unit([modal, imperative, &lt;br /&gt;                                 term(event_exists, A, &lt;br /&gt;                                      [[pick_up, A, term(pro,B,[[you,B]]), &lt;br /&gt;                                        term(a,C,[[block,C],[size,C,big],[color,C,red]]), up]])])]]&lt;br /&gt; Intermediate 2: [[imp, &lt;br /&gt;                   quant(pro, A, [[you,A]], &lt;br /&gt;                         quant(a, B, &lt;br /&gt;                               [[block,B],[size,B,big],[color,B,red]], &lt;br /&gt;                               imperative(quant(event_exists,C,[[pick_up,C,A,B,up]],true))))]]&lt;br /&gt; Intermediate 3: [[imp, &lt;br /&gt;                   quant(pro, A, [[you,A]], &lt;br /&gt;                         quant(a, B, &lt;br /&gt;                               [[block,B],[size,B,big],[color,B,red]], &lt;br /&gt;                               imperative(quant(event_exists,C,[[pick_up,C,A,B]],true))))]]&lt;br /&gt;  Dialogue move: [[imp, &lt;br /&gt;                   quant(exist, A, [[you,A]], &lt;br /&gt;                         quant(exist, B, &lt;br /&gt;                               [[block,B],[size,B,big],[color,B,red]], &lt;br /&gt;                               imperative(quant(exist,C,[[pick_up,C,A,B]],true))))]]&lt;br /&gt;     Paraphrase: COMMAND there is an A SUCH THAT you are A AND there is a B SUCH THAT B is a block AND B is big AND B is red AND MAKE IT TRUE THAT there is a 
C SUCH THAT C is that A picks up B&lt;br /&gt;Abstract action: say(i_dont_understand,present)&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8456702738036149862?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8456702738036149862/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8456702738036149862' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8456702738036149862'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8456702738036149862'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/05/first-version-of-paraphrase-grammar-for.html' title='First version of paraphrase grammar for SHRD2'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2837736433361959682</id><published>2009-05-21T15:52:00.000-07:00</published><updated>2009-05-21T15:59:14.182-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><category scheme='http://www.blogger.com/atom/ns#' term='SHRD2'/><category scheme='http://www.blogger.com/atom/ns#' term='dialogue'/><title type='text'>Multiple processing stages in dialogue top-level</title><content type='html'>When building dialogue applications, it's quite often the case that there are multiple stages in the process of 
converting the LF (the thing that comes out of the recognizer) into a dialogue move. I've just added some hooks so that you can pass back the intermediate levels of representation as an optional extra argument to lf_to_dialogue_move, and display them. This is useful for debugging.&lt;br /&gt;&lt;br /&gt;Here's a text example from SHRD2:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; is the large red block in the box&lt;br /&gt;&lt;br /&gt;      Old state: []&lt;br /&gt;             LF: [[ynq, &lt;br /&gt;                   form(present, &lt;br /&gt;                        [[be, term(the_sing,block,[[size,large],[color,red]]), &lt;br /&gt;                          [in_loc,term(the_sing,box,[])]]])]]&lt;br /&gt; Intermediate 1: [[ynq, &lt;br /&gt;                   scoping_unit(term(event_exists, A, &lt;br /&gt;                                     [[be, A, &lt;br /&gt;                                       term(the_sing, B, &lt;br /&gt;                                            [[block,B],[size,B,large],[color,B,red]]),&lt;br /&gt;                                       [in_loc,term(the_sing,C,[[box,C]])]],&lt;br /&gt;                                      [tense,A,present]]))]]&lt;br /&gt; Intermediate 2: [[ynq, &lt;br /&gt;                   quant(the_sing, A, [[block,A],[size,A,large],[color,A,red]], &lt;br /&gt;                         quant(the_sing, B, [[box,B]], &lt;br /&gt;                               quant(event_exists, C, &lt;br /&gt;                                     [[be,C,A,[in_loc,B]],[tense,C,present]], true)))]]&lt;br /&gt;  Dialogue move: [[ynq, &lt;br /&gt;                   quant(the_sing, A, [[block,A],[size,A,big],[color,A,red]], &lt;br /&gt;                         quant(the_sing, B, [[box,B]], &lt;br /&gt;                               quant(event_exists,C,[[be_in_loc,C,A,B],[tense,C,present]],true)))]]&lt;br /&gt;Abstract action: say(i_dont_understand,present)&lt;br /&gt;Concrete action: tts(sorry, I don't understand)&lt;br /&gt;      New 
state: []&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Intermediate 1 is after addition of variables; Intermediate 2 is after scoping; and Dialogue move is after rewriting of lexical predicates. This last step is still very primitive.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2837736433361959682?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2837736433361959682/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2837736433361959682' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2837736433361959682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2837736433361959682'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/05/multiple-processing-stages-in-dialogue.html' title='Multiple processing stages in dialogue top-level'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8800750800395684047</id><published>2009-05-20T07:20:00.000-07:00</published><updated>2009-05-20T07:30:06.525-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SHRD2'/><title type='text'>"Pick up a big red block", revisited</title><content type='html'>OK, I did some more work on SHRD2, and, as of a few minutes ago, I managed to speak a sentence and get it turned into a logic-like representation. 
Here's a slightly edited trace (the less interesting parts of the output have been removed). I did indeed say "pick up a big red block"!&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; RECOGNISE&lt;br /&gt;(Take next loop input from live speech)&lt;br /&gt;&lt;br /&gt;Recognised: recognition_succeeded([rec_result(62,pick up a big red block,...)&lt;br /&gt;&lt;br /&gt;      Old state: []&lt;br /&gt;             LF: [[imp, &lt;br /&gt;                   form(imperative, &lt;br /&gt;                        [[pick_up,term(pro,you,[]),term(a,block,[[size,big],[color,red]]),up]])]]&lt;br /&gt;  Dialogue move: [[imp, &lt;br /&gt;                   quant(pro, A, [[you,A]], &lt;br /&gt;                         quant(a, B, &lt;br /&gt;                               [[block,B],[size,B,big],[color,B,red]], &lt;br /&gt;                               imperative(quant(event_exists,C,[[pick_up,C,A,B,up]],true))))]]&lt;br /&gt;Abstract action: say(i_dont_understand,present)&lt;br /&gt;Concrete action: tts(sorry, I don't understand)&lt;br /&gt;      New state: []&lt;br /&gt;&lt;br /&gt;Dialogue processing time: 0.01 seconds&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Obviously, there's plenty more left to do before we have anything resembling a complete system. In particular, there is as yet no reference resolution or dialogue management, so it can't react to the commands and questions in any way. 
But, all things considered, I think we're making reasonable progress.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8800750800395684047?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8800750800395684047/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8800750800395684047' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8800750800395684047'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8800750800395684047'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/05/pick-up-big-red-block-revisited.html' title='&quot;Pick up a big red block&quot;, revisited'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7411876455443319996</id><published>2009-05-17T13:41:00.000-07:00</published><updated>2009-05-17T13:50:44.800-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><title type='text'>CAT and FEAT commands</title><content type='html'>One of the good things about doing some grammar development on SHRD2 is that it suggests ideas for new development environment functionality. I got tired of consulting the grammar every time I needed to find out what features a category had, or what possible values a feature could take. 
So I added a couple of new commands called CAT and FEAT, which give you those pieces of information. Here's an example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; CAT p&lt;br /&gt;(Display information for specified category)&lt;br /&gt;&lt;br /&gt;Features for category "p": [def,obj_sem_n_type,postposition,sem,sem_p_type,sem_pp_type]&lt;br /&gt;&lt;br /&gt;--- Performed command CAT p, time = 0.00 seconds&lt;br /&gt;&lt;br /&gt;&gt;&gt; FEAT sem_p_type         &lt;br /&gt;(Display information for specified feature)&lt;br /&gt;&lt;br /&gt;Feature values for feature "sem_p_type": [[back,down,none,normal,off,onoff,over,up,updown]]&lt;br /&gt;&lt;br /&gt;--- Performed command FEAT sem_p_type, time = 0.00 seconds&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7411876455443319996?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7411876455443319996/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7411876455443319996' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7411876455443319996'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7411876455443319996'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/05/cat-and-feat-commands.html' title='CAT and FEAT commands'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4170706876579566955</id><published>2009-05-17T07:43:00.000-07:00</published><updated>2009-05-17T07:53:02.748-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SHRD2'/><title type='text'>SHRD2</title><content type='html'>I was talking with some people about Terry Winograd's book &lt;span style="font-style: italic;"&gt;Understanding Natural Language&lt;/span&gt;, and how inspiring I found the SHRDLU system when I first read about it as an undergraduate. They asked me what the current equivalent would be. It bothered me that I couldn't come up with anything, and that no one really seemed to be building this kind of system any more.&lt;br /&gt;&lt;br /&gt;Well... so why not do something about it? Stage 1 is to build a speech-enabled, and hopefully rather more robust, reconstruction of SHRDLU, using Regulus. Some initial stuff is already checked in under $REGULUS/Examples/SHRD2. When we've satisfied ourselves that we can handle the Blocks World, Stage 2 will be to define a new and more ambitious domain - something which will hopefully demonstrate that there has in fact been significant progress since the 70s.&lt;br /&gt;&lt;br /&gt;I'll be posting more about this soon. 
So far, SHRD2 has already turned up some important holes in the general English grammar, which I'm working on fixing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4170706876579566955?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4170706876579566955/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4170706876579566955' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4170706876579566955'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4170706876579566955'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2009/05/shrd2.html' title='SHRD2'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2083694562201515597</id><published>2009-05-17T06:27:00.000-07:00</published><updated>2009-05-17T13:51:04.051-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><title type='text'>Showing missing vocabulary in EBL_TREEBANK</title><content type='html'>A trivial but rather useful little feature I just added: when you run EBL_TREEBANK, you now get missing vocabulary displayed for relevant sentences. I don't know why I didn't do this years ago. Here's an example. 
To get the new functionality, you just need to update Regulus from CVS; nothing needs to be remade.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; EBL_TREEBANK&lt;br /&gt;(Parse all sentences in current EBL training set into treebank form)&lt;br /&gt;&lt;br /&gt;--- Read parsing history file (114 records) d:/cygwin/home/speech/regulus/examples/shrd2/generated/shrd2_parsing_history.pl&lt;br /&gt;--- Incremental treebanking switched off, not trying to convert treebank&lt;br /&gt;&lt;br /&gt;Parsing corpus data in d:/cygwin/home/speech/regulus/examples/shrd2/corpora/shrdlu_corpus.pl:&lt;br /&gt;..&lt;br /&gt;*** Parsing failed for: "find a block which is taller than the one you are holding and put it into the box", line 2&lt;br /&gt;...&lt;br /&gt;*** Parsing failed for: "is at least one of them narrower than the one which i told you to pick up", line 6&lt;br /&gt;Words not in current vocabulary: [told]&lt;br /&gt;.....&lt;br /&gt;*** Parsing failed for: "will you please stack up both of the red blocks and either a green cube or a pyramid", line 12&lt;br /&gt;Words not in current vocabulary: [either]&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2083694562201515597?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2083694562201515597/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2083694562201515597' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2083694562201515597'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2083694562201515597'/><link rel='alternate' type='text/html' 
href='http://regulusnews.blogspot.com/2009/05/showing-missing-vocabulary-in.html' title='Showing missing vocabulary in EBL_TREEBANK'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8534902653105383024</id><published>2008-12-19T08:53:00.000-08:00</published><updated>2008-12-19T09:12:30.492-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><category scheme='http://www.blogger.com/atom/ns#' term='parsing'/><title type='text'>Parsing top-level constituents and treebank caching</title><content type='html'>Since parsing with non-top constituents had a bad effect on efficiency, the LOAD command has gone back to loading the normal grammar, as it did before. If you want to be able to parse with non-top constituents, I have introduced a new command, LOAD_DEBUG, which loads an extended version of the grammar suitable for debugging. You are advised not to use this for creating specialised grammars.&lt;br /&gt;&lt;br /&gt;The initial version of this functionality turned out to have a rather nasty bug, which is I think the thing that got Beth Ann yesterday... there was an incorrect interaction between LOAD_DEBUG, grammar caching and treebank caching. This meant that treebank creation sometimes incorrectly thought that the grammar rules had been changed when in fact they hadn't, and unnecessarily reparsed the training corpus. 
Beth Ann, please update from Regulus and see if this is now fixed!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8534902653105383024?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8534902653105383024/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8534902653105383024' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8534902653105383024'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8534902653105383024'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/12/loaddebug-and-treebank-caching.html' title='Parsing top-level constituents and treebank caching'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-143129979276114526</id><published>2008-10-28T23:02:00.000-07:00</published><updated>2008-10-28T23:07:31.481-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='dynamic lexicon'/><title type='text'>Improvement to dynamic lexicon functionality</title><content type='html'>I have just checked in some improvements to dynamic lexicons, which should considerably reduce the number of external files created at runtime. 
Hopefully this will improve recognition response times, but so far I don't have a non-trivial dynamic lexicon application to test on - so I would appreciate feedback from the Ford project. In particular, please let me know at once if anything appears to be broken. If necessary, you can reverse the change by reverting the file $REGULUS/Prolog/dynamic_lexicon.pl to the previous version.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-143129979276114526?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/143129979276114526/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=143129979276114526' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/143129979276114526'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/143129979276114526'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/10/improvement-to-dynamic-lexicon.html' title='Improvement to dynamic lexicon functionality'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-1795392020858021260</id><published>2008-10-25T23:56:00.000-07:00</published><updated>2008-10-26T00:01:40.365-07:00</updated><title type='text'>Default parse preferences for specialised grammars</title><content type='html'>Following a discussion with Pierrette, I have added default 
parse preferences for specialised grammars, based on the geometric mean of the rule frequencies as observed in the training corpus. This is what we have been doing for some time in generation. To get the new functionality, you need to update Regulus and remake the specialised grammar you are using. Most of the time, you shouldn't notice anything new, except that the rule frequencies will be displayed in the parse trees, as in the following example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; is it a sharp pain&lt;br /&gt;(Parsing with left-corner parser)&lt;br /&gt;&lt;br /&gt;Analysis time: 0.12 seconds&lt;br /&gt;&lt;br /&gt;Return value: [(object=[adj,sharp]), (agent=[pronoun,it]), (object=[secondary_symptom,pain]),&lt;br /&gt;               (null=[tense,present]), (null=[utterance_type,ynq]), (null=[verb,be]),&lt;br /&gt;               (null=[voice,active])]&lt;br /&gt;&lt;br /&gt;Global value: []&lt;br /&gt;&lt;br /&gt;Syn features: []&lt;br /&gt;&lt;br /&gt;Parse tree:&lt;br /&gt;&lt;br /&gt;.MAIN (freq 836) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:2629-3470]&lt;br /&gt;   top (freq 830) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:3471-4306]&lt;br /&gt;      utterance (freq 622) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:4307-4934]&lt;br /&gt;         s (freq 31) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:11277-11313]&lt;br /&gt;         /  vbar (freq 461) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:5947-6413]&lt;br /&gt;         |  /  v lex(is) (freq 39) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:10819-10863]&lt;br /&gt;         |  |  np (freq 314) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:7213-7532]&lt;br /&gt;         |  \     pronoun lex(it) (freq 53) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:10336-10394]&lt;br /&gt;         |  tmp_cat_12 (freq 31) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:11314-11317]&lt;br /&gt;         |  /  np (freq 1153) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:1622-2628]&lt;br /&gt;         |  |  /  np (freq 63) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:9998-10066]&lt;br /&gt;         |  |  | 
 /  d lex(a) (freq 86) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:9110-9201]&lt;br /&gt;         |  |  |  |  tmp_cat_6 (freq 63) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:10067-10070]&lt;br /&gt;         |  |  |  |  /  adj lex(sharp) (freq 6) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:14728-14739]&lt;br /&gt;         |  |  |  \  \  n lex(pain) (freq 389) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:6818-7212]&lt;br /&gt;         |  |  \  post_mods null (freq 1399) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:615-1621]&lt;br /&gt;         \  \  post_mods null (freq 1399) [MED_ROLE_MARKED_SPECIALISED_DEFAULT:615-1621]&lt;br /&gt;&lt;br /&gt;------------------------------- FILES -------------------------------&lt;br /&gt;&lt;br /&gt;MED_ROLE_MARKED_SPECIALISED_DEFAULT: c:/cygwin/home/speech/speechtranslation/medslt2/eng/generatedfiles/med_role_marked_specialised_default.regulus&lt;br /&gt;&lt;br /&gt;Preference information:&lt;br /&gt;&lt;br /&gt;1.80  Rule frequency score&lt;br /&gt;Total preference score: 1.80&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The bad news: I was hoping this would solve an annoying problem in Eng/Spa bidirectional. Unfortunately, it doesn't seem to do that. 
No idea why this used to work, in fact!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-1795392020858021260?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/1795392020858021260/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=1795392020858021260' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1795392020858021260'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/1795392020858021260'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/10/default-parse-preferences-for.html' title='Default parse preferences for specialised grammars'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4084567594270840375</id><published>2008-10-12T18:47:00.000-07:00</published><updated>2008-10-12T18:51:45.729-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><title type='text'>Parsing non-top constituents (continued)</title><content type='html'>I have now checked in an improved version of the functionality for parsing non-top constituents, which hides the dummy rules and shows the features for the constituent. 
Here are a couple of examples from Toy1Specialised:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; np the light in the kitchen&lt;br /&gt;(Parsing with left-corner parser)&lt;br /&gt;&lt;br /&gt;Analysis time: 0.55 seconds&lt;br /&gt;&lt;br /&gt;Return value: [[device,light],[location,kitchen],[prep,in_loc],[spec,the_sing]]&lt;br /&gt;&lt;br /&gt;Global value: []&lt;br /&gt;&lt;br /&gt;Syn features: [agr=3/\sing,case=A,conj=n,def=y,gapsin=B,gapsout=B,n_appositive_mod_type=none,&lt;br /&gt;n_of_mod_type=none,nform=normal,pronoun=n,sem_n_type=dimmable\/switchable,&lt;br /&gt;syn_type=np_with_noun,takes_about_pp=n,takes_attrib_pp=n,takes_cost_pp=n,&lt;br /&gt;takes_date_pp=n,takes_duration_pp=n,takes_frequency_pp=n,takes_from_pp=n,&lt;br /&gt;takes_loc_pp=n,takes_partitive=n,takes_passive_by_pp=none,takes_post_mods=n,&lt;br /&gt;takes_side_pp=n,takes_time_pp=n,takes_to_pp=n,takes_with_pp=n,wh=n]&lt;br /&gt;&lt;br /&gt;Parse tree:&lt;br /&gt;&lt;br /&gt;np [GENERAL_ENG:2026-2044]&lt;br /&gt;/  np [GENERAL_ENG:1864-1874]&lt;br /&gt;|  /  d lex(the) [GEN_ENG_LEX:341-344]&lt;br /&gt;|  |  nbar [GENERAL_ENG:2071-2083]&lt;br /&gt;|  \     n lex(light) [TOY1_LEX:44-47]&lt;br /&gt;|  post_mods [GENERAL_ENG:1591-1680]&lt;br /&gt;|  /  pp [GENERAL_ENG:1747-1765]&lt;br /&gt;|  |  /  p lex(in) [TOY1_LEX:51-58]&lt;br /&gt;|  |  |  np [GENERAL_ENG:2026-2044]&lt;br /&gt;|  |  |  /  np [GENERAL_ENG:1864-1874]&lt;br /&gt;|  |  |  |  /  d lex(the) [GEN_ENG_LEX:341-344]&lt;br /&gt;|  |  |  |  |  nbar [GENERAL_ENG:2071-2083]&lt;br /&gt;|  |  |  |  \     n lex(kitchen) [TOY1_LEX:38-39]&lt;br /&gt;|  |  \  \  post_mods null [GENERAL_ENG:1410-1416]&lt;br /&gt;\  \  post_mods null [GENERAL_ENG:1410-1416]&lt;br /&gt;&lt;br /&gt;------------------------------- FILES -------------------------------&lt;br /&gt;GENERAL_ENG: c:/cygwin/home/speech/regulus/grammar/general_eng.regulus&lt;br /&gt;GEN_ENG_LEX: c:/cygwin/home/speech/regulus/grammar/gen_eng_lex.regulus&lt;br /&gt;TOY1_LEX:    
c:/cygwin/home/speech/regulus/examples/toy1specialised/regulus/toy1_lex.regulus&lt;br /&gt;&lt;br /&gt;&gt;&gt; n light&lt;br /&gt;(Parsing with left-corner parser)&lt;br /&gt;&lt;br /&gt;Analysis time: 0.02 seconds&lt;br /&gt;&lt;br /&gt;Return value: [[device,light]]&lt;br /&gt;&lt;br /&gt;Global value: []&lt;br /&gt;&lt;br /&gt;Syn features: [agr=3/\sing,conj=n,n_appositive_mod_type=none,n_of_mod_type=none,&lt;br /&gt;n_post_mod_type=none,n_pre_mod_type=loc,sem_n_type=dimmable\/switchable,&lt;br /&gt;takes_about_pp=n,takes_attrib_pp=n,takes_cost_pp=n,takes_date_pp=n,takes_det_type=def,&lt;br /&gt;takes_duration_pp=n,takes_frequency_pp=n,takes_from_pp=n,takes_loc_pp=y,&lt;br /&gt;takes_partitive=n,takes_passive_by_pp=none,takes_side_pp=n,takes_time_pp=n,&lt;br /&gt;takes_to_pp=n,takes_with_pp=n]&lt;br /&gt;&lt;br /&gt;Parse tree:&lt;br /&gt;&lt;br /&gt;n lex(light) [TOY1_LEX:44-47]&lt;br /&gt;&lt;br /&gt;------------------------------- FILES -------------------------------&lt;br /&gt;&lt;br /&gt;TOY1_LEX: c:/cygwin/home/speech/regulus/examples/toy1specialised/regulus/toy1_lex.regulus&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4084567594270840375?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4084567594270840375/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4084567594270840375' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4084567594270840375'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4084567594270840375'/><link rel='alternate' type='text/html' 
href='http://regulusnews.blogspot.com/2008/10/parsing-non-top-constituents-continued.html' title='Parsing non-top constituents (continued)'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5726370748221947</id><published>2008-10-09T18:34:00.000-07:00</published><updated>2008-10-09T18:45:04.929-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><category scheme='http://www.blogger.com/atom/ns#' term='parsing'/><title type='text'>Parsing non-top constituents</title><content type='html'>Following a conversation with Pierrette last week, I realised that there was an easy way to fix things so that we can parse non-top constituents in the LC (normal) parser, as well as the DCG one. I have just checked in a first version of the new functionality. Now, when you load a grammar using the LOAD command, an extra file of dummy rules is created and added to the ones explicitly specified. There is one dummy rule for each category Cat in the grammar, of the form&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dummy_top:[sem=Sem] --&gt; Cat, Cat:[sem=Sem]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;For example, the dummy rule for 'np' is&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dummy_top:[sem=Sem] --&gt; np, np:[sem=Sem]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;What this means is that you can now parse NPs at top-level by simply prefacing them with the word 'np'. 
Thus for instance in Calendar we can do things like the following:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; np the last meeting in geneva&lt;br /&gt;(Parsing with left-corner parser)&lt;br /&gt;&lt;br /&gt;Analysis time: 0.97 seconds&lt;br /&gt;&lt;br /&gt;Return value: [[at_loc,[[spec,name],[head,geneva]]],[head,meeting],[spec,the_last]]&lt;br /&gt;&lt;br /&gt;Global value: []&lt;br /&gt;&lt;br /&gt;Syn features: []&lt;br /&gt;&lt;br /&gt;Parse tree:&lt;br /&gt;&lt;br /&gt;.MAIN [CALENDAR_DUMMY_TOP_LEVEL_RULES:1-1]&lt;br /&gt;   dummy_top [CALENDAR_DUMMY_TOP_LEVEL_RULES:22-22]&lt;br /&gt;   /  lex(np)&lt;br /&gt;   |  np [GENERAL_ENG:2026-2044]&lt;br /&gt;   |  /  np [GENERAL_ENG:1849-1863]&lt;br /&gt;   |  |  /  d lex(the) lex(last) [GEN_ENG_LEX:355-355]&lt;br /&gt;   |  |  |  nbar [GENERAL_ENG:2071-2083]&lt;br /&gt;   |  |  \     n lex(meeting) [CALENDAR_LEX:88-89]&lt;br /&gt;   |  |  post_mods [GENERAL_ENG:1591-1680]&lt;br /&gt;   |  |  /  pp [GENERAL_ENG:1747-1765]&lt;br /&gt;   |  |  |  /  p lex(in) [CALENDAR_LEX:151-151]&lt;br /&gt;   |  |  |  |  np [GENERAL_ENG:1955-1963]&lt;br /&gt;   |  |  |  \     name lex(geneva) [GENERATED_NAMES:41-41]&lt;br /&gt;   \  \  \  post_mods null [GENERAL_ENG:1410-1416]&lt;br /&gt;&lt;br /&gt;------------------------------- FILES -------------------------------&lt;br /&gt;&lt;br /&gt;CALENDAR_DUMMY_TOP_LEVEL_RULES: c:/cygwin/home/speech/regulus/examples/calendar/generated/calendar_dummy_top_level_rules.regulus&lt;br /&gt;CALENDAR_LEX:                   c:/cygwin/home/speech/regulus/examples/calendar/regulus/calendar_lex.regulus&lt;br /&gt;GENERAL_ENG:                    c:/cygwin/home/speech/regulus/grammar/general_eng.regulus&lt;br /&gt;GENERATED_NAMES:                c:/cygwin/home/speech/regulus/examples/calendar.regulus&lt;br /&gt;GEN_ENG_LEX:                    c:/cygwin/home/speech/regulus/grammar/gen_eng_lex.regulus&lt;br /&gt;&lt;br /&gt;Semantic triples: []&lt;br /&gt;&lt;br /&gt;No preferences apply&lt;br 
/&gt;&lt;/pre&gt;&lt;br /&gt;I should be able to improve this a little, in particular by adding some functionality to display the features on the non-top  constituent as well as the semantics, but hopefully the existing version will already be quite useful.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5726370748221947?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5726370748221947/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5726370748221947' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5726370748221947'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5726370748221947'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/10/parsing-non-top-constituents.html' title='Parsing non-top constituents'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2657990714421090756</id><published>2008-09-18T02:06:00.000-07:00</published><updated>2008-09-18T03:15:35.821-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='dynamic lexicon'/><title type='text'>Dynamic Regulus lexicon entries</title><content type='html'>Regulus now includes an interface to Nuance dynamic grammar capabilities, making it possible in effect to add new lexicon entries at runtime. 
Dynamic lexicon entries need to be defined using macros which have been declared dynamic in the Regulus source file.&lt;br /&gt;&lt;br /&gt;I have checked in a sample application in $REGULUS/Examples/Toy1SpecialisedDynamic; there is basic documentation in doc/README.txt. The application uses a version of the Toy1Specialised grammar in which commands need to be prefaced by a name. The user can dynamically add new names to the recognition vocabulary while the application is running. The following extract from the lexicon file shows the macro and declaration for the dynamic name entries:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;macro(person_name(Surface, Sem),&lt;br /&gt;      @name(Surface, [Sem], [agent], sing, [])).&lt;br /&gt;&lt;br /&gt;dynamic_lexicon( @person_name(Surface, Sem) ).&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;At runtime, new name entries can be added using calls to the predicate assert_dynamic_lex_entry/1. A typical call might look like this:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;assert_dynamic_lex_entry( @person_name((howard, the, duck), howard_the_duck))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Note that the infrastructure needed to run dynamic applications is somewhat different from the standard one. In particular, it is necessary to use a Resource Manager and a Compilation Server, and to compile a dummy "just-in-time" recognition package. The sample application gives examples of the scripts required. 
I will be checking in proper documentation soon, and will post again when I have done that.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2657990714421090756?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2657990714421090756/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2657990714421090756' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2657990714421090756'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2657990714421090756'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/09/dynamic-regulus-lexicon-entries.html' title='Dynamic Regulus lexicon entries'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-3114987558043556044</id><published>2008-09-11T01:13:00.000-07:00</published><updated>2008-09-11T01:37:13.083-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grammar specialisation'/><title type='text'>Incremental treebanking for grammar specialisation</title><content type='html'>I have just checked in some new code, which should make the process of creating a specialised grammar much more efficient. The most time-consuming part of the process is parsing the treebank, using the EBL_TREEBANK command, or commands like EBL_ANALYSIS which call it indirectly. 
Until now, the whole set of training sentences had to be parsed every time. This was wasteful, since the greater part of the parses in the existing treebank were often still valid.&lt;br /&gt;&lt;br /&gt;The new functionality improves the picture by trying to determine which parses can be kept, and only reparsing the remaining ones. The current rules for determining which new parses are required are as follows:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;After each invocation of EBL_TREEBANK, Regulus saves both the treebank and a copy of the grammar used to create it. The next time EBL_TREEBANK is called, the system compares the saved grammar and treebank with the current grammar and training corpus.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The grammar comparison determines two things: 1) Have any non-lexical rules changed? 2) If only lexical rules have changed, which lexical items are affected?&lt;/li&gt;&lt;li&gt;If non-lexical rules have changed, the whole treebank needs to be reparsed. Most often, however, this is not the case. If no rules, or only lexical rules, have changed, the treebank is incrementally updated as follows.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Any items in the treebank which correspond to sentences no longer in the current training corpus are removed.&lt;/li&gt;&lt;li&gt;Any items in the current training corpus which do not occur in the old treebank are parsed and added to the new treebank.&lt;/li&gt;&lt;li&gt;Any items in the treebank which include changed lexical items are reparsed and added to the new treebank.&lt;/li&gt;&lt;li&gt;All remaining items in the old treebank are kept.&lt;/li&gt;&lt;/ul&gt;You need to update Regulus to get the new functionality. Note that nothing will happen the first time you do EBL_TREEBANK after the update, since the old copy of the grammar is saved after EBL_TREEBANK is invoked, and you will not originally have an old saved grammar. 
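&lt;br /&gt;&lt;br /&gt;For readers who prefer code to bullet points, the update rules listed above can be sketched roughly as follows. This is only an illustration in Python, not the actual Regulus Prolog implementation: the function name plan_treebank_update, its arguments and the returned dictionary are all made up for the example, and it assumes that lexical items are single whitespace-separated words.&lt;br /&gt;&lt;pre&gt;
```python
def plan_treebank_update(old_treebank, corpus, nonlexical_changed, changed_words):
    """Hypothetical sketch: decide which sentences to keep, reparse or drop.

    old_treebank       -- dict mapping each previously parsed sentence to its parse
    corpus             -- list of sentences in the current training corpus
    nonlexical_changed -- True if any non-lexical grammar rule has changed
    changed_words      -- set of words whose lexical rules have changed
    """
    current = list(dict.fromkeys(corpus))  # remove duplicates, keep corpus order
    current_set = set(current)

    # Old items whose sentences are no longer in the corpus are always removed.
    drop = [s for s in old_treebank if s not in current_set]

    if nonlexical_changed:
        # A non-lexical change invalidates every saved parse.
        return {"keep": [], "reparse": current, "drop": drop}

    keep, reparse = [], []
    for sentence in current:
        if sentence not in old_treebank:
            reparse.append(sentence)   # new sentence, never parsed before
        elif changed_words.intersection(sentence.split()):
            reparse.append(sentence)   # contains a changed lexical item
        else:
            keep.append(sentence)      # saved parse is still valid
    return {"keep": keep, "reparse": reparse, "drop": drop}


old = {"switch on the light": "tree1", "dim the fan": "tree2", "is the light on": "tree3"}
new_corpus = ["switch on the light", "is the light on", "switch off the fan"]
plan = plan_treebank_update(old, new_corpus, False, {"switch"})
print(plan)
```
&lt;/pre&gt;&lt;br /&gt;The very first call after the update has no saved grammar to compare against; in terms of this sketch, that is like calling it with an empty old_treebank, so every sentence ends up in the reparse list. 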
So you will only notice a difference the second time you do EBL_TREEBANK.&lt;br /&gt;&lt;br /&gt;I have done some testing, and things appear OK, but I know from experience that this kind of non-monotonic code often contains subtle bugs which aren't immediately apparent. Please let me know if things don't work as expected, and I will give priority to sorting out problems. If necessary, you can toggle the incremental treebanking functionality using the new commands INCREMENTAL_TREEBANKING_OFF and INCREMENTAL_TREEBANKING_ON. By default, incremental treebanking is on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-3114987558043556044?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/3114987558043556044/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=3114987558043556044' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3114987558043556044'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3114987558043556044'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/09/incremental-treebanking-for-grammar.html' title='Incremental treebanking for grammar specialisation'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2782624496048302248</id><published>2008-07-24T04:40:00.000-07:00</published><updated>2008-07-24T04:43:03.140-07:00</updated><title type='text'>Documentation for NUANCE_PARSER command</title><content type='html'>I've added some basic documentation for the new NUANCE_PARSER command. Here's what you get when you access it using DOC:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; DOC NUANCE_PARSER&lt;br /&gt;(Print documentation for command or config file entry)&lt;br /&gt;&lt;br /&gt;NUANCE_PARSER&lt;br /&gt;[Brief doc: Start new Nuance nl-tool process and use it as parser]&lt;br /&gt;&lt;br /&gt;Start an nl-tool process, and use it to do parsing. Any old nl-tool processes&lt;br /&gt;are first killed. The current config file needs to include either a&lt;br /&gt;dialogue_rec_params declaration (for dialogue apps) or a translation_rec_params&lt;br /&gt;declaration (for speech translation apps); the declaration must&lt;br /&gt;contain definitions for 'package' and 'grammar'. 
The following is a&lt;br /&gt;typical example of a suitable declaration:&lt;br /&gt;&lt;br /&gt;regulus_config(dialogue_rec_params,&lt;br /&gt;               [package=calendar_runtime(recogniser), grammar='.MAIN',&lt;br /&gt;                'rec.Pruning=1600', 'rec.DoNBest=TRUE', 'rec.NumNBest=6']).&lt;br /&gt;&lt;br /&gt;Notes:&lt;br /&gt;&lt;br /&gt;-  After NUANCE_PARSER is successfully invoked, nl-tool is used for&lt;br /&gt;ALL parsing, including batch processing with commands like TRANSLATE_CORPUS&lt;br /&gt;and Prolog calls to parse_with_current_parser/6.&lt;br /&gt;-  The Nuance parser only returns logical forms, not parse trees.&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2782624496048302248?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2782624496048302248/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2782624496048302248' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2782624496048302248'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2782624496048302248'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/07/documentation-for-nuanceparser-command.html' title='Documentation for NUANCE_PARSER command'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-4688900462031254136</id><published>2008-07-17T05:53:00.000-07:00</published><updated>2008-07-17T05:55:10.819-07:00</updated><title type='text'>More improvements to Nuance documentation</title><content type='html'>You can now access documentation about config file entries from the Regulus top level. The new command HELP_CONFIG looks for information about config file entries, and DOC shows documentation for both commands and config entries. Here's an example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; HELP_CONFIG nuance&lt;br /&gt;(Print help for config file entries whose name or description match the string)&lt;br /&gt;&lt;br /&gt;7 config file entries matching "nuance":&lt;br /&gt;&lt;br /&gt;ebl_nuance_grammar&lt;br /&gt;nuance_compile_params&lt;br /&gt;nuance_grammar&lt;br /&gt;nuance_grammar_for_compilation&lt;br /&gt;nuance_grammar_for_pcfg_training&lt;br /&gt;nuance_language_pack&lt;br /&gt;nuance_recognition_package&lt;br /&gt;&lt;br /&gt;&gt;&gt; DOC nuance_grammar&lt;br /&gt;(Print documentation for command or config file entry)&lt;br /&gt;&lt;br /&gt;nuance_grammar&lt;br /&gt;Points to the Nuance GSL grammar produced by the NUANCE command.&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-4688900462031254136?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/4688900462031254136/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=4688900462031254136' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4688900462031254136'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/4688900462031254136'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/07/more-improvements-to-nuance.html' title='More improvements to Nuance documentation'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6168984449478731268</id><published>2008-07-16T07:11:00.000-07:00</published><updated>2008-07-16T07:27:42.908-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='documentation'/><category scheme='http://www.blogger.com/atom/ns#' term='Cookbook'/><title type='text'>Improvements to Regulus documentation</title><content type='html'>I have been doing some work on and off over the last few weeks to try and improve the Regulus documentation. It's one of those "important non-urgent" tasks that is very hard to schedule, because you always feel you have something that should take higher priority, but I do finally seem to have made some concrete progress.&lt;br /&gt;&lt;br /&gt;There are three parts to the work, which are meant to be closely interlinked. First, I have created a directory under Regulus/doc called CommandDoc, which is supposed to contain one short file for each command and type of config file entry. I've so far populated it with the information in RegulusDoc.html, which certainly confirmed that RegulusDoc is badly out of date... I'm afraid half the files are currently empty.&lt;br /&gt;&lt;br /&gt;Second, I have added a new top-level Regulus command called DOC. 
If you type DOC followed by the name of a command, you get the CommandDoc file printed out in a reasonably readable way. For example:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; DOC LOAD_DIALOGUE&lt;br /&gt;(Print documentation for command)&lt;br /&gt;&lt;br /&gt;LOAD_DIALOGUE&lt;br /&gt;[Brief doc: Load dialogue-related files]&lt;br /&gt;Compile the files defined by the dialogue_files config file entry.&lt;br /&gt;&lt;br /&gt;&gt;&gt; DOC EBL_NUANCE&lt;br /&gt;(Print documentation for command)&lt;br /&gt;&lt;br /&gt;EBL_NUANCE&lt;br /&gt;[Brief doc: Compile current specialised Regulus grammar into Nuance GSL form]&lt;br /&gt;Compile current specialised Regulus grammar into Nuance GSL form. Same&lt;br /&gt;as the NUANCE command, but for the specialised grammar. The input is&lt;br /&gt;the file created by the EBL_POSTPROCESS command; the output Nuance GSL&lt;br /&gt;grammar is placed in the file defined by the ebl_nuance_grammar config&lt;br /&gt;file entry.&lt;br /&gt;&lt;br /&gt;&gt;&gt; DOC TRANSLATE_CORPUS&lt;br /&gt;(Print documentation for command)&lt;br /&gt;&lt;br /&gt;TRANSLATE_CORPUS&lt;br /&gt;[Brief doc: Process text translation corpus]&lt;br /&gt;&lt;br /&gt;Process the default text mode translation corpus, defined by the&lt;br /&gt;translation_corpus config file entry. The output file, defined by&lt;br /&gt;the translation_corpus_results config file entry, contains&lt;br /&gt;question marks for translations that have not yet been judged. If&lt;br /&gt;these are replaced by valid judgements, currently 'good', 'ok' or&lt;br /&gt;'bad', the new judgements can be incorporated into the translation&lt;br /&gt;judgements file (defined by the translation_corpus_judgements&lt;br /&gt;config file entry) using the command&lt;br /&gt;UPDATE_TRANSLATION_JUDGEMENTS.&lt;br /&gt;&lt;br /&gt;TRANSLATE_CORPUS &amp;lt;Arg&amp;gt;&lt;br /&gt;[Brief doc: Process text translation corpus with specified ID]&lt;br /&gt;&lt;br /&gt;Parameterised version of TRANSLATE_CORPUS. 
Process the text mode&lt;br /&gt;translation corpus with ID &amp;lt;Arg&amp;gt;, defined by the&lt;br /&gt;parameterised config file entry&lt;br /&gt;translation_corpus(&amp;lt;Arg&amp;gt;). The output file, defined&lt;br /&gt;by the parameterised config file entry&lt;br /&gt;translation_corpus_results(&amp;lt;Arg&amp;gt;), contains&lt;br /&gt;question marks for translations that have not yet been judged. If&lt;br /&gt;these are replaced by valid judgements, currently 'good', 'ok' or&lt;br /&gt;'bad', the new judgements can be incorporated into the translation&lt;br /&gt;judgements file (defined by the translation_corpus_judgements&lt;br /&gt;config file entry) using the parameterised command&lt;br /&gt;UPDATE_TRANSLATION_JUDGEMENTS &amp;lt;Arg&amp;gt;.&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Third, and last, I have also arranged things so that the doc files are automatically included in the new Cookbook. This is still just a skeleton, but my plan is to start by completing the command and config-file section, so that the book will immediately be useful for something, and then work outwards from there. 
The PDF version is checked in as Regulus/doc/Cookbook/draft_cookbook.pdf.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6168984449478731268?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6168984449478731268/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6168984449478731268' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6168984449478731268'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6168984449478731268'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/07/improvements-to-regulus-documentation.html' title='Improvements to Regulus documentation'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5281870917049918575</id><published>2008-07-13T11:18:00.000-07:00</published><updated>2008-07-13T11:42:22.017-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='help'/><title type='text'>Substitutable help classes</title><content type='html'>Following a discussion with Nikos last month, I've added a new feature to the help system, so that help examples can be modified to be closer to the recognition result. Recall that the help system assumes that the designer will have declared a set of help classes; each class C defines a set of phrases P(C). 
When choosing a help match, both the recognition result and the help examples are backed off so that, for each class C, phrases in P(C) are replaced by C.&lt;br /&gt;&lt;br /&gt;The new functionality I've just added makes it possible to declare some help classes as "substitutable". Suppose that class C is defined as substitutable, that the phrase P1 in the recognition result is backed off to C, and that the phrase P2 in a matched help example H is also backed off to C. In this case, H will not be presented in its original form, but with P2 substituted by P1. Evidently, not all help classes can be defined as substitutable, since it's essential that all the words in a substitutable class have exactly the same syntactic properties.&lt;br /&gt;&lt;br /&gt;There are however some important classes which can in general be made substitutable, in particular (at least in English) names for specific types of individual, plural numbers, days of the week and months of the year. I've tested the new functionality on the Calendar app, and it does indeed seem to give considerably more useful responses. Here's an example. 
Without substitutable classes, the sentence "what meetings has nikos been to" gets the help responses&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;#1 : "what meetings has pierrette attended in geneva"&lt;br /&gt;     "what meeting_noun has person_name attend_verb preposition loc_name" (backed off)&lt;br /&gt;#2 : "which meetings has elisabeth attended"&lt;br /&gt;     "which meeting_noun has person_name attend_verb" (backed off)&lt;br /&gt;#3 : "what meetings is pierrette going to attend in geneva"&lt;br /&gt;     "what meeting_noun is person_name going to attend_verb preposition loc_name" (backed off)&lt;br /&gt;#4 : "what meetings is pierrette going to attend"&lt;br /&gt;     "what meeting_noun is person_name going to attend_verb" (backed off)&lt;br /&gt;#5 : "what meetings have there been in geneva"&lt;br /&gt;     "what meeting_noun have there been preposition loc_name" (backed off)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;With substitutable classes (person_name is one of them), the response is&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;#1 : "what meetings has nikos attended in geneva"&lt;br /&gt;     "what meeting_noun has person_name attend_verb preposition loc_name" (backed off)&lt;br /&gt;#2 : "which meetings has nikos attended"&lt;br /&gt;     "which meeting_noun has person_name attend_verb" (backed off)&lt;br /&gt;#3 : "what meetings is nikos going to attend in geneva"&lt;br /&gt;     "what meeting_noun is person_name going to attend_verb preposition loc_name" (backed off)&lt;br /&gt;#4 : "what meetings is nikos going to attend"&lt;br /&gt;     "what meeting_noun is person_name going to attend_verb" (backed off)&lt;br /&gt;#5 : "what meetings will nikos attend in geneva"&lt;br /&gt;     "what meeting_noun will person_name attend_verb preposition loc_name" (backed off)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;I haven't done any systematic testing, but anecdotally I'm pretty sure that this seems to make help more responsive.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' 
height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5281870917049918575?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5281870917049918575/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5281870917049918575' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5281870917049918575'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5281870917049918575'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/07/substitutable-help-classes.html' title='Substitutable help classes'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-733733302627918009</id><published>2008-07-11T08:26:00.000-07:00</published><updated>2008-07-11T08:39:47.999-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='AFF'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><category scheme='http://www.blogger.com/atom/ns#' term='bidirectional'/><title type='text'>MedSLT almost fully converted to AFF</title><content type='html'>I've made a lot of progress this week on converting the bidirectional English/Spanish version of MedSLT to &lt;a href="http://www.issco.unige.ch/pub/COLING2008RoleMarked.pdf"&gt;AFF format&lt;/a&gt;. 
I've parameterized the Spanish system to support AFF representations, and added AFF versions of most of the necessary config files and scripts. In particular,&lt;br /&gt;&lt;ul&gt;&lt;li&gt;You can build a full AFF version of Spa by doing 'make role_marked' in Spa/scripts.&lt;/li&gt;&lt;li&gt;You can run interactive bidirectional AFF Eng/Spa text systems using the files load_bidirectional_role_marked.pl and load_bidirectional_restricted_role_marked.pl in EngSpa/scripts.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;There are targets in EngSpa/scripts/Makefile for running AFF versions of the QA corpus, both plain and restricted, with the obvious naming conventions.&lt;/li&gt;&lt;/ul&gt;I've done preliminary testing, and everything should be checked in. There are still things missing, e.g. nothing so far in Spa/Spa for checking back-translation, but I figured it would be best at this point to hand over to Pierrette, so that she can refine the rules. I've only made some absolutely minimal changes, enough to check that a few sentences go through.&lt;br /&gt;&lt;br /&gt;When this piece of work is finished, all of MedSLT should be available in AFF format, which means that we'll be able to retire the old linear version and only support one system. 
I'm hopeful that build times will then be low enough for us to go back to building and testing the system every night, as we used to do.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-733733302627918009?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/733733302627918009/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=733733302627918009' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/733733302627918009'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/733733302627918009'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/07/medslt-almost-fully-converted-to-aff.html' title='MedSLT almost fully converted to AFF'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5552646323917006134</id><published>2008-07-08T09:13:00.000-07:00</published><updated>2008-07-09T08:00:04.714-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='n-best'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>N-best rescoring again</title><content type='html'>[Updated July 9]&lt;br /&gt;&lt;br /&gt;I've just checked in new code that makes it possible to create training material for doing N-best rescoring on speech translation applications - the functionality is 
basically the same as what we already had for dialogue applications, but there were a number of details that had to be fixed. It seems that the potential for improving performance using N-best rescoring varies considerably between apps. So far, we've looked at the following cases:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Calendar: can already almost halve error rate using rescoring, more should be possible.&lt;/li&gt;&lt;li&gt;Ford app: almost no potential for improvement.&lt;/li&gt;&lt;li&gt;Paideia app: considerable potential for improvement (don't currently have figures)&lt;/li&gt;&lt;li&gt;English MedSLT: maximum possible improvement looks like about 10% relative.&lt;/li&gt;&lt;li&gt;French MedSLT: maximum possible improvement about 15-20% relative.&lt;/li&gt;&lt;li&gt;Japanese MedSLT: almost no potential for improvement.&lt;/li&gt;&lt;/ul&gt;The variation in behavior between the different apps is quite surprising. In particular, I don't yet have a good explanation for why the MedSLT languages should be so different.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5552646323917006134?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5552646323917006134/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5552646323917006134' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5552646323917006134'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5552646323917006134'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/07/n-best-rescoring-again.html' title='N-best rescoring 
again'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5443310056937370884</id><published>2008-06-24T12:59:00.000-07:00</published><updated>2008-06-26T13:12:32.584-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Evaluation'/><title type='text'>McNemar at the word level?</title><content type='html'>I was thinking about our paper for GoTAL, and one thing that's bothered me a little is that we did all the significance testing using SER - the reason was that it's easy to run a McNemar test. However, we got rather bigger improvements in WER, which is really what you would expect from SLMs.&lt;br /&gt;&lt;br /&gt;It seems to me though that you should also be able to do McNemar at the word level. You look at each word in the transcription, and then check each of the two hypotheses you're comparing to see whether they include it. This is a little coarse-grained (you treat each sentence as a bag of words), but I'd guess it would still give interesting results. Shouldn't be at all hard to implement either. If we do an expanded version of the GoTAL paper, I'd definitely like to try this.&lt;br /&gt;&lt;br /&gt;In fact, this idea is so obvious that either it's wrong, or someone must already have thought of it. Any idea which?&lt;br /&gt;&lt;br /&gt;PS Jun 26. Beth Ann pointed out that the proposal as originally formulated only covered deletions, but it's trivial to extend it to do insertions too. More seriously, she wondered if the significance results would always be reliable, given that there may be subtle dependencies. 
I am really not sure about this, but one way to investigate the idea empirically would be to generate large sets of simulated recognition results using a stochastic process, and look at the distributions. For example, if you generate 10000 simulated recognition runs, then take one run and find all the other runs that come out as different from it at P &lt; 0.01 according to the new statistic, you'd be reassured to find there were not more than 100 of them. A lot more, and something is presumably wrong. A lot less presumably just shows the test isn't very sensitive.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5443310056937370884?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5443310056937370884/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5443310056937370884' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5443310056937370884'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5443310056937370884'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/mcnemar-at-word-level.html' title='McNemar at the word level?'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5430722679726765969</id><published>2008-06-24T12:52:00.000-07:00</published><updated>2008-06-26T13:02:52.251-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Regserver'/><category scheme='http://www.blogger.com/atom/ns#' term='efficiency'/><title type='text'>Faster parsing in Regulus using Nuance</title><content type='html'>Here's something I've been meaning to do for a while, that really should be moved up the priority stack. It should be quite easy to arrange things so that, in cases where we have compiled a grammar down to Nuance form, we use Nuance to do parsing - this ought to be much faster than the Regulus parser, and could really let us speed up corpus runs. There are at least two straightforward ways to implement it. One is to start an nl-tool process and pipe sentences into it, reading the analyses that come back. It may be even simpler to use the Regserver, now that we can connect to it from the Regulus top-level, and send an "interpret" message. More about this soon, I hope.&lt;br /&gt;&lt;br /&gt;PS Jun 26. It was indeed very easy - I took the route of creating an nl-tool process and connecting to it with pipes. The new NUANCE_PARSER command now lets you use nl-tool as the parser. Parsing is at least 30 times faster. Things should be checked in. 
More about this soon.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5430722679726765969?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5430722679726765969/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5430722679726765969' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5430722679726765969'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5430722679726765969'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/faster-parsing-in-regulus-using-nuance.html' title='Faster parsing in Regulus using Nuance'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2607025604869329668</id><published>2008-06-17T16:38:00.000-07:00</published><updated>2008-06-17T16:43:59.708-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='n-best'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>"Paraphrase corpora" for estimating semantic error rates</title><content type='html'>I've implemented a first cut at the "paraphrase corpus" idea that I suggested in yesterday's post. 
So far, it only works for speech translation, but it's rather nice to see that we can now measure the effect that N-best rescoring has on semantic error rate in a way that's both much quicker and much more objective than what we were doing previously. On the whole of the Eng corpus (the only one I've tried so far), semantic error rate on this metric is reduced by N-best rescoring by about 4% absolute, or 8% relative.&lt;br /&gt;&lt;br /&gt;My next task here is to extend the method to dialogue processing - this should be easy, I think. We will then be able to do dialogue N-best rescoring experiments using out-of-coverage as well as in-coverage data, which should open up several new possibilities.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2607025604869329668?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2607025604869329668/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2607025604869329668' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2607025604869329668'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2607025604869329668'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/paraphrase-corpora-for-estimating.html' title='&quot;Paraphrase corpora&quot; for estimating semantic error rates'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8111279564951299660</id><published>2008-06-16T13:39:00.000-07:00</published><updated>2008-06-16T13:52:25.899-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SemER'/><category scheme='http://www.blogger.com/atom/ns#' term='n-best'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Better ways to estimate semantic error rate</title><content type='html'>I've just added some code to automatically estimate semantic error rate for translation applications. It does more or less the same thing as the code we've had for a while in dialogue apps, and counts an example from a speech corpus as semantically correct if it produces the same interlingua as the transcription would have done.&lt;br /&gt;&lt;br /&gt;Unfortunately, the problem with this definition is that it doesn't work for utterances that are in domain, but out of grammar coverage. For example, I was just looking through the results for the English MedSLT corpus. In one example, the transcription is "does the pain ache", which is out of grammar coverage. The first hypothesis which produces well-formed interlingua is "does the pain feel aching", which is a good paraphrase and is selected. So this should really be counted as semantically correct, but isn't.&lt;br /&gt;&lt;br /&gt;I think we can address the problem by allowing the developer to declare a file of paraphrases, and say that the example is semantically correct if it gives the same result as either the actual transcription or one of its paraphrases. Then if the developer adds in-coverage paraphrases where they exist, things will work correctly. This should be easy to implement. 
Probably we want a warning if a paraphrase in fact is also determined to be out of coverage.&lt;br /&gt;&lt;br /&gt;This paraphrase functionality should also be useful for the N-best rescoring work that Maria and I have been doing for dialogue apps. We have the same problem there - we want to be able to experiment with out of coverage examples, but currently get no figures.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8111279564951299660?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8111279564951299660/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8111279564951299660' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8111279564951299660'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8111279564951299660'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/better-ways-to-estimate-semantic-error.html' title='Better ways to estimate semantic error rate'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6556294405272058950</id><published>2008-06-11T08:16:00.000-07:00</published><updated>2008-06-11T08:23:39.870-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='interlingua'/><category scheme='http://www.blogger.com/atom/ns#' term='AFF'/><category 
scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Interlingua corpora for multiple domains</title><content type='html'>Following a discussion with Pierrette last week, I have added two more MedSLT Interlingua corpora, for the chest pain and abdominal pain domains. I've also added all the associated config files, scripts etc for the currently relevant language pairs (EngInt, JapInt, IntEng, IntFre and IntJap), so it should now be possible to do systematic interlingua-centered development for all three domains. I have only built AFF versions, since we're planning to retire the linear formalism soon.&lt;br /&gt;&lt;br /&gt;The naming conventions are the usual ones. I hopefully managed to check everything in, but let me know if files that you expected to find are missing. Pierrette should at some point do some work tidying up IntFre and FreInt, and Yukie should do the same for IntJap and JapInt. Further down the line, we should really add coverage for these domains in the missing languages.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6556294405272058950?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6556294405272058950/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6556294405272058950' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6556294405272058950'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6556294405272058950'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/interlingua-corpora-for-multiple.html' title='Interlingua corpora for multiple 
domains'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8417931408637070408</id><published>2008-06-10T01:40:00.000-07:00</published><updated>2008-06-10T01:49:44.496-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='AFF'/><category scheme='http://www.blogger.com/atom/ns#' term='Catalan'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>AFF version of Catalan</title><content type='html'>I've added initial versions of all the files needed for the AFF version of Catalan in MedSLT. Naming conventions are the usual ones, and I was able to build all the AFF Cat resources by doing&lt;br /&gt;&lt;br /&gt;make role_marked&lt;br /&gt;&lt;br /&gt;in the Cat/scripts directory. There should now be config files for all 5 x 5 = 25 pairs of languages in {Ara, Cat, Eng, Fre, Jap} - this involved adding a few new pairs. I only tested Interlingua to Catalan and Catalan to Interlingua. We currently get translations for about 75% of the sentences in IntCat, and about 20% in CatInt. Hopefully it will be easy to improve these figures.&lt;br /&gt;&lt;br /&gt;Over to Pierrette and Bruna to debug the rules. Note that I have macrotised the Cat lexicon to make the AFF version work. 
It should be mostly OK, but there were a few cases (in particular, WH+ PPs) where I wasn't quite sure how to do the macrotisation - people who actually know Catalan should review the entries.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8417931408637070408?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8417931408637070408/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8417931408637070408' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8417931408637070408'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8417931408637070408'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/aff-version-of-catalan.html' title='AFF version of Catalan'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5160814266752609285</id><published>2008-06-04T06:41:00.000-07:00</published><updated>2008-06-04T06:46:03.512-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sicstus'/><category scheme='http://www.blogger.com/atom/ns#' term='release'/><category scheme='http://www.blogger.com/atom/ns#' term='2.9.0'/><title type='text'>Regulus 2.9.0 released</title><content type='html'>Nikos has just created and uploaded the new 2.9.0 release of Regulus. 
I tried downloading and running a couple of simple tests in text and speech mode (under SICStus 4.0.3), and Toy1 at least appears to work fine. Please mail me if you notice problems.&lt;br /&gt;&lt;br /&gt;Here are the release notes:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;MAIN CHANGES TO REGULUS BETWEEN 2.8.0 AND 2.9.0&lt;br /&gt;&lt;br /&gt;A large number of new features have been added to Regulus since&lt;br /&gt;version 2.8.0. Most importantly, Regulus now runs under Sicstus 4;&lt;br /&gt;it is possible to use speech input directly from the top-level;&lt;br /&gt;N-best processing is supported in both dialogue and translation mode;&lt;br /&gt;and a new semantics for translation applications has been added.&lt;br /&gt;&lt;br /&gt;The new features are listed below in more detail. Not all of them are&lt;br /&gt;fully documented yet, but we are giving priority to adding the&lt;br /&gt;necessary documentation.&lt;br /&gt;&lt;br /&gt;- Support for Sicstus 4&lt;br /&gt;  - Regulus runs under Sicstus 4.&lt;br /&gt;    - It has been thoroughly tested under 4.0.2.&lt;br /&gt;    - Some testing has been done under 4.0.3, but this has not yet been carefully&lt;br /&gt;      verified. 
NOTE: under 4.0.3, it is necessary to load the patch files in&lt;br /&gt;      Prolog/SicstusPatches/4.0.3&lt;br /&gt;  - Regulus still runs under Sicstus 3, and has been thoroughly&lt;br /&gt;    tested under 3.12.5.&lt;br /&gt;&lt;br /&gt;- Top-level&lt;br /&gt;  - Errors are now written to stderr&lt;br /&gt;  - There is a version of regulus_batch with an extra argument, which returns&lt;br /&gt;    the list of error outputs created when running the commands.&lt;br /&gt;  - It is possible to compile Nuance grammars from the Regulus top-level&lt;br /&gt;    using the NUANCE_COMPILE command.&lt;br /&gt;  - It is possible to perform speech recognition directly from the top-level&lt;br /&gt;    - The LOAD_RECOGNITION command starts defined speech resources, including&lt;br /&gt;      a license manager, recserver and Regserver&lt;br /&gt;    - After loading resources using LOAD_RECOGNITION, the RECOGNISE command&lt;br /&gt;      takes live speech input and passes it to the current application.&lt;br /&gt;    - Wavfiles are automatically logged by RECOGNISE. 
The WAVFILES &amp;lt;n&amp;gt; command&lt;br /&gt;      lists the &amp;lt;n&amp;gt; most recent recorded wavfiles.&lt;br /&gt;    - When speech resources are loaded, text input of the form&lt;br /&gt;&lt;br /&gt;      WAVFILE: &amp;lt;wavfile&amp;gt;&lt;br /&gt;&lt;br /&gt;      performs recognition on &amp;lt;wavfile&amp;gt;, and passes the result to the&lt;br /&gt;      current application&lt;br /&gt;&lt;br /&gt;- Java GUI&lt;br /&gt;  - The Java GUI has been greatly improved, and many bugs have been fixed.&lt;br /&gt;  - The GUI supports direct speech input, similar to the Prolog top-level&lt;br /&gt;    described above&lt;br /&gt;  - It is possible to run multiple copies of the GUI at the same time.&lt;br /&gt;&lt;br /&gt;- Stepper&lt;br /&gt;  - The commands LOAD, LOAD_GENERATION, EBL_LOAD and EBL_LOAD_GENERATION&lt;br /&gt;    can be invoked from within the stepper.&lt;br /&gt;&lt;br /&gt;- Support for spoken dialogue applications&lt;br /&gt;  - When speech resources have been loaded from the command line,&lt;br /&gt;    dialogue corpora can contain items of the form wavfile(&amp;lt;wavfile&amp;gt;).&lt;br /&gt;    This makes it possible to test corpora containing a mixture of speech&lt;br /&gt;    and non-speech inputs.&lt;br /&gt;  - Batch processing of speech input in dialogue mode produces figures&lt;br /&gt;    for semantic error rate. 
An utterance is deemed semantically correct if&lt;br /&gt;    it produces the same dialogue move as the transcription would have done.&lt;br /&gt;  - A timeout has been added in batch dialogue processing, so that processing&lt;br /&gt;    gives up after 10 seconds.&lt;br /&gt;  - If N-best preferences are defined, preference info is printed in&lt;br /&gt;    dialogue mode.&lt;br /&gt;  - Allow dialogue server to take XML-formatted requests&lt;br /&gt;&lt;br /&gt;- Generation&lt;br /&gt;  - When the declaration&lt;br /&gt;&lt;br /&gt;    regulus_config(prolog_semantics, yes).&lt;br /&gt;&lt;br /&gt;    is included, generation grammars can contain arbitrary Prolog structures.&lt;br /&gt;&lt;br /&gt;- Translation&lt;br /&gt;  - There is extensive support for translation using both the original&lt;br /&gt;    "linear" semantics, and also the new "Almost Flat Functional" (AFF)&lt;br /&gt;    semantics. AFF is described in our COLING 2008 paper, which will soon&lt;br /&gt;    be posted on the Regulus website. Some initial documentation will be added&lt;br /&gt;    to RegulusDoc.htm.&lt;br /&gt;  - It is possible in a translation config file to define an interlingua&lt;br /&gt;    as either a source or a target language. There are many examples&lt;br /&gt;    in the MedSLT project directory.&lt;br /&gt;  - Batch translation produces output files for judging both in Prolog&lt;br /&gt;    and in CSV form. There are new commands for updating judgements from the&lt;br /&gt;    CSV files.&lt;br /&gt;  - When speech resources have been loaded from the command line,&lt;br /&gt;    translation corpora can contain items of the form wavfile(&amp;lt;wavfile&amp;gt;).&lt;br /&gt;  - A simple version of N-best processing has been added for applications&lt;br /&gt;    that use interlingual translation with an interlingua grammar. 
In N-best mode,&lt;br /&gt;    the first utterance producing well-formed interlingua is selected.&lt;br /&gt;  - Interlingua expressions ambiguous according to the interlingua grammar&lt;br /&gt;    are flagged in translation mode.&lt;br /&gt;  - If performing batch translation from Source to Target through Interlingua,&lt;br /&gt;    combine available Source -&gt; Interlingua and Interlingua -&gt; Target&lt;br /&gt;    judgements into Source -&gt; Target judgements if possible.&lt;br /&gt;  - Show average number of generated target language surface forms when&lt;br /&gt;    doing batch translation.&lt;br /&gt;  - Translation conditions can include elements of the form&lt;br /&gt;&lt;br /&gt;    context_below(&amp;lt;item&amp;gt;)&lt;br /&gt;&lt;br /&gt;    This matches an &amp;lt;item&amp;gt; in a clause.&lt;br /&gt;&lt;br /&gt;- Grammar specialisation&lt;br /&gt;  - Fix bug in processing of include_lex declarations.&lt;br /&gt;&lt;br /&gt;- Help&lt;br /&gt;  - When defining intelligent help for translation applications, help resources&lt;br /&gt;    can be built from an interlingua corpus.&lt;br /&gt;&lt;br /&gt;- Extension to Regulus grammar formalism&lt;br /&gt;  - Allow =@ as synonym for = @&lt;br /&gt;  - Add runtime support for GSL functions strcat/2, add/2, sub/2, neg/1, mul/2, div/2&lt;br /&gt;&lt;br /&gt;- English grammar&lt;br /&gt;  - Rules for dates including years have been added.&lt;br /&gt;&lt;br /&gt;- Other&lt;br /&gt;  - Tool added to perform random generation from PCFG-trained GSL grammars&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5160814266752609285?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5160814266752609285/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5160814266752609285' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5160814266752609285'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5160814266752609285'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/regulus-290-released.html' title='Regulus 2.9.0 released'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6422691107364913957</id><published>2008-06-02T13:23:00.000-07:00</published><updated>2008-06-02T13:40:00.198-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sicstus'/><title type='text'>Problems with SICStus 4.0.3 resolved</title><content type='html'>The SICStus people were as usual very responsive, and we now seem to be OK for running under 4.0.3. However, (this is &lt;span style="font-weight: bold;"&gt;IMPORTANT&lt;/span&gt;), you need to install a couple of patch files if you are using that version of SICStus. So, if you're using 4.0.3, do the following:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Update Regulus from CVS, using the -d option to get new directories.&lt;/li&gt;&lt;li&gt;Copy the files from Prolog/SicstusPatches/4.0.3 to C:/Program Files/SICStus Prolog 4.0.3/library, or wherever you have your copy of SICStus.&lt;/li&gt;&lt;/ul&gt;I will set my default version of SICStus to 4.0.3, which means I'll no doubt test it a fair amount over the next few days. 
I would not recommend switching over to 4.0.3 until I've run with it for a while and reported on how it's working.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6422691107364913957?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6422691107364913957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6422691107364913957' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6422691107364913957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6422691107364913957'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/problems-with-sicstus-403-resolved.html' title='Problems with SICStus 4.0.3 resolved'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6793146180971450624</id><published>2008-06-02T06:28:00.000-07:00</published><updated>2008-06-02T14:49:41.478-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sicstus'/><title type='text'>Problems with SICStus 4.0.3</title><content type='html'>We are unfortunately still having problems with SICStus 4. Things have been more or less stable with 4.0.2, but there were a few rather ugly patches - the SICStus people said things would be better in the next version. 
Sad to tell, I have just downloaded 4.0.3 and tried it out, and in fact, at least as far as Regulus is concerned, it's gone backwards. Due to new incompatibilities in the operating system interface libraries, it's not currently possible to run Regulus in speech mode with 4.0.3 - there may also be other problems. I can presumably implement a workaround, but the idea of having to patch the code after every new SICStus release makes me very nervous.&lt;br /&gt;&lt;br /&gt;For Prolog people who want the low-level details, here is part of the mail I just sent to the SICStus team:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Unless I am misunderstanding something important, SP4.0.3's version of the&lt;br /&gt;system3 library is still not  downward-compatible with SP3's system library, and is in fact rather&lt;br /&gt;less downward-compatible than SP4.0.2's system3. The problem is now in system/1.&lt;br /&gt;In SP4.0.2, system/1 is defined as follows:&lt;br /&gt;&lt;br /&gt;system(Cmd) :-&lt;br /&gt;  system_binary(Binary, DashC),&lt;br /&gt;  proc_call(Binary, DashC, Cmd, exit(0)).&lt;br /&gt;&lt;br /&gt;so it's possible to make calls like the following, running under Cygwin:&lt;br /&gt;&lt;br /&gt;| ?- system('dir &gt; tmp_dir.txt').&lt;br /&gt;      1      1 Call: system('dir &gt; tmp_dir.txt') ?&lt;br /&gt;      2      2 Call: system3:environ('COMSPEC',_790) ?&lt;br /&gt;      2      2 Exit: system3:environ('COMSPEC','C:\\WINDOWS\\system32\\cmd.exe') ?&lt;br /&gt;      3      2 Call: system3:process_create('C:\\WINDOWS\\system32\\cmd.exe',['/C','dir &gt; tmp_dir.txt'],system3:[process(_1437)]) ?&lt;br /&gt;      3      2 Exit: system3:process_create('C:\\WINDOWS\\system32\\cmd.exe',['/C','dir &gt; tmp_dir.txt'],system3:[process('$process'('$ptr IEDNJP'))]) ?&lt;br /&gt;      4      2 Call: system3:process_wait('$process'('$ptr IEDNJP'),exit(0)) ? 
s&lt;br /&gt;      4      2 Exit: system3:process_wait('$process'('$ptr IEDNJP'),exit(0)) ?&lt;br /&gt;      1      1 Exit: system('dir &gt; tmp_dir.txt') ?&lt;br /&gt;&lt;br /&gt;Under SP4.0.3, system/1 is defined thus:&lt;br /&gt;&lt;br /&gt;system(Cmd, Status) :-&lt;br /&gt;      shell_exec(Cmd, [], exit(Status)).&lt;br /&gt;&lt;br /&gt;and the corresponding call looks like this:&lt;br /&gt;&lt;br /&gt;| ?- system('dir &gt; tmp_dir.txt').&lt;br /&gt;      1      1 Call: system('dir &gt; tmp_dir.txt') ?&lt;br /&gt;      2      2 Call: system3:system('dir &gt; tmp_dir.txt',0) ?&lt;br /&gt;      3      3 Call: system3:process_create('dir &gt; tmp_dir.txt',[],system3:[commandline(true),process(_1119)]) ?&lt;br /&gt;      3      3 Exit: system3:process_create('dir &gt; tmp_dir.txt',[],system3:[commandline(true),process('$process'('$ptr ALJLOO'))]) ?&lt;br /&gt;      4      3 Call: system3:process_wait('$process'('$ptr ALJLOO'),exit(0)) ?&lt;br /&gt;      4      3 Fail: system3:process_wait('$process'('$ptr ALJLOO'),exit(0)) ?&lt;br /&gt;      2      2 Fail: system3:system('dir &gt; tmp_dir.txt',0) ?&lt;br /&gt;      1      1 Fail: system('dir &gt; tmp_dir.txt') ?&lt;br /&gt;&lt;br /&gt;The problem, as far as I can see, is that process_create requires the first arg&lt;br /&gt;of process_create to be a program, which it isn't here.&lt;br /&gt;&lt;br /&gt;Unfortunately, we have people running Regulus under at least 3.12.5, 4.0.2 and 4.0.3.&lt;br /&gt;Maintaining the code so that it runs under all these different versions is&lt;br /&gt;becoming quite difficult - the operating system interface primitives are&lt;br /&gt;absolutely essential. 
Advice appreciated.&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6793146180971450624?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6793146180971450624/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6793146180971450624' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6793146180971450624'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6793146180971450624'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/problems-with-sicstus-403.html' title='Problems with SICStus 4.0.3'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-9167172107213490757</id><published>2008-06-02T03:20:00.000-07:00</published><updated>2008-06-02T03:31:25.959-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='interlingua'/><category scheme='http://www.blogger.com/atom/ns#' term='AFF'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Interlingua corpora</title><content type='html'>Over the last few months, we have been moving MedSLT development towards a new way of doing things, which is based on the idea of an "Interlingua corpus". 
We present the basic picture in our LREC 2008 paper, but that's already somewhat out of date, and doesn't give any low-level details.&lt;br /&gt;&lt;br /&gt;We now have four interlingua corpora, representing the cross-product of {linear, AFF} x {plain, combined}. The linear/AFF distinction is concerned with the type of semantics used. "Linear" is the old MedSLT semantics; AFF semantics is explained in the paper by Pierrette, Beth Ann, Yukie and myself which has just been accepted for COLING 2008, and which will soon be appearing on the Geneva website.&lt;br /&gt;&lt;br /&gt;The plain/combined distinction says what information has been incorporated in the corpus. The "plain" corpus is created by merging results of translating FROM each source language into interlingua, so each interlingua form lists the source language results that translate into it. The "combined" corpus contains all the information in the "plain" corpus, plus the results of translating TO each target language.&lt;br /&gt;&lt;br /&gt;At the moment, we use the plain corpus for developing translation rules that go from Interlingua to target languages. 
The combined corpus is used for creating help resources.&lt;br /&gt;&lt;br /&gt;All the scripts used to build interlingua corpora are referenced in $MED_SLT2/Interlingua/scripts/Makefile.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-9167172107213490757?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/9167172107213490757/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=9167172107213490757' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/9167172107213490757'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/9167172107213490757'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/interlingua-corpora.html' title='Interlingua corpora'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7016999541464524197</id><published>2008-06-01T11:49:00.000-07:00</published><updated>2008-06-01T11:55:18.684-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><category scheme='http://www.blogger.com/atom/ns#' term='GUI'/><title type='text'>Running multiple copies of the GUI</title><content type='html'>Elisabeth did a little work over the weekend, and it's now possible to run multiple copies of the GUI simultaneously - this is an important feature that people have been 
requesting for some time. The solution turns out to be embarrassingly simple. All we needed to do, in the end, was fix things so that it's possible for both the Java and the Prolog processes to specify from the command line which port they use to communicate with each other. As long as different {Java, Prolog} pairs use different ports, they don't interfere with each other.&lt;br /&gt;&lt;br /&gt;I've added an example script to Regulus/Java called run_prolog_and_java2.bat - this is just like run_prolog_and_java.bat, but starts a second pair of processes, communicating over a new port.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7016999541464524197?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7016999541464524197/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7016999541464524197' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7016999541464524197'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7016999541464524197'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/06/running-multiple-copies-of-gui.html' title='Running multiple copies of the GUI'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5681659060625184658</id><published>2008-05-31T04:55:00.001-07:00</published><updated>2008-05-31T05:07:08.815-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Calendar'/><category scheme='http://www.blogger.com/atom/ns#' term='n-best'/><title type='text'>Progress on N-best rescoring</title><content type='html'>Maria Georgescul and I have been doing some work over the last few days on N-best rescoring, using the Calendar application as a test-bed. The basic division of labor was for me to define features and transform N-best hypothesis lists into lists of feature vectors, while Maria fed these into an SVM-based learner to perform the actual rescoring. We did the experiments using a set of 459 recorded utterances. Rescoring now reduces semantic error rate from 19% to 11%, and WER from 11% to 10%.&lt;br /&gt;&lt;br /&gt;I defined the features by looking at examples of N-best lists, and finding common examples of things which I felt intuitively should be penalized. The current set of features is as follows:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;rank: Place in the N-best list&lt;br /&gt;&lt;br /&gt;no_dialogue_move: Hypothesis produces no dialogue move&lt;br /&gt;&lt;br /&gt;underconstrained_query: Query with no contentful constraints&lt;br /&gt;&lt;br /&gt;non_indefinite_existential: Existentials with non-indefinite arg, e.g. 
"is there the meeting next week"&lt;br /&gt;&lt;br /&gt;non_show_imperative: Imperatives where the main verb isn't "show" or something similar&lt;br /&gt;&lt;br /&gt;indefinite_meeting_and_meeting_referent: combination of indefinite mention of meeting + available meeting referent&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5681659060625184658?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5681659060625184658/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5681659060625184658' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5681659060625184658'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5681659060625184658'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/progress-on-n-best-rescoring.html' title='Progress on N-best rescoring'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-5748408863431151744</id><published>2008-05-28T10:33:00.000-07:00</published><updated>2008-05-31T05:11:52.135-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='top-level'/><category scheme='http://www.blogger.com/atom/ns#' term='n-best'/><title type='text'>Printing N-best feature info at top level</title><content type='html'>If you're in dialogue mode, 
and have N-best preferences defined, you now get them printed out at top level. This is useful for debugging feature definitions. Here's an example from the Calendar application:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; what was the last meeting&lt;br /&gt;&lt;br /&gt;      Old state: [lf=[[whq,form(past,[[be,term(the_last,meeting,[]),[loc,where]]])]], &lt;br /&gt;                  referents=[record(meeting,meeting_10),attribute(meeting,meeting_10,where)]]&lt;br /&gt;             LF: [[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]]&lt;br /&gt;    Resolved LF: [[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]]&lt;br /&gt;     Resolution: [trivial]&lt;br /&gt;  Dialogue move: [tense_information=referent(past), utterance_type=whq, &lt;br /&gt;                  aggregate(last_n_meetings(1),[])]&lt;br /&gt;  Resolved move: [tense_information=interval(datime(1980,0,0,0,0,0),datime(2008,5,28,18,27,24)),&lt;br /&gt;                  utterance_type=whq, aggregate(last_n_meetings(1),[])]&lt;br /&gt;     Paraphrase: list meetings in past the last meeting&lt;br /&gt;Abstract action: say(referent_list([record(meeting,meeting_10)]))&lt;br /&gt;Concrete action: tts(meeting at pierrette 's room on november 25)&lt;br /&gt;      New state: [lf=[[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]], &lt;br /&gt;                  referents=[attribute(meeting,meeting_10,where),record(meeting,meeting_10)]]&lt;br /&gt;&lt;br /&gt;N-BEST FEATURES AND SCORES:&lt;br /&gt;&lt;br /&gt;rank                                    -1.00 * 0.00 = 0.00&lt;br /&gt;no_dialogue_move                        -50.00 * 0.00 = 0.00&lt;br /&gt;underconstrained_query                  -10.00 * 0.00 = 0.00&lt;br /&gt;inconsistent_tense                      -10.00 * 0.00 = 0.00&lt;br /&gt;non_indefinite_existential              -10.00 * 0.00 = 0.00&lt;br /&gt;non_show_imperative                     -50.00 * 0.00 = 0.00&lt;br 
/&gt;definite_meeting_and_meeting_referent   3.00 * 0.00 = 0.00&lt;br /&gt;&lt;br /&gt;Total score: 0.00&lt;br /&gt;&lt;br /&gt;Dialogue processing time: 0.00 seconds&lt;br /&gt;&lt;br /&gt;&gt;&gt; when did that meeting start&lt;br /&gt;&lt;br /&gt;      Old state: [lf=[[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]], &lt;br /&gt;                  referents=[attribute(meeting,meeting_10,where),record(meeting,meeting_10)]]&lt;br /&gt;             LF: [[whq,form(past,[[start,term(that,meeting,[])],[time,when]])]]&lt;br /&gt;    Resolved LF: [[whq,form(past,[[start,term(that,meeting,[])],[time,when]])]]&lt;br /&gt;     Resolution: [trivial]&lt;br /&gt;  Dialogue move: [query_object=start_time, referent_from_context=meeting, &lt;br /&gt;                  tense_information=referent(past), utterance_type=whq]&lt;br /&gt;  Resolved move: [meeting=meeting_10, query_object=start_time, referent_from_context=meeting, &lt;br /&gt;                  tense_information=interval(datime(1980,0,0,0,0,0),datime(2008,5,28,18,27,35)),&lt;br /&gt;                  utterance_type=whq]&lt;br /&gt;     Paraphrase: start time for that meeting in past&lt;br /&gt;Abstract action: say(referent_list([attribute(meeting,meeting_10,start_time)]))&lt;br /&gt;Concrete action: tts(10 00 on november 25)&lt;br /&gt;      New state: [lf=[[whq,form(past,[[start,term(that,meeting,[])],[time,when]])]], &lt;br /&gt;                  (referents&lt;br /&gt;                   = &lt;br /&gt;                   [attribute(meeting,meeting_10,where), record(meeting,meeting_10), &lt;br /&gt;                    attribute(meeting,meeting_10,start_time)])]&lt;br /&gt;&lt;br /&gt;N-BEST FEATURES AND SCORES:&lt;br /&gt;&lt;br /&gt;rank                                    -1.00 * 0.00 = 0.00&lt;br /&gt;no_dialogue_move                        -50.00 * 0.00 = 0.00&lt;br /&gt;underconstrained_query                  -10.00 * 0.00 = 0.00&lt;br /&gt;inconsistent_tense                      -10.00 * 0.00 = 
0.00&lt;br /&gt;non_indefinite_existential              -10.00 * 0.00 = 0.00&lt;br /&gt;non_show_imperative                     -50.00 * 0.00 = 0.00&lt;br /&gt;definite_meeting_and_meeting_referent   3.00 * 1.00 = 3.00&lt;br /&gt;&lt;br /&gt;Total score: 3.00&lt;br /&gt;&lt;br /&gt;Dialogue processing time: 0.01 seconds&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-5748408863431151744?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/5748408863431151744/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=5748408863431151744' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5748408863431151744'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/5748408863431151744'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/printing-n-best-feature-info-at-top.html' title='Printing N-best feature info at top level'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7615410059660556726</id><published>2008-05-27T14:28:00.000-07:00</published><updated>2008-05-27T14:35:14.819-07:00</updated><title type='text'>Catching Regulus errors</title><content type='html'>Peter Ljunglöf wondered whether error reporting in Regulus could be improved, and had a couple of suggestions. 
I've implemented and checked in the following improvements:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;All error messages should now be printed to stderr.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;When processing fails during execution of the Regulus command &amp;lt;command&amp;gt;, a line of the form&lt;br /&gt;&lt;br /&gt;Error processing command: &amp;lt;command&amp;gt;&lt;br /&gt;&lt;br /&gt;should be printed. This was not previously the case.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;There is a new top-level predicate&lt;br /&gt;&lt;br /&gt;regulus_batch_storing_errors(+ConfigFile, +Commands, -ErrorString)&lt;br /&gt;&lt;br /&gt;which is like regulus_batch/2, except that it instantiates ErrorString with a string containing all the errors printed out during execution of Commands. &lt;/li&gt;&lt;/ol&gt;I expect there will be some glitches (I had to change a lot of lines of code), so please let me know if things don't work as intended.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7615410059660556726?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7615410059660556726/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7615410059660556726' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7615410059660556726'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7615410059660556726'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/catching-regulus-errors.html' title='Catching Regulus 
errors'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-3358322604276685609</id><published>2008-05-27T11:42:00.000-07:00</published><updated>2008-05-27T11:47:11.473-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='help'/><category scheme='http://www.blogger.com/atom/ns#' term='interlingua'/><category scheme='http://www.blogger.com/atom/ns#' term='AFF'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Building help resources from the combined interlingua corpus (2)</title><content type='html'>Considerable progress on this task today:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;I've added French to the AFF interlingua corpora, including the New York material as requested by Pierrette. The corpora are remade and checked in.&lt;/li&gt;&lt;li&gt;The help resources for Eng and Ara (the languages where we have help class definitions) are now made from the combined interlingua corpus. A separate help file is made for each of the six pairs EngAra, EngFre, EngJap, AraEng, AraFre, AraJap, reflecting the different levels of coverage. You can make the help resources for all of these pairs by doing 'make help_resources' in $MED_SLT2 (i.e. 
at the top level in the MedSLT directory), and it only takes a few minutes.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-3358322604276685609?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/3358322604276685609/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=3358322604276685609' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3358322604276685609'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/3358322604276685609'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/building-help-resources-from-combined_27.html' title='Building help resources from the combined interlingua corpus (2)'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-6270208992955031034</id><published>2008-05-27T09:57:00.000-07:00</published><updated>2008-05-27T10:01:24.264-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='help'/><category scheme='http://www.blogger.com/atom/ns#' term='interlingua'/><category scheme='http://www.blogger.com/atom/ns#' term='MedSLT'/><title type='text'>Building help resources from the combined interlingua corpus</title><content type='html'>I've just checked in code that allows us to build Prolog help resources from the combined 
interlingua corpus in multi-lingual translation applications. This will make it much easier to integrate construction of help resources into the MedSLT build - it should now be almost trivial.&lt;br /&gt;&lt;br /&gt;I'm currently remaking the interlingua corpus (I have had to change the format a little),  and should be able to check in all the relevant MedSLT stuff later this evening.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-6270208992955031034?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/6270208992955031034/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=6270208992955031034' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6270208992955031034'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/6270208992955031034'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/building-help-resources-from-combined.html' title='Building help resources from the combined interlingua corpus'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-2056890051412921732</id><published>2008-05-27T06:25:00.000-07:00</published><updated>2008-05-31T05:13:11.131-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='interlingua'/><title type='text'>Flagging ambiguity in interlingua 
checking</title><content type='html'>I've just checked in code to catch cases where interlingua is&lt;br /&gt;ambiguous, in the sense of generating multiple different surface&lt;br /&gt;strings in the interlingua grammar. This is most likely to occur in&lt;br /&gt;AFF, when the to-interlingua rules are underconstrained and the&lt;br /&gt;interlingua is only partially instantiated. The following Japanese&lt;br /&gt;-&gt; Interlingua example in MedSLT illustrates:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt;&gt; doko ga itami masu ka&lt;br /&gt;&lt;br /&gt;Source: doko ga itami masu ka&lt;br /&gt;Target: WH-QUESTION pain be where PRESENT ACTIVE&lt;br /&gt;Other info:&lt;br /&gt;n_parses = 1&lt;br /&gt;parse_time = 0.297&lt;br /&gt;source_representation = [null=[path_proc,itamu], null=[tense,present],&lt;br /&gt;                      null=[utterance_type,question], subject=[body_part,doko]]&lt;br /&gt;source_discourse = [null=[utterance_type,question], subject=[body_part,doko],&lt;br /&gt;                 null=[tense,present], null=[path_proc,itamu]]&lt;br /&gt;resolved_source_discourse = [null=[utterance_type,question], subject=[body_part,doko],&lt;br /&gt;                          null=[tense,present], null=[path_proc,itamu]]&lt;br /&gt;resolution_processing = trivial&lt;br /&gt;interlingua = [loc=[loc,where], arg1=[secondary_symptom,pain], null=[tense,present],&lt;br /&gt;            null=[utterance_type,whq], null=[verb,be], null=[voice,active]]&lt;br /&gt;interlingua_surface = WH-QUESTION pain be where PRESENT ACTIVE&lt;br /&gt;other_interlingua_surface = [WH-QUESTION pain be above-loc where PRESENT ACTIVE,&lt;br /&gt;                          WH-QUESTION pain be around-loc where PRESENT ACTIVE,&lt;br /&gt;                          WH-QUESTION pain be between-loc where PRESENT ACTIVE,&lt;br /&gt;                          WH-QUESTION pain be in-loc where PRESENT ACTIVE,&lt;br /&gt;                          WH-QUESTION pain be under-loc where PRESENT ACTIVE]&lt;br 
/&gt;&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-2056890051412921732?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/2056890051412921732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=2056890051412921732' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2056890051412921732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/2056890051412921732'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/flagging-ambiguity-in-interlingua_27.html' title='Flagging ambiguity in interlingua checking'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-7624245674255753859</id><published>2008-05-27T04:14:00.001-07:00</published><updated>2008-05-27T04:19:14.193-07:00</updated><title type='text'>Background</title><content type='html'>If you've reached this blog and don't have any idea what it's about, Regulus is an Open Source platform for constructing speech-enabled systems, which we've been developing since 2001. 
We've now built several high-profile applications, including &lt;a href="http://ti.arc.nasa.gov/projects/clarissa/"&gt;Clarissa&lt;/a&gt;, so far the only speech-enabled system to have flown in space, and &lt;a href="http://www.issco.unige.ch/projects/medslt/"&gt;MedSLT&lt;/a&gt;, a medical speech translator. You can read more about Regulus &lt;a href="http://www.issco.unige.ch/projects/regulus/"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-7624245674255753859?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/7624245674255753859/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=7624245674255753859' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7624245674255753859'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/7624245674255753859'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/background.html' title='Background'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3502837936335259853.post-8029598290949362101</id><published>2008-05-27T03:56:00.000-07:00</published><updated>2008-05-27T03:59:56.828-07:00</updated><title type='text'>First entry</title><content type='html'>Rather than mail people about new Regulus features, fixes, etc., I am starting a blog. 
Don't know why I didn't do this earlier!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3502837936335259853-8029598290949362101?l=regulusnews.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://regulusnews.blogspot.com/feeds/8029598290949362101/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3502837936335259853&amp;postID=8029598290949362101' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8029598290949362101'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3502837936335259853/posts/default/8029598290949362101'/><link rel='alternate' type='text/html' href='http://regulusnews.blogspot.com/2008/05/first-entry.html' title='First entry'/><author><name>Manny</name><uri>http://www.blogger.com/profile/02841804916537846612</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
